Physical Reads are Very Fast, Why is this SQL Statement Slow

March 15, 2010

How would you troubleshoot the following situation – what do you know, or believe to be true with at least 95% confidence, about this situation?  You encounter a SQL statement that is taking much longer to execute than expected.  You enable an extended SQL trace at level 8 and pass the resulting trace file through TKPROF.  In the TKPROF output you find the SQL statement that is of interest:
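
Although the commands are not shown in the trace file excerpt, a trace of this type could be produced with statements similar to the following (a minimal sketch – the TRACEFILE_IDENTIFIER value and the trace file name passed to TKPROF are made-up examples; the actual file name would be found in the user dump destination directory):

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SLOW_COUNT';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

-- Execute the SQL statement of interest, then disable the trace:
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

From the operating system prompt (hypothetical trace file name):

tkprof orcl_ora_1234_slow_count.trc slow_count.txt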

SELECT          
      COUNT ( * )
  FROM            I_JOURNAL m
               INNER JOIN
                  LIU_TYPES lt
               ON (m.LIU_TYPE = lt.LIU_TYPE)
            INNER JOIN
               LAWFUL_I li
            ON (m.LIID = li.LIID)
         LEFT OUTER JOIN
            PHONE_BOOK pb
         ON (    m.NORM_CIN = pb.NORM_CIN
             AND pb.DELETION_DATE IS NULL
             AND pb.OPERATOR_ID = :"SYS_B_00")
 WHERE   LIU_PRIORITY >= :"SYS_B_01"
         AND (li.ID IN
                    (:"SYS_B_02",
                     :"SYS_B_03",
                     :"SYS_B_04",
                     :"SYS_B_05",
                     :"SYS_B_06",
                     :"SYS_B_07",
                     :"SYS_B_08",
                     :"SYS_B_09",
[...]
                     :"SYS_B_59",
                     :"SYS_B_60",
                     :"SYS_B_61",
                     :"SYS_B_62"))
         AND (li.END_VALID_DATE IS NULL
              OR m.DISPLAY_DATE <= li.END_VALID_DATE)
         AND li.OPERATOR_ID = :"SYS_B_63"

In the above, roughly 50 lines of the SQL statement containing bind variables for the IN list were removed to save space.  The summary information for the SQL statement’s execution follows:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.05       0.04          0          0          0           0
Fetch        2    105.00     102.69     283093     652774          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4    105.05     102.73     283093     652774          0           1

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 50 

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=652774 pr=283093 pw=0 time=102690302 us)
 888030   HASH JOIN RIGHT OUTER (cr=652774 pr=283093 pw=0 time=99929786 us)
  28488    INDEX RANGE SCAN OBJ#(45035) (cr=130 pr=0 pw=0 time=142564 us)(object id 45035)         PBK_GET_NORM_CIN_UK
 888030    HASH JOIN  (cr=652644 pr=283093 pw=0 time=85254971 us)
     15     TABLE ACCESS FULL OBJ#(44893) (cr=7 pr=0 pw=0 time=320 us)                             LIU_TYPES
 888117     TABLE ACCESS BY INDEX ROWID OBJ#(47625) (cr=652637 pr=283093 pw=0 time=63945559 us)    I_JOURNAL
 888179      NESTED LOOPS  (cr=5389 pr=4986 pw=0 time=23801052 us)
     61       INLIST ITERATOR  (cr=213 pr=1 pw=0 time=8299 us)
     61        TABLE ACCESS BY INDEX ROWID OBJ#(44860) (cr=213 pr=1 pw=0 time=7235 us)             LAWFUL_I
     61         INDEX RANGE SCAN OBJ#(45023) (cr=122 pr=0 pw=0 time=2454 us)(object id 45023)      LIN_ID_UK
 888117       INDEX RANGE SCAN OBJ#(52001) (cr=5176 pr=4985 pw=0 time=9904545 us)(object id 52001) IJL_LIN_FK_IX

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                    283093        0.02          8.51
  SQL*Net message from client                     2        0.00          0.00

While reviewing the above you notice that there were 652,774 consistent gets and 283,093 physical block reads.  The wait events associated with the SQL statement indicate that there were 283,093 waits on the db file sequential read wait event, which indicates that all of the physical block reads were performed one block at a time (the Times Waited statistic for this event matches the disk statistic in the Fetch summary line).  If a total of 8.51 seconds (from the Total Waited statistic for the db file sequential read wait event) were needed to complete those 283,093 single block reads, then on average each single block read required approximately 0.000030 seconds – roughly 133 times faster than one revolution of a fast 15,000 RPM hard drive.
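
The arithmetic may be verified with a quick query (15,000 RPM is 250 revolutions per second, so one revolution requires 0.004 seconds):

SELECT
  8.51 / 283093                     AVG_SINGLE_BLOCK_READ_SEC,  /* roughly 0.0000301 seconds */
  (60 / 15000) / (8.51 / 283093)    TIMES_FASTER_THAN_ONE_REV   /* roughly 133 */
FROM
  DUAL;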

If we apply the average single block read time, there were 4,985 single block reads of object ID 52001 (the IJL_LIN_FK_IX index), which suggests that the total time for those physical reads to complete was 4,985 * 0.000030 seconds = 0.15 seconds.  There were 5,176 logical reads for that operation, which is just slightly more than the number of physical blocks that were read from disk for that operation.  The time= statistic for that line in the plan indicates that the operation required 9.90 seconds.  Keeping in mind the other wait events, what happened during the 9.75 seconds of that operation that did not involve the physical block reads?  Notice that the TKPROF summary indicates that the query consumed 105.05 seconds of CPU time with an elapsed time of 102.73 seconds.
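
As a quick check of the above figures (the time= values in the row source operations are reported in microseconds):

SELECT
  4985 * (8.51 / 283093)                        EST_PHYSICAL_READ_SEC,   /* roughly 0.15 seconds */
  9904545 / 1000000                             REPORTED_OPERATION_SEC,  /* roughly 9.90 seconds */
  (9904545 / 1000000) - 4985 * (8.51 / 283093)  UNACCOUNTED_SEC          /* roughly 9.75 seconds */
FROM
  DUAL;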

The DBMS_XPLAN output for the SQL statement looks like this:

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                       |       |       | 28440 (100)|          |
|   1 |  SORT AGGREGATE                   |                       |     1 |   130 |            |          |
|*  2 |   HASH JOIN RIGHT OUTER           |                       | 29838 |  3788K| 28440   (1)| 00:05:42 |
|*  3 |    INDEX RANGE SCAN               | PBK_GET_NORM_CIN_UK   | 25437 |   571K|   130   (0)| 00:00:02 |
|*  4 |    HASH JOIN                      |                       | 29838 |  3117K| 28399   (1)| 00:05:41 |
|*  5 |     TABLE ACCESS FULL             | LIU_TYPES             |    16 |   160 |     3   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS BY INDEX ROWID   | I_JOURNAL             |  1172 | 58600 |  1091   (0)| 00:00:14 |
|   7 |      NESTED LOOPS                 |                       | 30659 |  2904K| 28396   (1)| 00:05:41 |
|   8 |       INLIST ITERATOR             |                       |       |       |            |          |
|*  9 |        TABLE ACCESS BY INDEX ROWID| LAWFUL_I              |    26 |  1222 |    10   (0)| 00:00:01 |
|* 10 |         INDEX RANGE SCAN          | LIN_ID_UK             |    61 |       |     2   (0)| 00:00:01 |
|* 11 |       INDEX RANGE SCAN            | IJL_LIN_FK_IX         |  5318 |       |    30   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("M"."NORM_CIN"="PB"."NORM_CIN")
   3 - access("PB"."OPERATOR_ID"=:SYS_B_00 AND "PB"."DELETION_DATE" IS NULL)
       filter("PB"."DELETION_DATE" IS NULL)
   4 - access("M"."LIU_TYPE"="LT"."LIU_TYPE")
   5 - filter("LT"."LIU_PRIORITY">=:SYS_B_01)
   6 - filter(("LI"."END_VALID_DATE" IS NULL OR "M"."DISPLAY_DATE"<="LI"."END_VALID_DATE"))
   9 - filter("LI"."OPERATOR_ID"=:SYS_B_63)
  10 - access(("LI"."ID"=:SYS_B_02 OR "LI"."ID"=:SYS_B_03 OR "LI"."ID"=:SYS_B_04 OR
              "LI"."ID"=:SYS_B_05 OR "LI"."ID"=:SYS_B_06 OR "LI"."ID"=:SYS_B_07 OR "LI"."ID"=:SYS_B_08 OR
              "LI"."ID"=:SYS_B_09 OR "LI"."ID"=:SYS_B_10 OR "LI"."ID"=:SYS_B_11 OR "LI"."ID"=:SYS_B_12 OR
              "LI"."ID"=:SYS_B_13 OR "LI"."ID"=:SYS_B_14 OR "LI"."ID"=:SYS_B_15 OR "LI"."ID"=:SYS_B_16 OR
              "LI"."ID"=:SYS_B_17 OR "LI"."ID"=:SYS_B_18 OR "LI"."ID"=:SYS_B_19 OR "LI"."ID"=:SYS_B_20 OR
              "LI"."ID"=:SYS_B_21 OR "LI"."ID"=:SYS_B_22 OR "LI"."ID"=:SYS_B_23 OR "LI"."ID"=:SYS_B_24 OR
              "LI"."ID"=:SYS_B_25 OR "LI"."ID"=:SYS_B_26 OR "LI"."ID"=:SYS_B_27 OR "LI"."ID"=:SYS_B_28 OR
              "LI"."ID"=:SYS_B_29 OR "LI"."ID"=:SYS_B_30 OR "LI"."ID"=:SYS_B_31 OR "LI"."ID"=:SYS_B_32 OR
              "LI"."ID"=:SYS_B_33 OR "LI"."ID"=:SYS_B_34 OR "LI"."ID"=:SYS_B_35 OR "LI"."ID"=:SYS_B_36 OR
              "LI"."ID"=:SYS_B_37 OR "LI"."ID"=:SYS_B_38 OR "LI"."ID"=:SYS_B_39 OR "LI"."ID"=:SYS_B_40 OR
              "LI"."ID"=:SYS_B_41 OR "LI"."ID"=:SYS_B_42 OR "LI"."ID"=:SYS_B_43 OR "LI"."ID"=:SYS_B_44 OR
              "LI"."ID"=:SYS_B_45 OR "LI"."ID"=:SYS_B_46 OR "LI"."ID"=:SYS_B_47 OR "LI"."ID"=:SYS_B_48 OR
              "LI"."ID"=:SYS_B_49 OR "LI"."ID"=:SYS_B_50 OR "LI"."ID"=:SYS_B_51 OR "LI"."ID"=:SYS_B_52 OR
              "LI"."ID"=:SYS_B_53 OR "LI"."ID"=:SYS_B_54 OR "LI"."ID"=:SYS_B_55 OR "LI"."ID"=:SYS_B_56 OR
              "LI"."ID"=:SYS_B_57 OR "LI"."ID"=:SYS_B_58 OR "LI"."ID"=:SYS_B_59 OR "LI"."ID"=:SYS_B_60 OR
              "LI"."ID"=:SYS_B_61 OR "LI"."ID"=:SYS_B_62))
  11 - access("M"."LIID"="LI"."LIID")
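
An output of this type might be generated with commands similar to the following (a sketch – the SQL_ID and CHILD_NUMBER values are placeholders that would first need to be located in V$SQL):

SELECT
  SQL_ID,
  CHILD_NUMBER
FROM
  V$SQL
WHERE
  SQL_TEXT LIKE 'SELECT%COUNT ( * )%I_JOURNAL%';

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR('abcd1234efgh5', 0, 'TYPICAL'));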

Setting the STATISTICS_LEVEL parameter at the session level to ALL, followed by executing the SQL statement permitted the following execution plan to be retrieved using DBMS_XPLAN:

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                  | E-Rows | A-Rows | Buffers | Reads  | Writes | A-Time     |
-----------------------------------------------------------------------------------------------------------------------------
|   1 | SORT AGGREGATE                   |                       |      1 |      1 |     657K|    245K|      0 |00:01:40.79 |
|*  2 |  HASH JOIN RIGHT OUTER           |                       |  30607 |    894K|     657K|    245K|      0 |00:01:37.94 |
|*  3 |   INDEX RANGE SCAN               | PBK_GET_NORM_CIN_UK   |  25437 |  28490 |     130 |     19 |      0 |00:00:00.14 |
|*  4 |   HASH JOIN                      |                       |  30607 |    894K|     657K|    245K|      0 |00:01:23.16 |
|*  5 |    TABLE ACCESS FULL             | LIU_TYPES             |     16 |     15 |       7 |      0 |      0 |00:00:00.01 |
|*  6 |    TABLE ACCESS BY INDEX ROWID   | I_JOURNAL             |   1221 |    894K|     657K|    245K|      0 |00:01:01.70 |
|   7 |     NESTED LOOPS                 |                       |  31449 |    894K|    5428 |   5099 |      0 |00:00:24.02 |
|   8 |      INLIST ITERATOR             |                       |        |     61 |     206 |      1 |      0 |00:00:00.01 |
|*  9 |       TABLE ACCESS BY INDEX ROWID| LAWFUL_I              |     26 |     61 |     206 |      1 |      0 |00:00:00.01 |
|* 10 |        INDEX RANGE SCAN          | LIN_ID_UK             |     61 |     61 |     122 |      0 |      0 |00:00:00.01 |
|* 11 |      INDEX RANGE SCAN            | IJL_LIN_FK_IX         |   5318 |    894K|    5222 |   5098 |      0 |00:00:09.99 |
-----------------------------------------------------------------------------------------------------------------------------
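
A sketch of the commands that might produce the above output (assuming a release where DBMS_XPLAN.DISPLAY_CURSOR supports the ALLSTATS LAST format; the SQL statement must be re-executed in the session after the parameter change):

ALTER SESSION SET STATISTICS_LEVEL = ALL;

-- Re-execute the SQL statement of interest, then immediately:
SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));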

Assume that the database version is 10.1.0.x, and that the tables contain the following number of rows:
I_JOURNAL: 5,000,000 rows
LAWFUL_I: 1,000 rows
PHONE_BOOK: 78,000 rows

Here are some of the questions that you might try answering when indicating what you know about the above performance problem:

  • What was the server doing in the 9.75 seconds of the INDEX RANGE SCAN operation on the last line of the execution plan that probably could not be attributed to physical I/O?
  • How is it possible that 105.05 seconds of CPU time were required for a SQL statement that had a total execution time of 102.69 seconds?  What was the CPU doing for 105.05 seconds?
  • How can you explain the 0.000030 second single block read time?
  • Why are all of the bind variables named similar to SYS_B_nn?
  • Are the number of elements in the IN list contributing to the performance problem?
  • Are there any hints that you might try to force a different execution plan, and if so, is there a reason to change the execution plan?
  • Why was there a miss in the library cache during both the parse and execute calls?
  • What caused the incorrect cardinality estimates, and how would you correct the cardinality estimates?
  • Does the ANSI style join syntax have an impact on the execution time for the query?
  • Would you recommend any changes to initialization parameters, and why?
  • What needs to be done to allow the query to complete in 5 seconds?
  • How would the server’s behavior change if significant competition for the server’s RAM caused excessive use of the operating system’s page file?
  • Are the server’s CPUs a constraining resource for this SQL statement – if so, why, if not, why not?

This SQL statement and TKPROF output were part of an interesting question in the OTN forums – I attempted to share my thoughts about this SQL statement in that forum thread.


8 responses

15 03 2010
jametong

* What was the server doing in the 9.75 seconds of the INDEX RANGE SCAN operation on the last line of the execution plan that probably could not be attributed to physical I/O?

The CPU time was mainly spent performing logical I/O, and logical I/O is a time-consuming operation.

* How is it possible that 105.05 seconds of CPU time were required for a SQL statement that had a total execution time of 102.69 seconds? What was the CPU doing for 105.05 seconds?
Can anyone answer this one? Possibly a parallel operation?

* How can you explain the 0.000030 second single block read time?

The physical I/O was mainly satisfied by reads from the filesystem cache.

* Why are all of the bind variables named similar to SYS_B_nn?

The application uses CURSOR_SHARING = FORCE or SIMILAR; the more likely answer is CURSOR_SHARING = FORCE.
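
(As a quick sketch, the parameter setting could be checked with a query such as the following:)

SELECT
  NAME,
  VALUE
FROM
  V$PARAMETER
WHERE
  NAME = 'cursor_sharing';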

* Are the number of elements in the IN list contributing to the performance problem?

I think the number of elements in the IN list contributes little to the performance problem.

* Are there any hints that you might try to force a different execution plan, and if so, is there a reason to change the execution plan?

More detail about the data distribution is needed to decide this.

* Why was there a miss in the library cache during both the parse and execute calls?

It was only executed once? The miss during the execute call may be due to the statement cache combined with an optimizer change.

* What caused the incorrect cardinality estimates, and how would you correct the cardinality estimates?

I have no answer to this.

* Does the ANSI style join syntax have an impact on the execution time for the query?
* Would you recommend any changes to initialization parameters, and why?
* What needs to be done to allow the query to complete in 5 seconds?
* How would the server’s behavior change if significant competition for the server’s RAM caused excessive use of the operating system’s page file?

CPU consumption will rise a lot, and the physical I/O read time will also rise, because there will no longer be so many blocks in the filesystem cache.

* Are the server’s CPUs a constraining resource for this SQL statement – if so, why, if not, why not?

I think the problem is due more to the high number of consistent gets than to a CPU limitation.

15 03 2010
Narendra

I was waiting to see a blog post from you on this. 🙂
I believe jametong has answered most of your questions above, and you yourself have answered many in your response on the forum. Some remaining ones (??) are:
How is it possible that 105.05 seconds of CPU time were required for a SQL statement that had a total execution time of 102.69 seconds? What was the CPU doing for 105.05 seconds?
The difference is simply due to the different ways Oracle uses to derive CPU time and elapsed time. The elapsed time “misses out” on some of the time, especially when there is a large number of small waits (283,093 waits with a maximum wait of 0.02 seconds). I learned this from Tom.
Why was there a miss in the library cache during both the parse and execute calls?
Because the SQL was executed from SQL*Plus.
What caused the incorrect cardinality estimates, and how would you correct the cardinality estimates?
I would verify (and regather) the statistics on the I_JOURNAL table and the IJL_LIN_FK_IX index, which should take care of the difference in cardinality estimates.
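
(A sketch of regathering those statistics, assuming the parsing user owns the objects; CASCADE => TRUE also gathers the statistics for the table's indexes, including IJL_LIN_FK_IX:)

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    OWNNAME          => USER,
    TABNAME          => 'I_JOURNAL',
    CASCADE          => TRUE,
    ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE);
END;
/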
What needs to be done to allow the query to complete in 5 seconds?
Query less data or delete data from tables ?? 🙂 On a serious note, the 5 seconds of expected execution time may not be realistic given the database version, table structures, available CPU/memory etc.
Are the server’s CPUs a constraining resource for this SQL statement – if so, why, if not, why not?
Apparently it looks that way, but I won’t be surprised if it turns out to be an optimizer limitation (or “bug”), given that the database version is 10.1.0.x.
For me, the main suspect is the I_JOURNAL table (and probably its index). If the I_JOURNAL table is in a tablespace with a non-default block size (16K or 32K), would that possibly explain the higher CPU time? I remember Richard Foote describing the increased CPU usage when an index was rebuilt in a 16K block tablespace as compared to an 8K block tablespace. I also wondered whether having LOB datatype columns in the I_JOURNAL table might contribute to the increased CPU time, but as the query is simply doing a COUNT(*), I am not really sure that it matters.
However, access to the I_JOURNAL table (and its index) certainly looks to be the most influential factor (the outer hash join also does not appear to be spilling to disk, so there are probably no issues with the PGA memory settings). So, in order to isolate the issue, I would also trace the following queries:
1) The original query without the I_JOURNAL table
2) A simple COUNT(*) on the I_JOURNAL table alone.

P.S. BTW, if you have noticed in the forum, there is a significant difference in the trace details (especially for the I_JOURNAL table) between the original query and the query with the FULL hint. So it appears that the OP also changed something else, in addition to adding the FULL hint. The key differences follow:
              Original  With FULL Hint        Diff
Rows            888117         5010465     4122348
CR                5389          262646      257257
PR                4986          198250      193264
Elapsed (us)  23801052        35075329    11274277

15 03 2010
Charles Hooper

Great discussion so far – I do not wish to bring an end to the constructive comments. Deleting data is an interesting approach to making the query execute faster, but that might throw off the query results.

For points of reference, this screen shot (10.2.0.x) shows 7,407,092 consistent gets performed in 87.51 seconds, which implies a rate of about 84,643 consistent gets per second on a 4 year old Intel Xeon 3.66GHz server:

Page 308 of the Expert Oracle Practices book shows a TKPROF output with the STATISTICS_LEVEL parameter set to ALL on Oracle 10.2.0.4 where the server performed 128,958,761 consistent gets in 2857.12 seconds, which implies a rate of 45,136 consistent gets per second on a 2.93 GHz Intel Nehalem architecture CPU. The number of consistent gets with the Nehalem CPU jumps significantly on Oracle 10.2.0.4 when the STATISTICS_LEVEL is set to TYPICAL.
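
(The rates quoted above are simple divisions of consistent gets by elapsed seconds, for example:)

SELECT
  7407092 / 87.51       LIO_PER_SEC_XEON,     /* roughly 84,643 */
  128958761 / 2857.12   LIO_PER_SEC_NEHALEM   /* roughly 45,136 */
FROM
  DUAL;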

Edit: I want to repeat that the above are just points of reference for how many logical I/Os might be performed per second.

16 03 2010
Narendra

Charles,

Are those figures (logical I/Os per second) based on reading one block at a time (typically indexed access) or multiple blocks at a time (typically full scan) ?
Another possible reason for the increased CPU usage came to mind: is table and/or index compression used on the I_JOURNAL table (or its index)? Although I am not sure how much additional CPU the compression would cause.

16 03 2010
Charles Hooper

Narendra,

The 10.2.0.x reference refers to consistent gets that were mostly caused by full table scans.

The reference on page 308 of the Expert Oracle Practices book is from a TKPROF summary where the execution plan consisted mainly of nested loop joins with index lookups and one Cartesian join that was caused by statistics that were a couple of hours out of date. The execution time summary that we included on page 313 of the book for the same SQL statement shows 66 minutes and 44.31 seconds for the query execution when the STATISTICS_LEVEL parameter was set to ALL on Oracle 10.2.0.4 and 6 minutes and 56.73 seconds for the query execution with STATISTICS_LEVEL set to TYPICAL on Oracle 10.2.0.4. If we recalculate the number of consistent gets using those numbers, at the setting of ALL the server processed an average of 32,205 consistent gets per second. With the setting at TYPICAL the server performed an average of 309,454 consistent gets per second.
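
(Recalculating those rates from the stated execution times:)

SELECT
  128958761 / (66 * 60 + 44.31)   LIO_PER_SEC_STATS_LEVEL_ALL,      /* roughly 32,205 */
  128958761 / (6 * 60 + 56.73)    LIO_PER_SEC_STATS_LEVEL_TYPICAL   /* roughly 309,000 */
FROM
  DUAL;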

I had not considered the possibility of index compression. I wonder if that would cause a significant change in the CPU usage? I see that Richard Foote has a blog series on index compression: http://richardfoote.wordpress.com/2008/02/17/index-compression-part-i-low/
I also had not considered the possibility of table compression or column level encryption (or for that matter data encryption between the server and client): http://download.oracle.com/docs/cd/E11882_01/server.112/e10595/tables002.htm#insertedID6

But it also makes me wonder if more than 1 session, maybe 500, were all trying to execute this or a similar SQL statement at the same time the OP was attempting to trace the execution – that could effectively reduce the upper limit for the number of consistent gets for the traced session by a factor of 500 or more (while RAM is fast, there is an upper limit to RAM speed).

23 03 2010
Narendra

Charles,

I was hoping you would throw some light on the case in question (either on forum or here). So what can be the reason for the CPU time? Or is that expected?

23 03 2010
Charles Hooper

Narendra,

I have been tied up a bit lately with non-Oracle things – I am giving a presentation about methods of custom report development to an ERP user’s group tomorrow. I will try to provide a couple more comments about the thread in a couple of days.

It might be interesting to find out if the OP is running one of those special Sun servers with the T2000 CPUs (I think that is the number), where each thread receives an effective 300MHz of CPU capacity – that might explain the low number of consistent gets per second.

There are still some questions raised in this blog article that the OP would need to answer to keep this article from slipping into just a guessing game – a lot of logical reasoning has gone into all of the comments, which I am very happy to see.

It could be a problem with:
* Statistics level set to ALL
* Exceeding the RAM memory in the server, causing a lot of swapping to disk. I believe that someone correctly mentioned in the comments of this blog article that the OP would likely lose the benefit of the OS file cache if the server were really starved for RAM. That is probably true, unless a huge SAN cache is actually providing the 0.000030 second average single block read time, or the data resides on SSD hardware (likely also with a large cache), or maybe the OP is doing something scary, like locating the data files on a RAM drive (and in the process consuming a lot of physical RAM).
* Maybe 500 users are all simultaneously trying to execute this SQL statement, or another inefficient statement.
* The number of consistent gets is high enough that I think an execution time of 5 seconds is very unlikely even under ideal conditions. 7 seconds might be possible with a fast CPU and fast access to RAM. Of course we do not know if the database is using 2KB blocks, 32KB blocks, or somewhere in between.
* I think that some of the people who responded in the thread are on the right track – the SQL statement needs to be changed because it currently will not scale to the performance needed by the OP. I see that Iggy Fernandez suggested a materialized view – that might be a step in the right direction, or maybe a more complete change in the data model is needed.

22 02 2013
Log Buffer #183, A Carnival of the Vanities for DBAs

[…] are actually more efficient for solving certain types of problems. In another excellent post titled Physical Reads are Very Fast, Why is this SQL Statement Slow Charles also dives into why a slow query is “actually” slow. There’s an […]
