Physical Reads are Very Fast, Why is this SQL Statement Slow

15 03 2010

March 15, 2010

How would you troubleshoot the following situation – what do you know, or believe to be true with at least a 95% confidence about this situation?  You encounter a SQL statement that is taking much longer to execute than is expected.  You enable an extended SQL trace at level 8 and pass the resulting trace file though TKPROF.  In the TKPROF output you find the SQL statement that is of interest:

SELECT          
      COUNT ( * )
  FROM            I_JOURNAL m
               INNER JOIN
                  LIU_TYPES lt
               ON (m.LIU_TYPE = lt.LIU_TYPE)
            INNER JOIN
               LAWFUL_I li
            ON (m.LIID = li.LIID)
         LEFT OUTER JOIN
            PHONE_BOOK pb
         ON (    m.NORM_CIN = pb.NORM_CIN
             AND pb.DELETION_DATE IS NULL
             AND pb.OPERATOR_ID = :"SYS_B_00")
 WHERE   LIU_PRIORITY >= :"SYS_B_01"
         AND (li.ID IN
                    (:"SYS_B_02",
                     :"SYS_B_03",
                     :"SYS_B_04",
                     :"SYS_B_05",
                     :"SYS_B_06",
                     :"SYS_B_07",
                     :"SYS_B_08",
                     :"SYS_B_09",
[...]
                     :"SYS_B_59",
                     :"SYS_B_60",
                     :"SYS_B_61",
                     :"SYS_B_62"))
         AND (li.END_VALID_DATE IS NULL
              OR m.DISPLAY_DATE <= li.END_VALID_DATE)
         AND li.OPERATOR_ID = :"SYS_B_63"

In the above, roughly 50 lines of the SQL statement containing bind variables for the IN list were removed to save space.  The summary information for the SQL statement’s execution follows:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.05       0.04          0          0          0           0
Fetch        2    105.00     102.69     283093     652774          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4    105.05     102.73     283093     652774          0           1

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 50 

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=652774 pr=283093 pw=0 time=102690302 us)
 888030   HASH JOIN RIGHT OUTER (cr=652774 pr=283093 pw=0 time=99929786 us)
  28488    INDEX RANGE SCAN OBJ#(45035) (cr=130 pr=0 pw=0 time=142564 us)(object id 45035)         PBK_GET_NORM_CIN_UK
 888030    HASH JOIN  (cr=652644 pr=283093 pw=0 time=85254971 us)
     15     TABLE ACCESS FULL OBJ#(44893) (cr=7 pr=0 pw=0 time=320 us)                             LIU_TYPES
 888117     TABLE ACCESS BY INDEX ROWID OBJ#(47625) (cr=652637 pr=283093 pw=0 time=63945559 us)    I_JOURNAL
 888179      NESTED LOOPS  (cr=5389 pr=4986 pw=0 time=23801052 us)
     61       INLIST ITERATOR  (cr=213 pr=1 pw=0 time=8299 us)
     61        TABLE ACCESS BY INDEX ROWID OBJ#(44860) (cr=213 pr=1 pw=0 time=7235 us)             LAWFUL_I
     61         INDEX RANGE SCAN OBJ#(45023) (cr=122 pr=0 pw=0 time=2454 us)(object id 45023)      LIN_ID_UK
 888117       INDEX RANGE SCAN OBJ#(52001) (cr=5176 pr=4985 pw=0 time=9904545 us)(object id 52001) IJL_LIN_FK_IX

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                    283093        0.02          8.51
  SQL*Net message from client                     2        0.00          0.00

While reviewing the above you notice that there were 652,774 consistent gets and 283,093 physical block reads.  The wait events associated with the SQL statement indicate that there were 283,093 waits on the db file sequential read wait event, which indicates that all of the physical block reads were performed one block at a time (the Times Waited statistic for this event matches the disk statistic in the Fetch summary line).  If a total of 8.51 seconds (from the Total Waited statistic for the db file sequential read wait event) were needed to complete those 283,093 single block reads, then on average each single block read required approximately 0.000030 seconds – roughly 133 times faster than one revolution of a fast 15,000 RPM hard drive.

If we experiment with the average single block read time, there were 4,985 single block reads of object ID 52001 (the IJL_LIN_FK_IX index), which suggests that the total time for those physical reads to complete was 4,985 * 0.000030 seconds = 0.15 seconds.  There were 5,176 logical reads for that operation, which is just slightly more than the number of physical blocks that were read from disk for that operation.  The time= statistic for that line in the plan indicates that the operation required 9.90 seconds.  Keeping in mind the other wait events, what happened during the 9.75 seconds of that operation that did not involve the physical block reads?  Notice that the TKPROF output shows that the query statistics summary indicated that the query consumed 105.05 seconds of CPU time with an elapsed query time of 102.73 seconds.

The DBMS_XPLAN output for the SQL statement looks like this:

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                       |       |       | 28440 (100)|          |
|   1 |  SORT AGGREGATE                   |                       |     1 |   130 |            |          |
|*  2 |   HASH JOIN RIGHT OUTER           |                       | 29838 |  3788K| 28440   (1)| 00:05:42 |
|*  3 |    INDEX RANGE SCAN               | PBK_GET_NORM_CIN_UK   | 25437 |   571K|   130   (0)| 00:00:02 |
|*  4 |    HASH JOIN                      |                       | 29838 |  3117K| 28399   (1)| 00:05:41 |
|*  5 |     TABLE ACCESS FULL             | LIU_TYPES             |    16 |   160 |     3   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS BY INDEX ROWID   | I_JOURNAL             |  1172 | 58600 |  1091   (0)| 00:00:14 |
|   7 |      NESTED LOOPS                 |                       | 30659 |  2904K| 28396   (1)| 00:05:41 |
|   8 |       INLIST ITERATOR             |                       |       |       |            |          |
|*  9 |        TABLE ACCESS BY INDEX ROWID| LAWFUL_I              |    26 |  1222 |    10   (0)| 00:00:01 |
|* 10 |         INDEX RANGE SCAN          | LIN_ID_UK             |    61 |       |     2   (0)| 00:00:01 |
|* 11 |       INDEX RANGE SCAN            | IJL_LIN_FK_IX         |  5318 |       |    30   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("M"."NORM_CIN"="PB"."NORM_CIN")
   3 - access("PB"."OPERATOR_ID"=:SYS_B_00 AND "PB"."DELETION_DATE" IS NULL)
       filter("PB"."DELETION_DATE" IS NULL)
   4 - access("M"."LIU_TYPE"="LT"."LIU_TYPE")
   5 - filter("LT"."LIU_PRIORITY">=:SYS_B_01)
   6 - filter(("LI"."END_VALID_DATE" IS NULL OR "M"."DISPLAY_DATE"<="LI"."END_VALID_DATE"))
   9 - filter("LI"."OPERATOR_ID"=:SYS_B_63)
  10 - access(("LI"."ID"=:SYS_B_02 OR "LI"."ID"=:SYS_B_03 OR "LI"."ID"=:SYS_B_04 OR
              "LI"."ID"=:SYS_B_05 OR "LI"."ID"=:SYS_B_06 OR "LI"."ID"=:SYS_B_07 OR "LI"."ID"=:SYS_B_08 OR
              "LI"."ID"=:SYS_B_09 OR "LI"."ID"=:SYS_B_10 OR "LI"."ID"=:SYS_B_11 OR "LI"."ID"=:SYS_B_12 OR
              "LI"."ID"=:SYS_B_13 OR "LI"."ID"=:SYS_B_14 OR "LI"."ID"=:SYS_B_15 OR "LI"."ID"=:SYS_B_16 OR
              "LI"."ID"=:SYS_B_17 OR "LI"."ID"=:SYS_B_18 OR "LI"."ID"=:SYS_B_19 OR "LI"."ID"=:SYS_B_20 OR
              "LI"."ID"=:SYS_B_21 OR "LI"."ID"=:SYS_B_22 OR "LI"."ID"=:SYS_B_23 OR "LI"."ID"=:SYS_B_24 OR
              "LI"."ID"=:SYS_B_25 OR "LI"."ID"=:SYS_B_26 OR "LI"."ID"=:SYS_B_27 OR "LI"."ID"=:SYS_B_28 OR
              "LI"."ID"=:SYS_B_29 OR "LI"."ID"=:SYS_B_30 OR "LI"."ID"=:SYS_B_31 OR "LI"."ID"=:SYS_B_32 OR
              "LI"."ID"=:SYS_B_33 OR "LI"."ID"=:SYS_B_34 OR "LI"."ID"=:SYS_B_35 OR "LI"."ID"=:SYS_B_36 OR
              "LI"."ID"=:SYS_B_37 OR "LI"."ID"=:SYS_B_38 OR "LI"."ID"=:SYS_B_39 OR "LI"."ID"=:SYS_B_40 OR
              "LI"."ID"=:SYS_B_41 OR "LI"."ID"=:SYS_B_42 OR "LI"."ID"=:SYS_B_43 OR "LI"."ID"=:SYS_B_44 OR
              "LI"."ID"=:SYS_B_45 OR "LI"."ID"=:SYS_B_46 OR "LI"."ID"=:SYS_B_47 OR "LI"."ID"=:SYS_B_48 OR
              "LI"."ID"=:SYS_B_49 OR "LI"."ID"=:SYS_B_50 OR "LI"."ID"=:SYS_B_51 OR "LI"."ID"=:SYS_B_52 OR
              "LI"."ID"=:SYS_B_53 OR "LI"."ID"=:SYS_B_54 OR "LI"."ID"=:SYS_B_55 OR "LI"."ID"=:SYS_B_56 OR
              "LI"."ID"=:SYS_B_57 OR "LI"."ID"=:SYS_B_58 OR "LI"."ID"=:SYS_B_59 OR "LI"."ID"=:SYS_B_60 OR
              "LI"."ID"=:SYS_B_61 OR "LI"."ID"=:SYS_B_62))
  11 - access("M"."LIID"="LI"."LIID")

Setting the STATISTICS_LEVEL parameter at the session level to ALL, followed by executing the SQL statement permitted the following execution plan to be retrieved using DBMS_XPLAN:

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                  | E-Rows | A-Rows | Buffers | Reads  | Writes | A-Time     |
-----------------------------------------------------------------------------------------------------------------------------
|   1 | SORT AGGREGATE                   |                       |      1 |      1 |     657K|    245K|      0 |00:01:40.79 |
|*  2 |  HASH JOIN RIGHT OUTER           |                       |  30607 |    894K|     657K|    245K|      0 |00:01:37.94 |
|*  3 |   INDEX RANGE SCAN               | PBK_GET_NORM_CIN_UK   |  25437 |  28490 |     130 |     19 |      0 |00:00:00.14 |
|*  4 |   HASH JOIN                      |                       |  30607 |    894K|     657K|    245K|      0 |00:01:23.16 |
|*  5 |    TABLE ACCESS FULL             | LIU_TYPES             |     16 |     15 |       7 |      0 |      0 |00:00:00.01 |
|*  6 |    TABLE ACCESS BY INDEX ROWID   | I_JOURNAL             |   1221 |    894K|     657K|    245K|      0 |00:01:01.70 |
|   7 |     NESTED LOOPS                 |                       |  31449 |    894K|    5428 |   5099 |      0 |00:00:24.02 |
|   8 |      INLIST ITERATOR             |                       |        |     61 |     206 |      1 |      0 |00:00:00.01 |
|*  9 |       TABLE ACCESS BY INDEX ROWID| LAWFUL_I              |     26 |     61 |     206 |      1 |      0 |00:00:00.01 |
|* 10 |        INDEX RANGE SCAN          | LIN_ID_UK             |     61 |     61 |     122 |      0 |      0 |00:00:00.01 |
|* 11 |      INDEX RANGE SCAN            | IJL_LIN_FK_IX         |   5318 |    894K|    5222 |   5098 |      0 |00:00:09.99 |
-----------------------------------------------------------------------------------------------------------------------------

Assume that the database version is 10.1.0.x, and that the tables contain the following number of rows:
I_JOURNAL: 5,000,000 rows
LAWFUL_I: 1,000 rows
PHONE_BOOK: 78,000 rows

Here are some of the questions that you might try answering when indicating what you know about the above performance problem:

  • What was the server doing in the 9.75 seconds of the INDEX RANGE SCAN operation on the last line of the execution plan that probably could not be attributed to physical I/O?
  • How is it possible that 105.05 seconds of CPU time were required for a SQL statement that had a total execution time of 102.69 seconds?  What was the CPU doing for 105.05 seconds?
  • How can you explain the 0.000030 second single block read time?
  • Why are all of the bind variables named similar to SYS_B_nn?
  • Are the number of elements in the IN list contributing to the performance problem?
  • Are there any hints that you might try to force a different execution plan, and if so, is there a reason to change the execution plan?
  • Why was there a miss in the library cache during both the parse and execute calls?
  • What caused the incorrect cardinality estimates, and how would you correct the cardinality estimates?
  • Does the ANSI style join syntax have an impact on the execution time for the query?
  • Would you recommend any changes to initialization parameters, and why?
  • What needs to be done to allow the query to complete in 5 seconds?
  • How would the server’s behavior change if significant competition for the server’s RAM caused excessive use of the operating system’s page file?
  • Are the server’s CPUs a constraining resource for this SQL statement – if so, why, if not, why not?

This SQL statement and TKPROF output where part of an interesting question in the OTN forums – I attempted to share my thoughts about this SQL statement in that forum thread.








Follow

Get every new post delivered to your Inbox.

Join 144 other followers