Simple Query Generates Complex Execution Plan, the Mysterious 4063.88 Second Single Block Read Wait

22 01 2010

January 22, 2010

There was an interesting thread on Oracle’s OTN forums in April 2008:
http://forums.oracle.com/forums/thread.jspa?threadID=642641

In the thread, the original poster wrote:

I have 4 tables. I wrote one query as follows. But it’s taking 45 mins for executing. Can you please write a query for this to improve performance. The tables are as follows:

  1. Case
  2. Item 
  3. Master 
  4. Warehouse

The (table) columns are:

Case: Item, Supply_indicator, Country_Indicator, item_size
Item: Item, location, location_type, ondate, offdate, status, create_date
Master: item, status,  pack_indicator, Item_level(either 1 or 0), Trns_level(either 1 or 0), create_date, Foreind (either Y or N)
Warehouse: Warehose_no, Name, Address

The OP posted the SQL statement and a form of an execution plan:

Comparing the query with the explan plan:

SELECT
  IM.LOCATION,
  IM.ITEM,
  CS.CASE_SIZE,
  IM.ONDATE,
  IM.OFFDATE,
  MAS.STATUS
FROM
  CASE_SIZE CS,
  ITEM IM,
  MASTER MAS,
  WAREHOUSE WH
WHERE
  MAS.PACK_IND = 'N'
  AND MAS.ITEM_LEVEL = MAS.TRNS_LEVEL
  AND MAS.STATUS = 'A'
  AND MAS.FOREIND = 'Y'
  AND MAS.ITEM = IM.ITEM
  AND IM.LOCATION_TYPE = 'S'
  AND MAS.ITEM = CS.ITEM
  AND CS.SUPPLY_INDICATOR = 'Y'
  AND CS.COUNTRY_INDICATOR = 'Y'
  AND IM.LOCATION =WH.WAREHOSE_NO
  AND NVL(WH.CLOSE_DATE,'04-APR-9999')>=TO_DATE(&VERSDATE}, 'YYYYMMDD')

Execution Plan
--------------------------------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=15085 Card=1139512 Bytes=134462416)
   1    0   HASH JOIN (Cost=15085 Card=1139512 Bytes=134462416)
   2    1     HASH JOIN (Cost=580 Card=30772 Bytes=1569372)
   3    2       TABLE ACCESS (FULL) OF 'MASTER' (Cost=63 Card=1306 Bytes=18284)
   4    2       TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'ITEM' (Cost=2 Card=105311 Bytes=3369952)
   5    4         NESTED LOOPS (Cost=86 Card=4296685 Bytes=158977345)
   6    5           TABLE ACCESS (FULL) OF 'WAREHOUSE' (Cost=4 Card=41 Bytes=205)
   7    5           INDEX (RANGE SCAN) OF 'PK_LOCATION' (UNIQUE) (Cost=1 Card=210622)
   8    1     VIEW (Cost=14271 Card=48399 Bytes=3242733)
   9    8       SORT (UNIQUE) (Cost=14271 Card=48399 Bytes=6098274)
  10    9         HASH JOIN (Cost=992 Card=187614 Bytes=23639364)
  11   10           HASH JOIN (Cost=186 Card=7449 Bytes=581022)
  12   11             TABLE ACCESS (FULL) OF 'MASTER' (Cost=63 Card=10451 Bytes=156765)
  13   11             HASH JOIN (Cost=105 Card=12489 Bytes=786807)
  14   13               MERGE JOIN (CARTESIAN) (Cost=40 Card=12489 Bytes=549516)
  15   14                 MERGE JOIN (CARTESIAN) (Cost=6 Card=1 Bytes=13)
  16   15                   TABLE ACCESS (FULL) OF 'SYSTEM' (Cost=3 Card=1 Bytes=3)
  17   15                   BUFFER (SORT) (Cost=3 Card=1 Bytes=10)
  18   17                     TABLE ACCESS (FULL) OF 'SYSTEM' (Cost=3 Card=1 Bytes=10)
  19   14                 BUFFER (SORT) (Cost=37 Card=12489 Bytes=387159)
  20   19                   TABLE ACCESS (FULL) OF 'ITEM_SUPPLIER_COUNTRY' (Cost=34 Card=12489 Bytes=387159)
  21   13               TABLE ACCESS (FULL) OF 'SUPPLIER' (Cost=28 Card=24989 Bytes=474791)
  22   10           VIEW (Cost=536 Card=172449 Bytes=8277552)
  23   22             UNION-ALL
  24   23               INDEX (FAST FULL SCAN) OF 'PK_ITEM_SUPPLIER_COUNTRY' (UNIQUE) (Cost=11 Card=24978 Bytes=324714)
  25   23               TABLE ACCESS (FULL) OF 'ITEM_SUPPLIER_COUNTRY' (Cost=34 Card=24978 Bytes=399648)
  26   23               TABLE ACCESS (FULL) OF 'ITEM_SUPPLIER_COUNTRY' (Cost=34 Card=24978 Bytes=374670)
  27   23               TABLE ACCESS (FULL) OF 'ITEM_SUPPLIER_COUNTRY' (Cost=34 Card=24978 Bytes=499560)
  28   23               VIEW (Cost=141 Card=24179 Bytes=1039697)
  29   28                 SORT (UNIQUE) (Cost=141 Card=24179 Bytes=507759)
  30   29                   INDEX (FAST FULL SCAN) OF 'PK_CASE_UPDATES' (UNIQUE) (Cost=14 Card=24179 Bytes=507759)
  31   23               VIEW (Cost=141 Card=24179 Bytes=1039697)
  32   31                 SORT (UNIQUE) (Cost=141 Card=24179 Bytes=507759)
  33   32                   INDEX (FAST FULL SCAN) OF 'PK_CASE_UPDATES' (UNIQUE) (Cost=14 Card=24179 Bytes=507759)
  34   23               VIEW (Cost=141 Card=24179 Bytes=1039697)
  35   34                 SORT (UNIQUE) (Cost=141 Card=24179 Bytes=507759)
  36   35                   INDEX (FAST FULL SCAN) OF 'PK_CASE_UPDATES' (UNIQUE) (Cost=14 Card=24179 Bytes=507759)

So, how could a query of 4 tables produce the above execution plan, and is there anything wrong with the plan?

The OP also generated a 10046 trace and processed that trace file with TKPROF.  The TKPROF output for this SQL statement is as follows:

(select statement)
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.32       0.29          0        593          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   819704    874.76    1529.15    3879188   12619996          5    12295541
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   819706    875.08    1529.45    3879188   12620589          5    12295541

Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 315  (DEVEL)

 Rows     Row Source Operation
 -------  ---------------------------------------------------
12295541  HASH JOIN  (cr=12619996 r=3879188 w=2408 time=1503903336 us)
  212315   VIEW  (cr=7553336 r=2408 w=2408 time=82297538 us)
(as per my prevous post)

Rows     Execution Plan
-------  ---------------------------------------------------
(as per  my previous post)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                  819704        0.00          0.45
  direct path write                               4        0.00          0.00
  direct path read                               51        0.00          0.07
  db file sequential read                     94884        0.41         60.91
  db file scattered read                     222696     2302.70       4675.10
  SQL*Net message from client                819704     3672.87       6299.98
  latch free                                      2        0.00          0.00

How could the query of the 4 tables cause TKPROF to output the above information, and is there anything wrong with the information provided?  To get you started, what would cause a 2,302.70 second wait for a multi-block read?

The OP also posted the following portion of a 10046 trace:

WAIT #3: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
FETCH #3:c=0,e=131,p=0,cr=1,cu=0,mis=0,r=15,dep=0,og=4,tim=1180445875641857
WAIT #3: nam='SQL*Net message from client' ela= 704 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message to client' ela= 1 p1=1650815232 p2=1 p3=0
FETCH #3:c=0,e=140,p=0,cr=1,cu=0,mis=0,r=15,dep=0,og=4,tim=1180445875642773
WAIT #3: nam='SQL*Net message from client' ela= 668 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message to client' ela= 1 p1=1650815232 p2=1 p3=0
FETCH #3:c=0,e=140,p=0,cr=2,cu=0,mis=0,r=15,dep=0,og=4,tim=1180445875643631
WAIT #3: nam='SQL*Net message from client' ela= 4179109066 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
FETCH #3:c=10000,e=144,p=0,cr=1,cu=0,mis=0,r=15,dep=0,og=4,tim=1180445875643757
WAIT #3: nam='SQL*Net message from client' ela= 660 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
FETCH #3:c=0,e=128,p=0,cr=1,cu=0,mis=0,r=15,dep=0,og=4,tim=1180445875644594

Anything wrong in the above?

What if we try real hard to analyze the above trace file snippet, anything wrong?  I will cheat and use my Toy Project for Performance Tuning to generate the following output.

Fetch at 1180445875.641860 (Parse to Fetch 1180445875.641860),CPU Time 0.000000,Elapsed Time 0.000131,Rows Retrievd 15,Blks from Buff 1,Blks from Disk 0
     0.000704   SQL*Net message from client
     0.000001   SQL*Net message to client
Fetch at 1180445875.642770 (Parse to Fetch 1180445875.642770),CPU Time 0.000000,Elapsed Time 0.000140,Rows Retrievd 15,Blks from Buff 1,Blks from Disk 0
     0.000668   SQL*Net message from client
     0.000001   SQL*Net message to client
Fetch at 1180445875.643630 (Parse to Fetch 1180445875.643630),CPU Time 0.000000,Elapsed Time 0.000140,Rows Retrievd 15,Blks from Buff 2,Blks from Disk 0
  4179.109066   SQL*Net message from client
     0.000000   SQL*Net message to client
Fetch at 1180445875.643760 (Parse to Fetch 1180445875.643760),CPU Time 0.010000,Elapsed Time 0.000144,Rows Retrievd 15,Blks from Buff 1,Blks from Disk 0
     0.000660   SQL*Net message from client
     0.000000   SQL*Net message to client
Fetch at 1180445875.644590 (Parse to Fetch 1180445875.644590),CPU Time 0.000000,Elapsed Time 0.000128,Rows Retrievd 15,Blks from Buff 1,Blks from Disk 0

Don’t worry too much if the above does not make too much sense.  At 1180445875.641860 seconds there was a fetch call that required 0 CPU seconds, 0.000131 elapsed seconds, 15 rows were retrieved, there was one consistent get, and 0 physical reads.  The next fetch happened 0.00091 (5.642770 – 5.641860) seconds later with two short duration wait events between the two fetch calls (0.000704 seconds on SQL*Net message from client and 0.000001 seconds on SQL*Net message to client).  Take a closer look at the waits between the fetch call at 1180445875.643630 seconds and the fetch call at 1180445875.643760 seconds (0.00013 later).

The OP also posted the following odd snippet from a 10046 trace:

WAIT #3: nam='db file sequential read' ela= 260 p1=354 p2=128875 p3=1
WAIT #3: nam='db file sequential read' ela= 180 p1=354 p2=128878 p3=1
WAIT #3: nam='db file sequential read' ela= 179 p1=354 p2=128877 p3=1
WAIT #3: nam='db file sequential read' ela= 178 p1=354 p2=128880 p3=1
WAIT #3: nam='db file sequential read' ela= 192 p1=354 p2=128879 p3=1
WAIT #3: nam='db file sequential read' ela= 197 p1=354 p2=128882 p3=1
WAIT #3: nam='db file sequential read' ela= 192 p1=354 p2=128881 p3=1
WAIT #3: nam='db file sequential read' ela= 4063882583 p1=354 p2=128884 p3=1
WAIT #3: nam='db file sequential read' ela= 194 p1=354 p2=128883 p3=1
WAIT #3: nam='db file sequential read' ela= 180 p1=354 p2=128885 p3=1
WAIT #3: nam='db file sequential read' ela= 192 p1=354 p2=128887 p3=1
WAIT #3: nam='db file sequential read' ela= 176 p1=354 p2=128886 p3=1
WAIT #3: nam='db file sequential read' ela= 179 p1=354 p2=128889 p3=1
WAIT #3: nam='db file sequential read' ela= 186 p1=354 p2=128888 p3=1

So, what is odd about the above?  Is there an explanation?

The thread contains other interesting details, but it does not look like the OP ever received an explanation for why his query required 45 minutes to execute.  My money is still on a faulty RAID 10 floppy array.