TKPROF Elapsed Time Challenge – the Elapsed Time is Half of the Wait Event Time

20 05 2012

May 20, 2012

An interesting quirk was recently brought to my attention by Mich Talebzadeh.  He generated a 10046 trace at level 8 for a session, executed some SQL statements, disabled the trace, and then processed the resulting trace file with TKPROF.  His TKPROF output included the following:

UPDATE TESTWRITES SET PADDING1 = RPAD('y',4000,'y')
WHERE
OBJECT_ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        4      0.00       0.00          0          0          0           0
Execute  17293     21.62    1245.41    1064667    2411398    6280588     1000000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    17297     21.62    1245.41    1064667    2411398    6280588     1000000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 93     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  TESTWRITES (cr=214 pr=427 pw=0 time=0 us)
    100   INDEX UNIQUE SCAN TESTWRITES_PK (cr=110 pr=3 pw=0 time=0 us cost=2 size=4007 card=1)(object id 92380)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  db file sequential read                   2160263        1.57       2304.22
  db file scattered read                      27769        0.26         99.78 

Mich noticed that the reported elapsed time, at 1,245.41 seconds, is less than the sum of the wait event times, at 2,400.00 seconds.  Typically, the sum of the wait events plus the CPU time should be equal to or slightly less than the elapsed time (assuming that parallel query is not used, and the CPU reporting granularity did not cause odd side effects, and the session’s process did not spend much time in the CPU run queue).  In this case, the elapsed time is about 51.9% of the wait event time for this SQL statement. 

How is this possible?  Mich stated the following:

“A comment has been it is due to using different clock granularities.”

Think about that for a minute – what could explain the odd inconsistency in the output?

I recall reading in the book “Optimizing Oracle Performance” about double-counting of statistics due to recursive database calls.  Page 92 of that book includes the following quote:

“In Oracle releases through at least Oracle9i Release 2, a database call’s c, e, p, cr, and cu statistics contain an aggregation of the resources consumed by the database call itself and its entire recursive progeny.”

Page 91 of the same book makes the following statement:

“The rule for determining the recursive relationships among database calls is simple:
A database call with dep=n + 1 is the recursive child of the first subsequent dep=n database call listed in the SQL trace data stream.”

Could it be the case that this particular issue with recursive calls was addressed in later releases of TKPROF?  Might that fix explain why the elapsed time is 51.9% of the wait event time?

The OP provided a partial test case script with a description of how the tables were created.  I reduced the size of the original source table from 1,700,000 rows to 60,000 rows, and reduced the secondary source table from 1,000,000 rows to 50,000 rows.  My test script follows (note that the line that builds the IND_T2_OBJ index is commented out – we will change that later):

DROP TABLE T1 PURGE;
DROP TABLE T2 PURGE;

CREATE TABLE T1 AS
SELECT
  *
FROM
  ALL_OBJECTS
WHERE
  ROWNUM<=60000;

ALTER TABLE T1 ADD (
  PADDING1 VARCHAR2(4000),
  PADDING2 VARCHAR2(4000));

CREATE TABLE T2 AS
SELECT
  *
FROM
  T1
WHERE
  ROWNUM<=50000;

CREATE UNIQUE INDEX IND_T1_OBJ ON T1(OBJECT_ID);
--CREATE UNIQUE INDEX IND_T2_OBJ ON T2(OBJECT_ID);

ALTER SYSTEM FLUSH SHARED_POOL;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE ( waits=>true, plan_stat=>'ALL_EXECUTIONS' );

DECLARE
        type ObjIdArray is table of T1.object_id%TYPE index by binary_integer;
        l_ids objIdArray;
        CURSOR c IS SELECT object_id FROM T1;
BEGIN
        OPEN c;
        LOOP
                BEGIN
                        FETCH c BULK COLLECT INTO l_ids LIMIT 100;

                        FORALL rs in 1 .. l_ids.COUNT
                                UPDATE T2
                                  SET PADDING1 =  RPAD('y',4000,'y')
                                WHERE object_id = l_ids(rs);
                        commit;
                        FORALL rs in 1 .. l_ids.COUNT
                                DELETE FROM T2
                                WHERE object_id = l_ids(rs);
                        commit;
                        FORALL rs in 1 .. l_ids.COUNT
                                INSERT INTO T2
                                SELECT * FROM T1 t WHERE t.object_id = l_ids(rs);
                        commit;
                        EXIT WHEN C%NOTFOUND;
                EXCEPTION
                  WHEN NO_DATA_FOUND THEN
                    NULL;
                  WHEN OTHERS THEN
                    DBMS_OUTPUT.PUT_LINE('Transaction failed');
                END;
        END LOOP;
        CLOSE c;
END;
/

EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE;

That test case definitely contains calls in the trace file at a dep greater than zero.  As a matter of fact, the SQL statement of interest is at a dep of one, resulting in space management (and other calls) at a dep of two or greater:

PARSING IN CURSOR #397285536 len=66 dep=1 uid=64 oct=6 lid=64 tim=1879257875778 hv=3592437181 ad='3edea5858' sqlid='1ahnbczb20gdx'
UPDATE T2 SET PADDING1 = RPAD('y',4000,'y') WHERE OBJECT_ID = :B1 
END OF STMT
PARSE #397285536:c=0,e=169,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=1879257875778

I executed the test case script on 64 bit Windows with Oracle Database 11.2.0.2 with patch 5 installed.  I processed the resulting trace file with TKPROF from the 32 bit Windows Oracle 11.2.0.1 client with the ODBC update patch 10155837 installed.  My TKPROF output for the UPDATE statement showed the following:

SQL ID: 1ahnbczb20gdx
Plan Hash: 2722410703
UPDATE T2 SET PADDING1 = RPAD('y',4000,'y') 
WHERE
 OBJECT_ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    600    197.38     197.60        452   50299237     426582       50000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      601    197.38     197.60        452   50299237     426582       50000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  UPDATE  T2 (cr=80409 pr=436 pw=0 time=357525 us)
       100        100        100   TABLE ACCESS FULL T2 (cr=80182 pr=420 pw=0 time=337295 us cost=116 size=16120 card=8)
         0          0          0  UPDATE  T2 (cr=83795 pr=0 pw=0 time=344133 us)
       100        100        100   TABLE ACCESS FULL T2 (cr=83598 pr=0 pw=0 time=332180 us cost=116 size=16120 card=8)
         0          0          0  UPDATE  T2 (cr=83790 pr=2 pw=0 time=303134 us)
       100        100        100   TABLE ACCESS FULL T2 (cr=83596 pr=0 pw=0 time=288815 us cost=116 size=16120 card=8)
         0          0          0  UPDATE  T2 (cr=83789 pr=2 pw=0 time=283738 us)
       100        100        100   TABLE ACCESS FULL T2 (cr=83595 pr=0 pw=0 time=271190 us cost=116 size=16120 card=8)
...
         0          0          0  UPDATE  T2 (cr=87900 pr=0 pw=0 time=375297 us)
         0          0          0   TABLE ACCESS FULL T2 (cr=87900 pr=0 pw=0 time=375015 us cost=116 size=16120 card=8)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                        36        0.00          0.01
  db file scattered read                         31        0.00          0.01

I decided to compare the output of TKPROF with the output of my Hyper-Extended Oracle Performance Monitor program, which I know does not attempt to resolve parent and child recursive statistics:

Statement Depth 1 (Trigger Code)
Cursor 220   Ver 1   Parse at 0.254283 SQL_ID='1ahnbczb20gdx'  (TD Prev 0.000877)  Similar Cnt 1
|PARSEs       1|CPU S    0.000000|CLOCK S    0.000169|ROWs        0|PHY RD BLKs         0|CON RD BLKs (Mem)         0|CUR RD BLKs (Mem)         0|SHARED POOL MISs      1|
|EXECs      600|CPU S  197.403668|CLOCK S  197.659011|ROWs    50000|PHY RD BLKs       774|CON RD BLKs (Mem)  50299674|CUR RD BLKs (Mem)    426582|SHARED POOL MISs      1|
|FETCHs       0|CPU S    0.000000|CLOCK S    0.000000|ROWs        0|PHY RD BLKs         0|CON RD BLKs (Mem)         0|CUR RD BLKs (Mem)         0|SHARED POOL MISs      0|
  CPU S 25.08%  CLOCK S 25.01%
|               +++++||               +++++|
UPDATE T2 SET PADDING1 = RPAD('y',4000,'y') WHERE OBJECT_ID = :B1

       (Rows 0)   UPDATE  T2 (cr=80409 pr=436 pw=0 time=357525 us)
     (Rows 100)    TABLE ACCESS FULL T2 (cr=80182 pr=420 pw=0 time=337295 us cost=116 size=16120 card=8)

       (Rows 0)   UPDATE  T2 (cr=83795 pr=0 pw=0 time=344133 us)
     (Rows 100)    TABLE ACCESS FULL T2 (cr=83598 pr=0 pw=0 time=332180 us cost=116 size=16120 card=8)
       (Rows 0)   UPDATE  T2 (cr=83790 pr=2 pw=0 time=303134 us)
     (Rows 100)    TABLE ACCESS FULL T2 (cr=83596 pr=0 pw=0 time=288815 us cost=116 size=16120 card=8)
...

Notice that the TKPROF summary shows 0.059011 fewer elapsed seconds for this SQL statement, 0.023668 fewer CPU seconds, 322 fewer blocks read from disk, 437 fewer consistent gets, and the same number of current mode reads.  Why the differences?

My program wrote the wait event details for the UPDATE statement to an Excel spreadsheet.  Examination of the wait events indicated that there were 36 db file sequential read waits (thus 36 total blocks read from disk), and 31 db file scattered read waits (416 total blocks read from disk) – the wait event details support the statistics output by TKPROF.  So, maybe the double counting of recursive calls was fixed since the release of Oracle Database 9.2.  Nice!  But we are not done yet, we still do not have the index on table T2, as was described by the OP.

Locate in the script the following line:

--CREATE UNIQUE INDEX IND_T2_OBJ ON T2(OBJECT_ID);

Change that line by removing the at the start of the line, so that the line appears like this:

CREATE UNIQUE INDEX IND_T2_OBJ ON T2(OBJECT_ID);

Disconnect from the database, then reconnect (so that you will be assigned a different default trace filename).  Now, re-execute the script, and process the resulting trace file with TKPROF.  My TKPROF output for the UPDATE SQL statement looks like this:

SQL ID: 1ahnbczb20gdx
Plan Hash: 1751280057
UPDATE T2 SET PADDING1 = RPAD('y',4000,'y') 
WHERE
 OBJECT_ID = :B1 

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    190      1.35       1.42        348      52552     165435       19000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      191      1.35       1.42        348      52552     165435       19000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  UPDATE  T2 (cr=332 pr=30 pw=0 time=19223 us)
       100        100        100   INDEX UNIQUE SCAN IND_T2_OBJ (cr=105 pr=8 pw=0 time=1372 us cost=1 size=2015 card=1)(object id 103855)
         0          0          0  UPDATE  T2 (cr=288 pr=2 pw=0 time=8629 us)
       100        100        100   INDEX UNIQUE SCAN IND_T2_OBJ (cr=105 pr=0
...
         0          0          0  UPDATE  T2 (cr=291 pr=0 pw=0 time=6051 us)
       100        100        100   INDEX UNIQUE SCAN IND_T2_OBJ (cr=105 pr=0 pw=0 time=430 us cost=1 size=2015 card=1)(object id 103855)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                         40        0.00          0.03
  db file sequential read                        32        0.00          0.01
  log file switch completion                      2        0.00          0.01

For the sake of comparison, I also processed the trace file using my Hyper-Extended Oracle Performance Monitor program.  This is my program’s output:

Statement Depth 1 (Trigger Code)
Cursor 910   Ver 1   Parse at 0.187068 SQL_ID='1ahnbczb20gdx'  (TD Prev 0.000996)  Similar Cnt 1
|PARSEs       1|CPU S    0.000000|CLOCK S    0.000162|ROWs        0|PHY RD BLKs         0|CON RD BLKs (Mem)         0|CUR RD BLKs (Mem)         0|SHARED POOL MISs      1|
|EXECs      600|CPU S    3.603618|CLOCK S    3.683344|ROWs    50000|PHY RD BLKs       912|CON RD BLKs (Mem)    144826|CUR RD BLKs (Mem)    426496|SHARED POOL MISs      1|
|FETCHs       0|CPU S    0.000000|CLOCK S    0.000000|ROWs        0|PHY RD BLKs         0|CON RD BLKs (Mem)         0|CUR RD BLKs (Mem)         0|SHARED POOL MISs      0|
  CPU S 16.91%  CLOCK S 15.45%
|                 +++||                 +++|
UPDATE T2 SET PADDING1 = RPAD('y',4000,'y') WHERE OBJECT_ID = :B1

       (Rows 0)   UPDATE  T2 (cr=332 pr=30 pw=0 time=19223 us)
     (Rows 100)    INDEX UNIQUE SCAN IND_T2_OBJ (cr=105 pr=8 pw=0 time=1372 us cost=1 size=2015 card=1)

       (Rows 0)   UPDATE  T2 (cr=288 pr=2 pw=0 time=8629 us)
     (Rows 100)    INDEX UNIQUE SCAN IND_T2_OBJ (cr=105 pr=0 pw=0 time=430 us cost=1 size=2015 card=1)
...
       (Rows 0)   UPDATE  T2 (cr=105 pr=0 pw=0 time=193 us)
       (Rows 0)    INDEX UNIQUE SCAN IND_T2_OBJ (cr=105 pr=0 pw=0 time=153 us cost=1 size=2015 card=1)

Notice that there is now a significant difference in the elapsed time at 2.263344 seconds difference, CPU seconds at a difference of 2.253618 seconds, and the number of blocks read from disk at a difference of 564 blocks.

Let’s take a look at the wait events that my program wrote to Excel.   Examination of the wait events indicated that there were 42 db file sequential read waits (thus 42 total blocks read from disk), and 107 db file scattered read waits (842 total blocks read from disk), for a grand total of 884 blocks read from disk that are associated with this UPDATE SQL statement.  This time, not only are the TKPROF statistics not supported by the raw wait events, but the wait event summary included in the TKPROF summary also is not supported by the raw wait events.  Where did those number from TKPROF originate?

Well, let’s just process the 10046 trace file using TKPROF from 11.2.0.2 to confirm the output:

SQL ID: 1ahnbczb20gdx Plan Hash: 1751280057

UPDATE T2 SET PADDING1 = RPAD('y',4000,'y') 
WHERE
 OBJECT_ID = :B1 

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    600      3.57       3.64        884     144451     426496       50000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      601      3.57       3.64        884     144451     426496       50000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 64     (recursive depth: 1)
Number of plan statistics captured: 600

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  UPDATE  T2 (cr=241 pr=1 pw=0 time=5550 us)
       100         83        100   INDEX UNIQUE SCAN IND_T2_OBJ (cr=106 pr=0 pw=0 time=425 us cost=1 size=2015 card=1)(object id 103855)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                        107        0.00          0.06
  db file sequential read                        42        0.00          0.01
  log file switch completion                      2        0.00          0.01

-

Those TKPROF numbers appear quite different; translation – it you do not like the numbers produced by TKPROF, just use a different version of TKPROF.  :-)

It appears that TKPROF in 11.2.0.1 potentially lies when reporting on a trace file generated by Oracle Database 11.2.0.2.  I wonder if this problem is caused by the large cursor number in 11.2.0.2+, or if TKPROF in 11.2.0.1 also has the same problem with trace files generated by Oracle Database 11.2.0.1?

Is anyone able to reproduce the problem reported by the OP?

If you wish to experiment with my trace file and TKPROF summaries, you may download the following file:

or1122p_ora_1808.zip

When you download the file, rename the download from or1122p_ora_1808-zip.doc to or1122p_ora_1808.zip.  The enclosed or1122p_ora_1808.trc file is the raw 10446 trace file, or1122p_ora_1808.txt is the 11.2.0.1 TKPROF generated output file, and or1122p_ora_1808-2.txt is the 11.2.0.2 generated output file.








Follow

Get every new post delivered to your Inbox.

Join 137 other followers