Brain Teaser: 10046 Extended SQL Trace Shows “EXEC #435118472:c=15600,e=510”, How is that Possible?

14 04 2011

April 14, 2011

I find interesting details from time to time about Oracle Database.  I am in the process of organizing material for a presentation, and I encountered something interesting when testing Oracle Database 11.2.0.2 with patch 3 applied.  I traced the execution of an application and then processed the 660MB  trace file using a trace file profiler that I created.  A portion of my trace file profiler’s output looks like this:

Statement Depth 0 (Application Code)
Cursor 27   Ver 1   Parse at 170.113677 SQL_ID='aw9mpsx4pyta2'  (TD Prev 25.179488)  Similar Cnt 1
|PARSEs       1|CPU S    0.000000|CLOCK S    0.000491|ROWs        0|PHY RD BLKs         0|CON RD BLKs (Mem)         0|CUR RD BLKs (Mem)         0|SHARED POOL MISs      1|
|EXECs    82848|CPU S   25.849376|CLOCK S   26.046921|ROWs    82848|PHY RD BLKs       560|CON RD BLKs (Mem)    248703|CUR RD BLKs (Mem)     84702|SHARED POOL MISs      1|
|FETCHs       0|CPU S    0.000000|CLOCK S    0.000000|ROWs        0|PHY RD BLKs         0|CON RD BLKs (Mem)         0|CUR RD BLKs (Mem)         0|SHARED POOL MISs      0|
  CPU S 20.68%  CLOCK S 20.47%
|                ++++||                ++++|
UPDATE REQUIREMENT SET REQUIRED_DATE = :1 WHERE WORKORDER_TYPE = :2 AND WORKORDER_BASE_ID = :3 AND WORKORDER_SUB_ID = :4
 AND WORKORDER_LOT_ID = :5 AND WORKORDER_SPLIT_ID = :6 AND OPERATION_SEQ_NO = :7 AND PIECE_NO = :8

       (Rows 0)   UPDATE  REQUIREMENT (cr=3 pr=2 pw=0 time=861 us)
       (Rows 1)    INDEX UNIQUE SCAN SYS_C0025561 (cr=3 pr=2 pw=0 time=754 us cost=2 size=33 card=1

The above output shows that this SQL statement, which was executed 82,848 times, consumed a total of 25.85 seconds of CPU time and 26.05 second for the elapsed time.  This SQL statement accounted for 20.68% of the total CPU consumption found in the trace file, and 20.47% of the total elapsed time found in the trace file.  The execution plan displayed is from the first execution of the SQL statement, which is output by default after the first execution starting with Oracle Database 11.1.  Since this SQL statement consumed twice as much CPU and elapsed time as the second highest consumer, I thought that I would take a look in the raw 10046 trace file to see what is happening.  Among other things, I was attempting to determine if there is a time consuming row-level trigger that is firing on each execution, which then caused another row-level trigger to fire.

Here is a portion of that trace file which shows two executions of the SQL statement:

PARSING IN CURSOR #435118472 len=219 dep=0 uid=70 oct=6 lid=70 tim=150927664059 hv=1230988610 ad='466efdfb8' sqlid='aw9mpsx4pyta2'
UPDATE REQUIREMENT SET REQUIRED_DATE = :1 WHERE WORKORDER_TYPE = :2 AND WORKORDER_BASE_ID = :3 AND WORKORDER_SUB_ID = :4
 AND WORKORDER_LOT_ID = :5 AND WORKORDER_SPLIT_ID = :6 AND OPERATION_SEQ_NO = :7 AND PIECE_NO = :8
END OF STMT
PARSE #435118472:c=0,e=491,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=150927664059
WAIT #435118472: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=150927664110
WAIT #435118472: nam='SQL*Net message from client' ela= 364 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=150927664488
...
WAIT #435118472: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=81989 tim=150927704272
WAIT #435118472: nam='SQL*Net message from client' ela= 284 driver id=1413697536 #bytes=1 p3=0 obj#=81989 tim=150927704573
BINDS #435118472:
 Bind#0
  oacdty=96 mxl=32(16) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=256 off=0
  kxsbbbfp=0532a410  bln=32  avl=16  flg=05
  value="2011-04-20-0:0:0"
 Bind#1
  oacdty=96 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=32
  kxsbbbfp=0532a430  bln=32  avl=01  flg=01
  value="W"
 Bind#2
  oacdty=96 mxl=32(05) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=64
  kxsbbbfp=0532a450  bln=32  avl=05  flg=01
  value="16750"
 Bind#3
  oacdty=96 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=96
  kxsbbbfp=0532a470  bln=32  avl=03  flg=01
  value="200"
 Bind#4
  oacdty=96 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=128
  kxsbbbfp=0532a490  bln=32  avl=01  flg=01
  value="1"
 Bind#5
  oacdty=96 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=160
  kxsbbbfp=0532a4b0  bln=32  avl=01  flg=01
  value="0"
 Bind#6
  oacdty=96 mxl=32(02) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=192
  kxsbbbfp=0532a4d0  bln=32  avl=02  flg=01
  value="20"
 Bind#7
  oacdty=96 mxl=32(02) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=224
  kxsbbbfp=0532a4f0  bln=32  avl=02  flg=01
  value="30"
EXEC #435118472:c=0,e=300,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927704902
WAIT #435118472: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=81989 tim=150927704977
WAIT #435118472: nam='SQL*Net message from client' ela= 177 driver id=1413697536 #bytes=1 p3=0 obj#=81989 tim=150927705171
BINDS #435118472:
 Bind#0
  oacdty=96 mxl=32(16) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=256 off=0
  kxsbbbfp=0532a410  bln=32  avl=16  flg=05
  value="2011-04-20-0:0:0"
 Bind#1
  oacdty=96 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=32
  kxsbbbfp=0532a430  bln=32  avl=01  flg=01
  value="W"
...
 Bind#7
  oacdty=96 mxl=32(02) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=224
  kxsbbbfp=0532a4f0  bln=32  avl=02  flg=01
  value="40"
EXEC #435118472:c=15600,e=510,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927705707
WAIT #435118472: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=81989 tim=150927705800
WAIT #435118472: nam='SQL*Net message from client' ela= 1217 driver id=1413697536 #bytes=1 p3=0 obj#=81989 tim=150927707038 

Looking at the above, you may see a couple of odd items.  First, the cursor number is not 27 as was indicated in my trace file profiler’s output – it is instead the huge number 435,118,472.  Tanel Poder provided an explanation of this change which was introduced in Oracle Database 11.2.0.2 here – that change initially caused a lot of problems for my trace file profiler which in hindsight made the silly assumption that no session would have more than 5,000 or 10,000 open cursors (which would thus result in cursors numbers below 10,000).

What I thought that the casual observer might find interesting is this line in the trace file:

EXEC #435118472:c=15600,e=510,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927705707

The above line indicates that the instance consumed 0.015600 seconds of CPU time in a 0.000510 second time period – a bit of an unusual feat to see the CPU consumption 30 times greater than the elapsed time considering that this SQL statement was not executed in parallel.  Is there a clue why this happened?  Take a look at the trace file line that shows the previous execution:

EXEC #435118472:c=0,e=300,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927704902 

This time the elapsed time is 300 / 0 times longer than the CPU time.  So now the brain teaser: if the CPU time is 30 times (15600 / 510) greater than the elapsed time, and then the elapsed time is infinitely (300 / 0) longer than the CPU time, how is it possible that the overall CPU time for this SQL statement is roughly the same as the overall elapsed time?  (On a side note, you might wonder if this could compound into odd time reporting errors if different SQL statements are executed in between the executions of this SQL statement.)

Just for fun, here is the TKPROF output for this SQL statement:

SQL ID: aw9mpsx4pyta2 Plan Hash: 755874150

UPDATE REQUIREMENT SET REQUIRED_DATE = :1
WHERE
 WORKORDER_TYPE = :2 AND WORKORDER_BASE_ID = :3 AND WORKORDER_SUB_ID = :4 AND
  WORKORDER_LOT_ID = :5 AND WORKORDER_SPLIT_ID = :6 AND OPERATION_SEQ_NO = :7
  AND PIECE_NO = :8

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  82848     25.84      26.04        556     248550      84702       82848
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    82849     25.84      26.04        556     248550      84702       82848

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 70 
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  UPDATE  REQUIREMENT (cr=3 pr=2 pw=0 time=861 us)
         1          1          1   INDEX UNIQUE SCAN SYS_C0025561 (cr=3 pr=2 pw=0 time=754 us cost=2 size=33 card=1)(object id 81989)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   82849        0.00          0.10
  SQL*Net message from client                 82849        0.00         19.33
  db file sequential read                        95        0.00          0.03
  db file scattered read                         95        0.00          0.04 

Actions

Information

13 responses

15 04 2011
Narendra

Interesting.
I am yet not aware of all the changes related to how oracle writes data to trace files so not sure if this is specific to 11g.
But could it be due to the fact (and I don’t know if it is still one in 11g) that CPU time accounts for even the tiny amount
of time whereas Elapsed time might not account for the time, if it is too small, which tends to happen when you have lots
of tiny operations carried out in relatively small amount of time.
In other words, could this be similar to the cases where one might see CPU time to be greater than Elapsed time in TkProf,
due to the difference in the way both are measured?

15 04 2011
Charles Hooper

Narendra,

That is one possibility. However, take a look at the following sequential list (except for the portion after … where I skipped several executions) of EXEC calls for this cursor:

EXEC #435118472:c=15600,e=9980,p=6,cr=138,cu=3,mis=1,r=1,dep=0,og=1,plh=755874150,tim=150927674541
EXEC #435118472:c=0,e=871,p=1,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927680575
EXEC #435118472:c=0,e=309,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927681313
EXEC #435118472:c=0,e=319,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927681972
EXEC #435118472:c=0,e=329,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927682611
EXEC #435118472:c=0,e=328,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927683242
EXEC #435118472:c=0,e=328,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927683905
EXEC #435118472:c=0,e=330,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927684541
EXEC #435118472:c=0,e=329,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927685170
EXEC #435118472:c=0,e=333,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927685803
EXEC #435118472:c=0,e=330,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927686432
EXEC #435118472:c=0,e=332,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927687068
EXEC #435118472:c=0,e=335,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927687700
EXEC #435118472:c=0,e=333,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927688336
EXEC #435118472:c=0,e=325,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927688965
EXEC #435118472:c=0,e=983,p=2,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927692030
EXEC #435118472:c=0,e=1080,p=2,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927693979
EXEC #435118472:c=0,e=299,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927695191
EXEC #435118472:c=0,e=308,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927696357
EXEC #435118472:c=0,e=309,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927697541
EXEC #435118472:c=0,e=297,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927698143
EXEC #435118472:c=0,e=299,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927698740
EXEC #435118472:c=0,e=293,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927699316
EXEC #435118472:c=0,e=294,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927699892
EXEC #435118472:c=0,e=299,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927700489
EXEC #435118472:c=0,e=338,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927701178
EXEC #435118472:c=0,e=315,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927701779
EXEC #435118472:c=0,e=295,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927702374
EXEC #435118472:c=0,e=292,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927702945
EXEC #435118472:c=0,e=301,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927703525
EXEC #435118472:c=0,e=303,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927704200
EXEC #435118472:c=0,e=300,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927704902
EXEC #435118472:c=15600,e=510,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927705707
EXEC #435118472:c=0,e=308,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927707393
 
...
 
EXEC #435118472:c=15600,e=351,p=0,cr=3,cu=1,mis=0,r=1,dep=0,og=1,plh=755874150,tim=150927799243

0 and 15,600 (roughly 1/64th of a second) appear to be popular values for the CPU time on an individual EXEC line.

The brain teaser continues.

15 04 2011
DanyC

Hi Charles,

Can you please let me know the patch 3 # so i cna search it on MOS?

Thanks,
Dani

15 04 2011
Charles Hooper

Dani,

I used an advanced patch search on Metalink (MOS) and located the following patches:
11.2.0.2 Windows x64 Patch 3: 11731184 released 2011-03-04 — this is the one I used
11.2.0.2 Windows x64 Patch 4: 11896292 released 2011-04-08

15 04 2011
Centinul

Charles —

Any chance that the resolution of the Windows x64 system timer is larger than the amount of time between each execution? I found an interesting article on MSDN that seems to indicate a 15 ms resolution which, coincidentally, similar to the 15600 us times you are seeing in the trace.

http://msdn.microsoft.com/en-us/magazine/cc163996.aspx

15 04 2011
Charles Hooper

Centinul,

Interesting article, thank you for posting a link. That article lead to other thought processes, which then lead to an interesting SysInternals (Microsoft) utility named ClockRes:
http://technet.microsoft.com/en-us/sysinternals/bb897568

The output on this particular server:

C:\ClockRes>clockres
 
ClockRes v2.0 - View the system clock resolution
Copyright (C) 2009 Mark Russinovich
SysInternals - www.sysinternals.com
 
Maximum timer interval: 15.600 ms
Minimum timer interval: 0.500 ms
Current timer interval: 15.600 ms

The above output seems to match the behavior found in the 10046 trace file, where the CPU time is either reported as 0 or as 15,600. But that also brings me back to a comment that I made at the end of the article:

(On a side note, you might wonder if this could compound into odd time reporting errors if different SQL statements are executed in between the executions of this SQL statement.)

One might wonder if we could find a situation where the CPU usage is reported at 0, when it is actually much higher after 100+ executions.

——————–
Edit (roughly 10 minutes later):
Here is the ClockRes output from an old 3.8GHz Pentium 4 running Windows XP:

ClockRes v2.0 - View the system clock resolution
Copyright (C) 2009 Mark Russinovich
SysInternals - www.sysinternals.com
 
Maximum timer interval: 15.625 ms
Minimum timer interval: 1.000 ms
Current timer interval: 15.625 ms
3 05 2011
GaryS

I am digging through trace files from an AIX system running 11gR2. I have 2 cursors that are logging high amounts of cpu time, but they both look like awfully round numbers. Here are the trace file lines:
====> EXEC #4575888568:c=318740000,e=352279609,p=0,cr=33982302,cu=0,mis=0,r=0,dep=1,og=1,plh=3504418200,tim=75990584749721
====> EXEC #4575992080:c=319670000,e=353631095,p=0,cr=33983344,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=75990585978417

It really seems unlikely that the measures for cpu time are precise– there appears to be a granularity to how much time is added. This reminds me that Oracle uses the term ‘quantum’ when referring to waits imposed by Resource Manager. We have that turned on in our database … do you have it turned on in yours, by any chance?

I already checked for an LCD among my values and your magic 15,600. They are all divisible by 400, but I would guess that the granularity on my system is different …. because here is a unique listing of all the c= values from my trace file:

0
10000
20000
30000
40000
50000
60000
70000
90000
100000
360000
760000
318740000
319670000

Looks like I have a granularity of 10,000 and you have a granularity of 15,600. I have plenty of c=0 entries too, by the way, to it appears that Oracle is simply ’rounding down’ when the spirit moves it.

Gary

4 05 2011
Charles Hooper

Hi Gary,

Thank you for providing your finding from AIX. It is interesting that your reported CPU usage intervals are in centisecond (0.01 seconds) increments – now I am a bit curious to know if that observation partially explains why wait events and various other timing events were originally reported in centiseconds in Oracle Databases. Centinul’s comment in this article led me to the ClockRes (Windows) utility that read the CPU clock resolution of the Intel CPU as 15.600 ms, and that seems to correspond to the granularity of 15,600 found in the trace file. The ClockRes utility reported that the minimum CPU clock resolution is 0.500 ms, which would seem to imply that it might be possible to drop the granularity to 500, but I am not sure if this is controllable by the operating system (maybe it would not be a good idea to change this?).

I am not so sure that the reported values are “rounding down” (but I could very well be wrong). The impression that I received is that something like this is happening:

get CPU usage for the current process, remember the value
execute some sort of Oracle operation
get CPU usage for the current process, report the delta from the previous remembered value

Assume that the "some sort of Oracle operation" consumed 1/5 of the CPU reporting granularity value of 10,000 (so it would be consuming 2,000 on each execution).  If that is the case, I could see the wait events appearing something like the following where the CPU usage values are accumulating but not being reported as actually increasing (this matches the pattern found in the trace file above):
EXEC #435118472:c=0,e=2100
EXEC #435118472:c=0,e=1900
EXEC #435118472:c=0,e=2000
EXEC #435118472:c=0,e=2100
EXEC #435118472:c=10000,e=1900
EXEC #435118472:c=0,e=2100
EXEC #435118472:c=0,e=1900
EXEC #435118472:c=0,e=2000
EXEC #435118472:c=0,e=2100
EXEC #435118472:c=10000,e=1900
30 05 2011
Paresh

Hi Charles,

Very interesting brain teaser indeed! Based on what you and others have written and that MSDN article about granularity of 15 ms on windows (which is what your database’s OS is), my hypothesis is:

Oracle uses GetSystemTime widnow system call to get the current system time after each EXEC and does a delta to report c (CPU) values. Oracle uses Performance counters to get current value for elapsed time and does a delta after each EXEC to report e (Elapsed) values. There are multiple EXEC within 15 msec SYSTIME SLOTS i.e. EXEC CPU time is much shorter than 15 msec.

Based on above:

START: SYSTIME = X PERFORMANCE COUNTER TIME = Y
AFTER EXEC1 – SYSTIME = X PERFORMANCE COUNTER TIME = Y+2100 so c=X-X=0 and e=Y+2100-Y=2100
AFTER EXEC2 – SYSTIME = X PERFORMANCE COUNTER TIME = Y+1900 so c=X-X=0 and e=Y+1900-Y=1900
AFTER EXEC3 – SYSTIME = X PERFORMANCE COUNTER TIME = Y+2000 so c=X-X=0 and e=Y+2000-Y=2000
AFTER EXEC4 – SYSTIME = X PERFORMANCE COUNTER TIME = Y+2100 so c=X-X=0 and e=Y+2100-Y=2100
AFTER EXEC5 – SYSTIME = X + 15 msec (15600 to be precise) PERFORMANCE COUNTER TIME = Y+2100 so c=X+15600 -X=15600 and e=Y+2100-Y=2100

Now my question is why note use Performance counter to measure c also? May be someone can put some light on this.

Something similar might be happening on AIX?

Other possible explanation is – I worked on a custom software piece where we had to report the code execution times. We figured that the code to measure the systime after each execution was too much in terms of CPU cost compared to the actual application code’s CPU cost so we took a short cut and used to measure the time after ‘N’ executions reporting c=0 for those intermediate EXECS. The trace was for our internal use only and we as programmers understood that the final C (reported after N EXECs) has to be divided by N to get c (for each EXECs), hope Oracle is not doing this!!!!

Regards,
Paresh

31 05 2011
Charles Hooper

I am not sure that the CPU time rounding situation is specific to Windows. Here is part of a trace file from Oracle Database 11.1.0.6 running on 64 bit Linux:

PARSING IN CURSOR #8 len=64 dep=0 uid=80 oct=3 lid=80 tim=1252087552317846 hv=1840848697 ad='25b5510b0' sqlid='gr26qm9qvk7tt'
SELECT
  ID,
  DESCRIPTION
FROM
  T1
WHERE
  ID BETWEEN 1 AND 10
END OF STMT
PARSE #8:c=0,e=41,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1252087552317842
BINDS #8:
EXEC #8:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1252087552317984
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552318005
FETCH #8:c=0,e=34,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,tim=1252087552318060
WAIT #8: nam='SQL*Net message from client' ela= 84 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552318173
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552318257
FETCH #8:c=0,e=334,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552318530
WAIT #8: nam='SQL*Net message from client' ela= 140 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552318710
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552318793
FETCH #8:c=0,e=325,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552319059
WAIT #8: nam='SQL*Net message from client' ela= 125 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552319212
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552319293
FETCH #8:c=1000,e=333,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552319568
WAIT #8: nam='SQL*Net message from client' ela= 126 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552319726
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552319806
FETCH #8:c=0,e=323,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552320071
WAIT #8: nam='SQL*Net message from client' ela= 123 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552320223
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552320303
FETCH #8:c=1000,e=335,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552320581
WAIT #8: nam='SQL*Net message from client' ela= 138 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552320751
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552320834
FETCH #8:c=0,e=325,p=0,cr=102,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552321098
WAIT #8: nam='SQL*Net message from client' ela= 155 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552321298
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552321380
FETCH #8:c=1000,e=333,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552321655
WAIT #8: nam='SQL*Net message from client' ela= 152 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552321839
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552321920
FETCH #8:c=0,e=321,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552322183
WAIT #8: nam='SQL*Net message from client' ela= 146 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552322357
WAIT #8: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552322448
FETCH #8:c=1000,e=348,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552322735
WAIT #8: nam='SQL*Net message from client' ela= 125 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552322891
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552322970
FETCH #8:c=0,e=323,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552323236
WAIT #8: nam='SQL*Net message from client' ela= 123 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552323388
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552323470
FETCH #8:c=999,e=343,p=0,cr=102,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552323754
WAIT #8: nam='SQL*Net message from client' ela= 125 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552323909
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552323988
FETCH #8:c=0,e=309,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552324241
WAIT #8: nam='SQL*Net message from client' ela= 123 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552324393
WAIT #8: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=73723 tim=1252087552324471
FETCH #8:c=1000,e=330,p=0,cr=101,cu=0,mis=0,r=100,dep=0,og=1,tim=1252087552324746

Above we see the same rounding issue, although the resolution is 0.001 seconds. I have not analyzed the CPU clock resolution on this particular server with an Intel CPU, but it appears that Linux is using “Minimum timer interval: 1.000 ms” as output by the ClockRes program in a comment above.

A similar trace from 11.2.0.1 on 64 bit Linux:

PARSING IN CURSOR #9 len=64 dep=0 uid=68 oct=3 lid=68 tim=1252093502672955 hv=1840848697 ad='2577c3108' sqlid='gr26qm9qvk7tt'
SELECT
  ID,
  DESCRIPTION
FROM
  T1
WHERE
  ID BETWEEN 1 AND 10
END OF STMT
PARSE #9:c=0,e=60,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=634656657,tim=1252093502672954
EXEC #9:c=0,e=26,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=634656657,tim=1252093502673068
WAIT #9: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093502673106
WAIT #9: nam='db file sequential read' ela= 9741 file#=7 block#=1346035 blocks=1 obj#=69690 tim=1252093502682885
WAIT #9: nam='db file sequential read' ela= 4405 file#=7 block#=2163428 blocks=1 obj#=69690 tim=1252093502687360
WAIT #9: nam='db file sequential read' ela= 223 file#=7 block#=1346036 blocks=1 obj#=69690 tim=1252093502687620
WAIT #9: nam='db file sequential read' ela= 5746 file#=7 block#=1346067 blocks=1 obj#=69689 tim=1252093502693414
FETCH #9:c=0,e=20316,p=4,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=634656657,tim=1252093502693444
WAIT #9: nam='SQL*Net message from client' ela= 163 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093502693638
WAIT #9: nam='db file parallel read' ela= 58637 files=1 blocks=39 requests=39 obj#=69689 tim=1252093502752403
WAIT #9: nam='db file sequential read' ela= 9891 file#=7 block#=1346143 blocks=1 obj#=69689 tim=1252093502762414
WAIT #9: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093502762457
WAIT #9: nam='db file parallel read' ela= 69853 files=1 blocks=39 requests=39 obj#=69689 tim=1252093502832439
WAIT #9: nam='db file sequential read' ela= 8817 file#=7 block#=1351510 blocks=1 obj#=69689 tim=1252093502841471
WAIT #9: nam='db file parallel read' ela= 33278 files=1 blocks=19 requests=19 obj#=69689 tim=1252093502874836
WAIT #9: nam='db file sequential read' ela= 7609 file#=7 block#=1356639 blocks=1 obj#=69689 tim=1252093502882620
FETCH #9:c=3000,e=189031,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252093502882693
WAIT #9: nam='SQL*Net message from client' ela= 176 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093502882917
WAIT #9: nam='db file parallel read' ela= 106424 files=1 blocks=39 requests=39 obj#=69689 tim=1252093502989480
WAIT #9: nam='db file sequential read' ela= 7432 file#=7 block#=1359275 blocks=1 obj#=69689 tim=1252093502997049
WAIT #9: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093502997090
WAIT #9: nam='db file parallel read' ela= 76579 files=1 blocks=39 requests=39 obj#=69689 tim=1252093503073784
WAIT #9: nam='db file sequential read' ela= 7399 file#=7 block#=1364335 blocks=1 obj#=69689 tim=1252093503081439
WAIT #9: nam='db file parallel read' ela= 40248 files=1 blocks=19 requests=19 obj#=69689 tim=1252093503121785
WAIT #9: nam='db file sequential read' ela= 5180 file#=7 block#=1369141 blocks=1 obj#=69689 tim=1252093503127120
FETCH #9:c=4000,e=244247,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252093503127190
WAIT #9: nam='SQL*Net message from client' ela= 206 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093503127440
WAIT #9: nam='db file parallel read' ela= 57799 files=1 blocks=39 requests=39 obj#=69689 tim=1252093503185403
WAIT #9: nam='db file sequential read' ela= 2728 file#=7 block#=1372079 blocks=1 obj#=69689 tim=1252093503188278
WAIT #9: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093503188315
FETCH #9:c=1999,e=192628,p=101,cr=102,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252093505752447
WAIT #9: nam='SQL*Net message from client' ela= 233 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093505752714
WAIT #9: nam='db file parallel read' ela= 50766 files=1 blocks=39 requests=39 obj#=69689 tim=1252093505803620
WAIT #9: nam='db file sequential read' ela= 2276 file#=7 block#=1551063 blocks=1 obj#=69689 tim=1252093505806001
WAIT #9: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093505806034
WAIT #9: nam='db file parallel read' ela= 63492 files=1 blocks=39 requests=39 obj#=69689 tim=1252093505869648
WAIT #9: nam='db file sequential read' ela= 3783 file#=7 block#=1556229 blocks=1 obj#=69689 tim=1252093505873625
WAIT #9: nam='db file parallel read' ela= 36584 files=1 blocks=19 requests=19 obj#=69689 tim=1252093505910296
WAIT #9: nam='db file sequential read' ela= 8168 file#=7 block#=1561485 blocks=1 obj#=69689 tim=1252093505918641
FETCH #9:c=1000,e=165988,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252093505918733
WAIT #9: nam='SQL*Net message from client' ela= 173 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093505918938
WAIT #9: nam='db file parallel read' ela= 59343 files=1 blocks=39 requests=39 obj#=69689 tim=1252093505978402
WAIT #9: nam='db file sequential read' ela= 3738 file#=7 block#=1563944 blocks=1 obj#=69689 tim=1252093505982262
WAIT #9: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=69689 tim=1252093505982295
WAIT #9: nam='db file parallel read' ela= 59378 files=1 blocks=39 requests=39 obj#=69689 tim=1252093506041802
WAIT #9: nam='db file sequential read' ela= 9365 file#=7 block#=1569200 blocks=1 obj#=69689 tim=1252093506051372
WAIT #9: nam='db file parallel read' ela= 32681 files=1 blocks=19 requests=19 obj#=69689 tim=1252093506084149
WAIT #9: nam='db file sequential read' ela= 9004 file#=7 block#=1574268 blocks=1 obj#=69689 tim=1252093506093324
FETCH #9:c=2000,e=174443,p=100,cr=101,cu=0,mis=0,r=100,dep=0,og=1,plh=634656657,tim=1252093506093405

Again, the above shows a resolution of 1ms, while the Windows trace shows a resolution of about 15ms.

I wonder if Metalink (MOS) Doc ID 436797.1, “Poor Performance with STATISTICS_LEVEL=ALL including Excessive Gettimeofday Calls on LINUX 64 or 32-bit” provides any clues? I will have to see if I can locate a 10.2.0.3 or earlier 10046 trace file from a Linux system.

31 05 2011
Charles Hooper

Untested, but it appears that there is a boot.ini setting on Windows systems that controls the timer resolution:

/TIMERES

See:
http://www.pantz.org/software/windows/bootinistartupswitches.html
http://www.windowsnetworking.com/nt/atips/atips136.shtml
http://groups.google.com/group/comp.os.ms-windows.programmer.win32/browse_thread/thread/764f642befe49acb <– also lists a Windows API call

Also untested, an application to adjust the timer resolution:
http://www.lucashale.com/timer-resolution/

15 06 2011
GaryS

Charles,

I posted earlier with some AIX trace file snippets. Digging around later on for an unrelated question, I found Cary Millsap’s paper “Mastering SQL Performance with Extended SQL Trace.” He says on page 9: “For a detailed description of why the CPU consumption statistic is accurate to only +/- 10,000 microseconds, see “Optimizing Oracle Performance” … pp161-165.”

That book says that timing is done by polling and that 10,000 microseconds is a practical limit to how often you should poll for elapsed time.

My guess is that the 11G traces express this lower precision in the reporting cpu timings by appending the trailing 0’s– an improvement in my book.

Gary

15 06 2011
Charles Hooper

Gary,

Nice find – I have the “Optimizing Oracle Book” right next to me. That section of the book seems to exactly answer this brain teaser. The pages seem to suggest that the minimum reported CPU consumption on Linux is c=10000, while one of my comments above shows Oracle Database on Linux reporting c=1000 – so it must be the case that the Scheduler resolution in Linux changed over the years since the book was published (it could be a feature of the CPU platform). Page 163 of the book includes a Perl script that appears to produce the same information as ClockRes on the Windows platform (I have not tested the script on Linux). A quote from page 164:

“At every interrupt, the operating system’s process scheduler tallies one full centisecond (10,000 microseconds) of CPU consumption to whatever process is running at the time.”

Leave a reply to Charles Hooper Cancel reply