CPU Wait? LAG to the Rescue

26 02 2010

February 26, 2010

A question in a recent OTN thread appeared as follows:

I’m in RAC database (10gR2/Redhat4). I need to store the real CPU wait every 1 minute in a table for a month. For that I thing to query the GV$SYS_TIME_MODEL [for the "DB CPU" statistic].

This is a very difficult question to answer.  Why (additional reference)?  Technically, in Oracle when a session is “on the CPU”, that session is not in a wait event, and therefore not officially waiting for anything. The “DB CPU” statistic captures the accumulated time spent on the CPU by foreground processes – the user sessions. This CPU time does not include the time that the background processes (DBWR, LGWR, PMON, etc.) spend consuming CPU time. Additionally, the “DB CPU” statistic does not consider/accumulate CPU time consumed by processes that are not related to the database instance.  It could also be said that the “DB CPU” time does not account for time that the session spends waiting for its turn to execute on the CPUs.

With the above in mind, let’s see if we are able to calculate the amount of CPU time consumed by the sessions and the background processes in one minute intervals.  First, we need a logging table.  The following SQL statement builds the SYS_TIME_MODEL_CPU table using a couple of the column definitions from the GV$SYS_TIME_MODEL view so that I do not need to explicitly state the column data types (notice that the SQL statement is collapsing data from two source rows into a single row):

CREATE TABLE SYS_TIME_MODEL_CPU AS
SELECT
  SYSDATE CHK_ID,
  INST_ID,
  SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
  SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
FROM
  GV$SYS_TIME_MODEL
WHERE
  0=1
GROUP BY
  INST_ID;

If we are able to find a way to schedule the following SQL statement to execute once a minute, we will be able to store the current values of the “DB CPU” and “background cpu time” statistics with the following SQL statement (note that executing this SQL statement will also consume CPU time, the very thing we are trying to measure):

INSERT INTO SYS_TIME_MODEL_CPU
SELECT
  SYSDATE CHK_ID,
  INST_ID,
  SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
  SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
FROM
  GV$SYS_TIME_MODEL
WHERE
  STAT_NAME IN ('DB CPU','background cpu time')
GROUP BY
  INST_ID;

One way to schedule the SQL statement to execute once a minute is to use the DBMS_LOCK.SLEEP function in a loop.  Unfortunely, on some platforms the function may not wait exactly the specified number of seconds (it may wait slightly longer), and may cause the “PL/SQL lock timer” wait event to steal a position in the top 5 wait events list in a Statspack or AWR report.  For testing purposes, the following anonymous PL/SQL script might be used:

DECLARE
  STime DATE := SYSDATE;
BEGIN
  WHILE (SYSDATE - STime) < 32 LOOP
    INSERT INTO SYS_TIME_MODEL_CPU
      SELECT
        SYSDATE CHK_ID,
        INST_ID,
        SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
        SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
      FROM
        GV$SYS_TIME_MODEL
      WHERE
        STAT_NAME IN ('DB CPU','background cpu time')
      GROUP BY
        INST_ID;

      COMMIT;
      DBMS_LOCK.SLEEP(60);
  End Loop;
End;
/

If we allow the script to run for a couple of minutes (rather than 31 days), we are able to determine how much CPU time was consumed every minute by using the LAG analytic function, as shown below:

SELECT
  TO_CHAR(CHK_ID,'YYYY-MM-DD HH24:MI') CHK_ID,
  INST_ID,
  DB_CPU-LAG(DB_CPU,1) OVER (PARTITION BY INST_ID ORDER BY CHK_ID) DB_CPU,
  BACKGROUND_CPU-LAG(BACKGROUND_CPU,1) OVER (PARTITION BY INST_ID ORDER BY CHK_ID) BACKGROUND_CPU
FROM
  SYS_TIME_MODEL_CPU
ORDER BY
  CHK_ID;

CHK_ID              INST_ID     DB_CPU BACKGROUND_CPU
---------------- ---------- ---------- --------------
2010-02-24 07:18          1
2010-02-24 07:19          1   59990544          66070
2010-02-24 07:20          1   59951475          66724
2010-02-24 07:21          1   59985268          71768
2010-02-24 07:22          1   60000569          63694
2010-02-24 07:23          1   60002938          71639
2010-02-24 07:24          1   59978651          63770
2010-02-24 07:25          1   61487141          62785
2010-02-24 07:26          1      24194          76990

To determine the number of seconds of CPU time consumed, the values shown in the DB_CPU and BACKGROUND_CPU columns should be divided by 1,000,000.

Why not just use AWR data to obtain this information?  Just because AWR is built-in does not mean that it is free to use the features of AWR (as I attempted to argue in this OTN thread) – this is a fact that is often glossed over by various books, blog articles, “news” articles, and even the Oracle Database documentation when it states that AWR reports are the replacement for Statspack reports.





Reviewing Session-Level Parameters

25 02 2010

February 25, 2010

A couple of days ago I received an email from a friend who manages a database (10.2.0.4) that is used by the same ERP platform as is used where I work.  The email was quite friendly until I saw the attached file, which included a rather unfriendly ORA-07445 error message and stack trace:

ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_intel_fast_memcpy.A+18] [PC:0x33CFC0E] [ADDR:0x39D10000] [UNABLE_TO_READ] []
Current SQL statement for this session:
update my_table set
costed_date = SYSDATE
where type = 'W'
and base_id = :1       
and lot_id = :2      
and split_id = :3        
and sub_id = '0' 

----- Call Stack Trace -----
calling              call     entry                argument values in hex     
location             type     point                (? means dubious value)    
-------------------- -------- -------------------- ----------------------------
_intel_fast_memcpy.           0000000000000000     000000000 000000000 000000000
A+18                                               000000000
0000000000000001     CALL???  _intel_fast_memcpy.  110F0E2E0 00118F111 0F7714C60
                              A+18                 1160CE740
0000000110F0E2E0     CALL???  0000000000000001     00118F111 0F7714C60 1160CE740
                                                   000000004
updgrh+3747          CALL???  0000000110F0E2E0     0F7714C60 1160CE740 000000004
                                                   000000000
__PGOSF480_upduaw+1  CALL???  updgrh+3747          11DEAD460 00001E768 000000000
58                                                 ACD6D33AF808
kdusru+540           CALL???  __PGOSF480_upduaw+1  00001E658 000000002 00001F2E0
                              58                   00001F430
kauupd+521           CALL???  kdusru+540           110F0DBC0 023037F78 000000000
                                                   000000000
updrow+2086          CALL???  kauupd+521           039CDBA10 110F0DA48 039D0E000
                                                   100000000
__PGOSF536_qerupRow  CALL???  updrow+2086          110E03438 000007FFF 002E4644C
Procedure+93                                       00001F430
qerupFetch+748       CALL???  __PGOSF536_qerupRow  0EB37C530 000000000 000000000
                              Procedure+93         100000000
...

No wonder it crashed with an ORA-07445 Access Violation error – look at the first argument of the __PGOSF480_upduaw+158 calling location – it must be that fatal errors appear in Oracle whenever the hexadecimal number DEAD appears in the middle of an argument. :-)  (Watch out for false correlation when troubleshooting.)  In case you are wondering, Metalink Doc ID 175982.1 “ORA-600 Lookup Error Categories” does not describe the updgrh function that eventually lead to the crash in the _intel_fast_memcpy.A+18 calling location, but there are a couple of articles on Metalink that describe updgrh as a function that is involved in updates to migrated rows.

As an experiment, a logon trigger like the following was created (caution, this logon trigger changes a “hidden” Oracle parameter to disable bind variable peeking – exercise caution, and check with Oracle support, before changing any hidden parameters):

CREATE OR REPLACE TRIGGER "SYS"."NO_PEEK_BIND" AFTER LOGON ON DATABASE
DECLARE
  SHOULD_EXECUTE INTEGER;
BEGIN
  SELECT DECODE(SUBSTR(UPPER(PROGRAM),1,8),'AAAAAAAA',1,0)+DECODE(INSTR(PROGRAM,'\',-1),0,0,DECODE(SUBSTR(UPPER(SUBSTR(PROGRAM,INSTR(PROGRAM,'\',-1)+1)),1,8),'AAAAAAAA',1,0)) INTO SHOULD_EXECUTE FROM V$SESSION WHERE SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1);
  IF SHOULD_EXECUTE > 0 THEN
    EXECUTE IMMEDIATE 'ALTER SESSION SET "_optim_peek_user_binds"=FALSE';
  END IF;
END;
/

The above logon trigger disables bind variable peeking for any program with AAAAAAAA at the start of its filename, even if a portion of the program’s file path (assuming that the path directories are separated by backslashes) is included in the PROGRAM column of V$SESSION.  Once in place, this logon trigger changes the parsing environment for the session such that any previously hard parsed SQL statements that were used by the program in the past would need to be re-hard parsed when executed by the program.  Looking through the remainder of the trace file I noticed that a lot of the bind variables were not set to a value (No bind buffers allocated appeared in the trace file) during the first parse call by the session, so peeking at the bind variables could potentially cause problems if that special case is not handled correctly by Oracle Database 10.2.0.4.  (The stack trace did not suggest that bind variable peeking was the cause of the problem.)

Once the trigger was in place, and new sessions have connected, were are able to confirm that the session-level parameter modification happened with the following SQL statement:

SELECT
  S.SID,
  SUBSTR(S.PROGRAM,1,8) PROGRAM,
  SOE.NAME,
  SOE.VALUE,
  SOE.ISDEFAULT
FROM
  V$SESSION S,
  V$SES_OPTIMIZER_ENV SOE
WHERE
  SUBSTR(UPPER(S.PROGRAM),1,8) IN ('AAAAAAAA','BBBBBBBB')
  AND S.SID=SOE.SID
ORDER BY
  NAME,
  PROGRAM;

 SID PROGRAM  NAME                                 VALUE                     ISD
---- -------- ------------------------------------ ------------------------- ---
 303 AAAAAAAA _optim_peek_user_binds               false                     NO
 303 AAAAAAAA _optimizer_adaptive_cursor_sharing   false                     NO
 335 BBBBBBBB _optimizer_adaptive_cursor_sharing   false                     NO
 303 AAAAAAAA active_instance_count                1                         YES
 335 BBBBBBBB active_instance_count                1                         YES
 303 AAAAAAAA bitmap_merge_area_size               1048576                   YES
 335 BBBBBBBB bitmap_merge_area_size               1048576                   YES
 303 AAAAAAAA cell_offload_compaction              ADAPTIVE                  YES
 335 BBBBBBBB cell_offload_compaction              ADAPTIVE                  YES
 303 AAAAAAAA cell_offload_plan_display            AUTO                      YES
 335 BBBBBBBB cell_offload_plan_display            AUTO                      YES
 303 AAAAAAAA cell_offload_processing              true                      YES
 335 BBBBBBBB cell_offload_processing              true                      YES
 303 AAAAAAAA cpu_count                            2                         YES
 335 BBBBBBBB cpu_count                            2                         YES
 303 AAAAAAAA cursor_sharing                       exact                     YES
 335 BBBBBBBB cursor_sharing                       exact                     YES
 303 AAAAAAAA db_file_multiblock_read_count        32                        YES
 335 BBBBBBBB db_file_multiblock_read_count        32                        YES
 303 AAAAAAAA hash_area_size                       131072                    YES
 335 BBBBBBBB hash_area_size                       131072                    YES
 303 AAAAAAAA is_recur_flags                       0                         YES
 335 BBBBBBBB is_recur_flags                       0                         YES
 303 AAAAAAAA optimizer_capture_sql_plan_baselines false                     YES
 335 BBBBBBBB optimizer_capture_sql_plan_baselines false                     YES
 303 AAAAAAAA optimizer_dynamic_sampling           2                         YES
 335 BBBBBBBB optimizer_dynamic_sampling           2                         YES
 303 AAAAAAAA optimizer_features_enable            11.1.0.7                  YES
 335 BBBBBBBB optimizer_features_enable            11.1.0.7                  YES
 303 AAAAAAAA optimizer_index_caching              0                         YES
 335 BBBBBBBB optimizer_index_caching              0                         YES
 303 AAAAAAAA optimizer_index_cost_adj             100                       YES
 335 BBBBBBBB optimizer_index_cost_adj             100                       YES
 303 AAAAAAAA optimizer_mode                       all_rows                  YES
 335 BBBBBBBB optimizer_mode                       all_rows                  YES
 303 AAAAAAAA optimizer_secure_view_merging        true                      YES
 335 BBBBBBBB optimizer_secure_view_merging        true                      YES
 303 AAAAAAAA optimizer_use_invisible_indexes      false                     YES
 335 BBBBBBBB optimizer_use_invisible_indexes      false                     YES
 303 AAAAAAAA optimizer_use_pending_statistics     false                     YES
 335 BBBBBBBB optimizer_use_pending_statistics     false                     YES
 303 AAAAAAAA optimizer_use_sql_plan_baselines     true                      YES
 335 BBBBBBBB optimizer_use_sql_plan_baselines     true                      YES
 303 AAAAAAAA parallel_ddl_mode                    enabled                   YES
 335 BBBBBBBB parallel_ddl_mode                    enabled                   YES
 303 AAAAAAAA parallel_degree                      0                         YES
 335 BBBBBBBB parallel_degree                      0                         YES
 303 AAAAAAAA parallel_dml_mode                    disabled                  YES
 335 BBBBBBBB parallel_dml_mode                    disabled                  YES
 303 AAAAAAAA parallel_execution_enabled           false                     YES
 335 BBBBBBBB parallel_execution_enabled           false                     YES
 303 AAAAAAAA parallel_query_default_dop           0                         YES
 335 BBBBBBBB parallel_query_default_dop           0                         YES
 303 AAAAAAAA parallel_query_mode                  enabled                   YES
 335 BBBBBBBB parallel_query_mode                  enabled                   YES
 303 AAAAAAAA parallel_threads_per_cpu             2                         YES
 335 BBBBBBBB parallel_threads_per_cpu             2                         YES
 303 AAAAAAAA pga_aggregate_target                 143360 KB                 YES
 335 BBBBBBBB pga_aggregate_target                 143360 KB                 YES
 303 AAAAAAAA query_rewrite_enabled                true                      YES
 335 BBBBBBBB query_rewrite_enabled                true                      YES
 303 AAAAAAAA query_rewrite_integrity              enforced                  YES
 335 BBBBBBBB query_rewrite_integrity              enforced                  YES
 303 AAAAAAAA result_cache_mode                    MANUAL                    YES
 335 BBBBBBBB result_cache_mode                    MANUAL                    YES
 303 AAAAAAAA skip_unusable_indexes                true                      YES
 335 BBBBBBBB skip_unusable_indexes                true                      YES
 303 AAAAAAAA sort_area_retained_size              0                         YES
 335 BBBBBBBB sort_area_retained_size              0                         YES
 303 AAAAAAAA sort_area_size                       65536                     YES
 335 BBBBBBBB sort_area_size                       65536                     YES
 303 AAAAAAAA star_transformation_enabled          false                     YES
 335 BBBBBBBB star_transformation_enabled          false                     YES
 303 AAAAAAAA statistics_level                     typical                   YES
 335 BBBBBBBB statistics_level                     typical                   YES
 303 AAAAAAAA transaction_isolation_level          read_commited             YES
 335 BBBBBBBB transaction_isolation_level          read_commited             YES
 303 AAAAAAAA workarea_size_policy                 auto                      YES
 335 BBBBBBBB workarea_size_policy                 auto                      YES

The above output is obviously not from the 10.2.0.4 database.  As can be determined by the above, the _OPTIM_PEEK_USER_BINDS parameter is set to FALSE only for the program AAAAAAAA.  If you look closely at the output, you will also notice that the _OPTIMIZER_ADAPTIVE_CURSOR_SHARING parameter is set to the non-default value of FALSE for both sessions (this change was made in the spfile), and it would be a logical next step to determine why that hidden parameter was changed from the default.

Did changing the behavior of bind variable peeking solve the ORA-07445 Access Violation error?  Well, that question is off topic for this blog article – the ORA-07445 is just one reason why it might be necessary to explore the parameters that were modified at the session-level.  V$SES_OPTIMIZER_ENV shows the parameters that were modified from the Oracle defaults (check the ISDEFAULT column) at the session-level, and those values may be compared with the system-wide defaults found in the V$SYS_OPTIMIZER_ENV view.  A previous blog article described the V$SQL_OPTIMIZER_ENV view, which shows the parameters that were in effect when a specific child cursor was initially hard parsed.





Eliminating 2 Wait Events from the Top 5, How Many Remain?

24 02 2010

February 24, 2010

A recent question on the OTN forums forced me to stop and think for a moment.  If you eliminate two of the most commonly found wait events from the top five wait events:

  • How many wait events will appear in the top five wait events for a selected time period (hint – something has to appear in the top five)?
  • When the two most commonly found wait events are eliminated from the top five, does that necessarily mean that the database performance is better, or could the performance be worse?

The question from the OTN forum is as follows:

I have a query regarding the 2 events:
1. db file scattered read
2. db file sequential read

If the above two events are in top 5 events then what could be the reason for this and how to resolve the issue?

The question asked is a good question, but something is missing.  Similar questions have appeared on various forums in the past, so this blog article is not specifically addressed to this particular OTN posting.  The original poster excluded a couple of potentially useful pieces of information:

  • What Oracle release and operating system platform is in use?
  • How many seconds of wait time were reported in each of the wait events?
  • What was the total duration of the statistics collection period?
  • Were there any other, potentially more significant wait events in the top five list?
  • How many users were connected to the database?
  • What was the average wait duration for each of the events, and how many waits?
  • Were any business critical processes operating at a slower than expected speed?

There will always be five wait events in a top 5 list, so if the original poster was able to somehow eliminate those datafile type access waits events from the top 5 list, which wait events should take the place of the db file scattered read and db file sequential read waits?  Would it be better to see a different set of wait events, maybe pipe get and inactive session followed closely by cleanup of aborted processes? (I think that there might be a play on words and their definitions here – SHUTDOWN IMMEDIATE; is one certain way to decrease the frequency of datafile type access waits.)





Retrieving the Hidden Oracle Parameters and Saving the Parameter Values in Excel

23 02 2010

February 23, 2010

The are a large number of initialization parameters that configure the behavior of an Oracle database instance, and it seems that the number of hidden parameters (those parameters that begin with _ ) continues to grow with each new Oracle Database release.  In some cases, parameters that were once normal parameters became hidden parameters in later releases (the SPIN_COUNT, and _SPIN_COUNT parameters are one example).  In other cases, the normal parameter defines the minimum requested parameter value and the hidden parameter defines the current value (DB_CACHE_SIZE and __DB_CACHE_SIZE parameters, respectively).

For documentation purposes it might be helpful to permanently record the values of the normal and hidden initialization parameters, and that is the purpose of the VBS script in this article.  The script uses a SQL statement that was originally found here, and then was modified so that the normal and hidden versions of the parameters will sort into adjacent rows when dislayed on screen.  Once the parameter values are written to Excel, the Excel worksheet is automatically saved with the value of the DB_NAME initialization parameter and the current date and time.  When targeting an 11.2.0.1 database, the output might look like the following screen capture:

Unfortunately, we have a slight problem.  The SQL statement in the script must be executed as the SYS user, and if the O7_DICTIONARY_ACCESSIBILITY initialization parameter is set to FALSE (the default starting with Oracle 9.0.1), the normal connection string used in previous scripts will not work (I have not found a way to pass AS SYSDBA when Provider=OraOLEDB.Oracle is specified in the connection string).  If you do not want to set the O7_DICTIONARY_ACCESSIBILITY initialization parameter to TRUE, you will need to create an ODBC connection to the database.

To create an ODBC connection, launch the ODBC Data Source Administrator utility from the Control Panel, and switch to the System DSN tab.  Click Add… then select one of the ODBC providers offered by Oracle Corp (you may need to perform a custom install of the Oracle Client for Oracle’s ODBC client to appear in the list – note that there is a separate 32 bit and 64 bit ODBC Administrator on 64 bit Windows, with the 32 bit version located in the Windows\SysWow64 folder, see this article for more information):

Pick a generic name for the ODBC connection, such as MyODBC (this is the name used in the script) and then enter the database name from the Tnsnames.ora file into the TNS Service Name box (I specified or112) – note that you can change this database name at a later time to point the script at a different database.  Click OK.

The VBS script to extract the normal and hidden parameters follows (modify the script to specify the correct password for the SYS user):

'Version 1.0

Const adCmdText = 1
Const adCmdStoredProc = 4
Const adParamInput = 1
Const adVarNumeric = 139
Const adBigInt = 20
Const adDecimal = 14
Const adDouble = 5
Const adInteger = 3
Const adLongVarBinary = 205
Const adNumeric = 131
Const adSingle = 4
Const adSmallInt = 2
Const adTinyInt = 16
Const adUnsignedBigInt = 21
Const adUnsignedInt = 19
Const adUnsignedSmallInt = 18
Const adUnsignedTinyInt = 17
Const adDate = 7
Const adDBDate = 133
Const adDBTimeStamp = 135
Const adDBTime = 134
Const adVarChar = 200
Const adUseClient = 3

Dim strUsername
Dim strPassword
Dim strDatabase
Dim strSQL                          'SQL statement
Dim objExcel                        'For sending the output to Excel

Dim snpData                         'ADO Recordset object used to retrieve the user's data
Dim dbDatabase                      'ADO database connection object

Dim intTempCount                    'Counter of the number of Excel sheets that have been created

Dim strDBNAME                       'DB_NAME parameter from the database parameters
Dim strLastColumn                   'Column identifier in Excel of the right-most column

Dim i                               'Counter

On Error Resume Next

Set snpData = CreateObject("ADODB.Recordset")
Set dbDatabase = CreateObject("ADODB.Connection")

'Create an Excel connection
Set objExcel = CreateObject("Excel.Application")

'Set up to allow exporting, if requested
objExcel.DisplayAlerts = False
objExcel.Workbooks.Add

strUsername = "sys"    'Must connect as the SYS user
strPassword = "SysPassword"
strDatabase = "MyODBC" 'Must use an ODBC connection if O7_DICTIONARY_ACCESSIBILITY = FALSE

If UCase(strUsername) <> "SYS" Then
  'Can use this for SYS if O7_DICTIONARY_ACCESSIBILITY = TRUE, ODBC connection then not required
  dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";"
Else
  'Must use an ODBC connection if O7_DICTIONARY_ACCESSIBILITY = FALSE
  dbDatabase.ConnectionString = "Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & " AS SYSDBA;"
End If

dbDatabase.Open

'Should verify that the connection attempt was successful, but I will leave that for someone else to code
If Err <> 0 Then
  MsgBox "Not Connected, Error: " & Err
Else
  'Adapted from the SQL statement at http://www.jlcomp.demon.co.uk/params.html
  strSQL = "SELECT " & vbCrLf
  strSQL = strSQL & "  UPPER(NAM.KSPPINM) NAME," & vbCrLf
  strSQL = strSQL & "  VAL.KSPPSTVL VALUE," & vbCrLf
  strSQL = strSQL & "  NAM.INDX+1 NUM," & vbCrLf
  strSQL = strSQL & "  NAM.KSPPITY TYPE," & vbCrLf
  strSQL = strSQL & "  VAL.KSPPSTDF ISDEFAULT," & vbCrLf
  strSQL = strSQL & "  DECODE(BITAND(NAM.KSPPIFLG/256,1),1,'TRUE','FALSE') ISSES_MODIFIABLE," & vbCrLf
  strSQL = strSQL & "  DECODE(BITAND(NAM.KSPPIFLG/65536,3)," & vbCrLf
  strSQL = strSQL & "    1,'IMMEDIATE'," & vbCrLf
  strSQL = strSQL & "    2,'DEFERRED' ," & vbCrLf
  strSQL = strSQL & "    3,'IMMEDIATE'," & vbCrLf
  strSQL = strSQL & "    'FALSE') ISSYS_MODIFIABLE," & vbCrLf
  strSQL = strSQL & "  DECODE(BITAND(VAL.KSPPSTVF,7)," & vbCrLf
  strSQL = strSQL & "    1,'MODIFIED'," & vbCrLf
  strSQL = strSQL & "    4,'SYSTEM MODIFIED'," & vbCrLf
  strSQL = strSQL & "    'FALSE') ISMODIFIED," & vbCrLf
  strSQL = strSQL & "  DECODE(BITAND(VAL.KSPPSTVF,2),2,'TRUE', 'FALSE') ISADJUSTED," & vbCrLf
  strSQL = strSQL & "  NAM.KSPPDESC DESCRIPTION" & vbCrLf
  strSQL = strSQL & "FROM" & vbCrLf
  strSQL = strSQL & "  X$KSPPI NAM," & vbCrLf
  strSQL = strSQL & "  X$KSPPSV VAL" & vbCrLf
  strSQL = strSQL & "WHERE " & vbCrLf
  strSQL = strSQL & "  NAM.INDX = VAL.INDX " & vbCrLf
  strSQL = strSQL & "ORDER BY" & vbCrLf
  strSQL = strSQL & "  UPPER(DECODE(SUBSTR(NAM.KSPPINM,1,2),'__',SUBSTR(NAM.KSPPINM,3)," & vbCrLf
  strSQL = strSQL & "          DECODE(SUBSTR(NAM.KSPPINM,1,1),'_',SUBSTR(NAM.KSPPINM,2),NAM.KSPPINM)))," & vbCrLf
  strSQL = strSQL & "  UPPER(NAM.KSPPINM)"
  snpData.Open strSQL, dbDatabase

  If snpData.State = 1 Then
    If Not (snpData.EOF) Then
      With objExcel
        .Visible = True
        .ActiveWorkbook.Sheets.Add
        .ActiveSheet.Name = "DB Parameters"

         For i = 0 To snpData.Fields.Count - 1
           .ActiveSheet.Cells(1, i + 1).Value = snpData.Fields(i).Name
         Next
         .ActiveSheet.Range(.ActiveSheet.Cells(1, 1), .ActiveSheet.Cells(1, snpData.Fields.Count)).Font.Bold = True

         'Format the columns in the spreadsheet
         For i = 0 To snpData.Fields.Count - 1
           strLastColumn = Chr(64 + ((i + 1) Mod 26))

           .Columns(strLastColumn).Select
           Select Case snpData.Fields(i).Type
             Case adDate, adDBDate, adDBTimeStamp, adDBTime
               .Selection.HorizontalAlignment = -4152
               .Selection.NumberFormat = "mm/dd/yy hh:nn AM/PM"
             Case adBigInt, adInteger, adSmallInt, adTinyInt, adUnsignedBigInt, adUnsignedInt, adUnsignedSmallInt, adUnsignedTinyInt
               .Selection.HorizontalAlignment = -4152
               .Selection.NumberFormat = "0"
             Case adVarNumeric, adDecimal, adDouble, adNumeric, adSingle
               .Selection.HorizontalAlignment = -4152
               .Selection.NumberFormat = "0"
             Case adVarChar
               .Selection.HorizontalAlignment = -4131
               .Selection.NumberFormat = "@"
           End Select
         Next

         strLastColumn = Chr(64 + ((snpData.Fields.Count + 1) Mod 26))
         .ActiveSheet.Range("A2").CopyFromRecordset snpData

         'Auto-fit up to columns
         .ActiveSheet.Columns("A:" & strLastColumn).AutoFit
         .ActiveSheet.Range("B2").Select
         .ActiveWindow.FreezePanes = True

         .Application.DisplayAlerts = False
         'Remove the default worksheets
         For i = 1 To ActiveWorkbook.Sheets.Count
           If .ActiveWorkbook.Sheets(i).Name = "Sheet1" Then
             .Sheets("Sheet1").Select
             .ActiveWindow.SelectedSheets.Delete
           End If
           If .ActiveWorkbook.Sheets(i).Name = "Sheet2" Then
             .Sheets("Sheet2").Select
             .ActiveWindow.SelectedSheets.Delete
           End If
           If .ActiveWorkbook.Sheets(i).Name = "Sheet3" Then
             .Sheets("Sheet3").Select
             .ActiveWindow.SelectedSheets.Delete
           End If
         Next
      End With
      snpData.Close

      'Repeat the SQL statement to find the value of the DB_NAME parameter
      snpData.Open strSQL, dbDatabase

      Do While Not (snpData.EOF)
        If UCase(snpData("name")) = "DB_NAME" Then
          strDBNAME = snpData("value")
          Exit Do
        End If
        snpData.MoveNext
      Loop
    Else
      MsgBox "No Rows Returned"
    End If
  Else
    MsgBox "Could Not Open the SQL Statement " & Err
  End If
  snpData.Close

  objExcel.ActiveWorkbook.SaveAs "C:\OracleParameters " & strDBNAME & " " & Replace(Replace(Now, "/", "-"), ":", "-") & ".xls"
End If

'Clean Up
Set objExcel = Nothing
Set snpData = Nothing
dbDatabase.Close
Set dbDatabase = Nothing

When the script runs it will create a spreadsheet in the root of the C:\ drive – move the spreadsheet as needed for documentation purposes.





CPU Usage Monitoring – What are the Statistics and when are the Statistics Updated?

22 02 2010

February 22, 2010 (Updated March 7, 2010, August 10, 2010)

As mentioned in Chapter 8 of the “Expert Oracle Practices” book, there are several ways to measure the amount of CPU time used, whether at the operating system level, at the instance-wide level, at the session level, or at the SQL statement level.  In a previous blog article I described why driving the database server’s CPUs to 100% utilization was a bad idea and provided quotes from several articles that disagreed as well as several that agreed with my statements (Cary Millsap’s follow-up comment should be read carefully).

I mentioned in the previous blog article that some CPUs will slow their core frequences if thermal thresholds are reached, which might be possible by pegging the CPU at 100% utilization for long periods of time.  Intel offers the Intel Processor Identification Utility for the Windows platform (similar utilities for other platforms and non-Intel processors probably also exist) that shows the rated processor speed and the current CPU speed for each CPU/core in the server.  For example, the following is the output from the Intel program when run on my three year old laptop:

Here is the same laptop, captured in a different time period.  This particular slow down was not caused by a thermal threshold problem, rather it was created by setting the Windows power plan to Power Saver:

In the above, notice that the reported core speed for both CPU cores is 50% of the expected CPU speed, and even the system bus speed dropped slightly (I have seen lower CPU speeds and system bus speeds when the computer wakes from hibernation mode).  (Edit: See the pictures at the end of the article for examples of slower speeds 8+ hours after coming out of hibernation mode.)

Let’s jump into Oracle now (we will use Oracle Database 11.2.0.1 and 11.1.0.6).  There are several statistics in the database views that report CPU utilization.  One such statistic is CPU used by this session.  This is a potentially challenging statistic to use.  From the documentation:

CPU used by this session: Amount of CPU time (in 10s of milliseconds) used by a session from the time a user call starts until it ends. If a user call completes within 10 milliseconds, the start and end user-call time are the same for purposes of this statistics, and 0 milliseconds are added. A similar problem can exist in the reporting by the operating system, especially on systems that suffer from many context switches.

CPU used when call started: The CPU time used when the call is started

If we try to understand the above, the CPU used by this session statistic is the time, in centiseconds, of CPU utilization while a user call is running on the server’s CPUs.  The second statistic, CPU used when call started, indicates the value of the CPU used by this session statistic when the user call was started.  The documentation leaves a couple of questions unanswered:

  • When are these statistics updated?  If the CPU used by this session statistic is updated in real-time (if you are wondering, it is Not), the CPU used when call started statistic could be used to determine how long the session has been on the CPU (and not in a wait event and not in one of the other CPU run states), but if the statistics are not updated in real time, what is the point of having two separate statistics?
  • What is considered a user call – is it the full execution of a user’s SQL statement, or is it a fetch of a block from disk, or is it the completion of a fetch of some of the result rows, or is a user call something entirely different?

We need a test case to see what is happening with these statistics, and several of the other statistics that measure CPU utilization (warning, this test table may take an hour or longer to create).

CREATE TABLE T1 (
  ID NUMBER,
  DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
  CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
  'This is the long description for this number '|| TO_CHAR(CEIL(ABS(SIN(ROWNUM/9.9999)*10000)))
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=10000),
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=10000);

COMMIT;

CREATE INDEX IND_T1 ON T1(ID);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE, METHOD_OPT=>'FOR ALL COLUMNS SIZE 1')

Once the test table is created, disconnect from the database and then reconnect to the database to reset the session-level statistic values.  Note that this test is performed with the FILESYSTEMIO_OPTIONS parameter set to SETALL to enable direct, asychronous I/O.  Next let’s determine the SID for the session:

SELECT
  SID
FROM
  V$MYSTAT
WHERE
  ROWNUM=1;

 SID
----
 552

Start up a second SQL*Plus connection to the database, and prepare to execute the following script (documentation references: V$OSSTAT, V$SESS_TIME_MODEL, V$SESSTAT):

SPOOL CPU_STATS.TXT
SET PAGESIZE 2000
SET ARRAYSIZE 1000

SELECT
  STAT_NAME,
  VALUE
FROM
  V$OSSTAT
WHERE
  STAT_NAME IN ('SYS_TIME','USER_TIME','BUSY_TIME','IDLE_TIME','IOWAIT_TIME','NICE_TIME','RSRC_MGR_CPU_WAIT_TIME')
ORDER BY
  STAT_NAME;

SELECT
  STAT_NAME,
  VALUE
FROM
  V$SESS_TIME_MODEL
WHERE
  STAT_NAME='DB CPU'
  AND SID=552
ORDER BY
  STAT_NAME;

SELECT
  SN.NAME,
  SS.VALUE
FROM
  V$SESSTAT SS,
  V$STATNAME SN
WHERE
  SS.SID=552
  AND SS.STATISTIC#=SN.STATISTIC#
  AND (SN.NAME LIKE 'CPU%'
       OR SN.NAME LIKE '%cpu%'
       OR SN.NAME LIKE '%consistent%')
ORDER BY
  SN.NAME;

host sleep 60

SELECT
  STAT_NAME,
  VALUE
FROM
  V$OSSTAT
WHERE
  STAT_NAME IN ('SYS_TIME','USER_TIME','BUSY_TIME','IDLE_TIME','IOWAIT_TIME','NICE_TIME','RSRC_MGR_CPU_WAIT_TIME')
ORDER BY
  STAT_NAME;

SELECT
  STAT_NAME,
  VALUE
FROM
  V$SESS_TIME_MODEL
WHERE
  STAT_NAME='DB CPU'
  AND SID=552
ORDER BY
  STAT_NAME;

SELECT
  SN.NAME,
  SS.VALUE
FROM
  V$SESSTAT SS,
  V$STATNAME SN
WHERE
  SS.SID=552
  AND SS.STATISTIC#=SN.STATISTIC#
  AND (SN.NAME LIKE 'CPU%'
       OR SN.NAME LIKE '%cpu%'
       OR SN.NAME LIKE '%consistent%')
ORDER BY
  SN.NAME;

host sleep 60

...

SPOOL OFF

In the above script, repeat the three SQL statements (replacing 552 with the SID number returned by your query) between sets of host sleep 60 calls as frequently as required, possibly 30 times, so that the full run of the SQL statement that will be executed by the first session is captured (see page 213 of the “Expert Oracle Practices” book for a Windows equivalent of the Unix/Linux sleep command) .

In the first session, prepare to execute the following script:

SET AUTOTRACE TRACEONLY STATISTICS
SET ARRAYSIZE 1000

SELECT /*+ INDEX(T1) */
  ID,
  DESCRIPTION
FROM
  T1
WHERE
  ID BETWEEN 1 AND 20;

The index hint is added in the above just to make certain that the IND_T1 index is used when we select 1.27% of the rows from the test table.  Start the monitoring script in the second session and immediately start the above script in the first session. 

When the first session completes, it should output something like the following (from the 11.2.0.1 test):

127325 rows selected.

Statistics
---------------------------------------------------
          1  recursive calls
          0  db block gets
     127794  consistent gets
     121986  physical reads
          0  redo size
    7084492  bytes sent via SQL*Net to client
       1757  bytes received via SQL*Net from client
        129  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     127325  rows processed

Or the output might look something like the following (from the 11.1.0.6 test):

127325 rows selected.

Statistics
---------------------------------------------------
        154  recursive calls
          0  db block gets
     127727  consistent gets
     129179  physical reads
          0  redo size
    7084492  bytes sent via SQL*Net to client
       1757  bytes received via SQL*Net from client
        129  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     127325  rows processed

If we take the spool file output that was generated by the second session and calculate the delta values for the statistics, we find the following statistic deltas were captured during each 60 second time period for the 11.2.0.1 test run:

STAT_NAME                                                        1        2        3        4        5        6        7        8        9       10
------------------------------------------------------    -------- -------- -------- -------- -------- -------- -------- -------- -------- --------
IDLE_TIME                                                   48,854   48,753   48,672   48,645   47,920   47,666   48,425   48,652   48,648   48,664
BUSY_TIME                                                      151      147      155      160      896    1,158      426      172      177      162
USER_TIME                                                       29       21       29       32      731    1,001      285       43       48       53
SYS_TIME                                                        55       55       57       54       96       91       69       58       57       49
IOWAIT_TIME                                                  5,825    6,002    5,980    5,978    6,317    6,394    6,087    6,017    6,007    4,750
NICE_TIME                                                        0        0        0        0        0        0        0        0        0        0
RSRC_MGR_CPU_WAIT_TIME                                           0        0        0        0        0        0        0        0        0        0

DB CPU                                                     483,927  485,925  503,924  525,920  471,928  357,946  443,932  462,930  460,930  369,943

CPU used when call started                                      48       45       52       51       48       37       44       47       45       35
CPU used by this session                                        48       45       52       51       48       37       44       47       45       38
parse time cpu                                                   0        0        0        0        0        0        0        0        0        0
recursive cpu usage                                              0        0        0        0        0        0        0        0        0        0
consistent gets                                             13,379   13,659   13,767   14,343   11,657    9,834   12,945   13,666   13,714   10,830
consistent gets from cache                                  13,379   13,659   13,767   14,343   11,657    9,834   12,945   13,666   13,714   10,830
no work - consistent read gets                              13,377   13,659   13,767   14,343   11,657    9,834   12,945   13,666   13,714   10,830
consistent gets from cache (fastpath)                          692      602      622      707      448      442      574      587      569      564
consistent gets - examination                                    2        0        0        0        0        0        0        0        0        0
consistent gets direct                                           0        0        0        0        0        0        0        0        0        0
consistent changes                                               0        0        0        0        0        0        0        0        0        0
transaction tables consistent reads - undo records applied       0        0        0        0        0        0        0        0        0        0
transaction tables consistent read rollbacks                     0        0        0        0        0        0        0        0        0        0
data blocks consistent reads - undo records applied              0        0        0        0        0        0        0        0        0        0
cleanouts only - consistent read gets                            0        0        0        0        0        0        0        0        0        0
rollbacks only - consistent read gets                            0        0        0        0        0        0        0        0        0        0
cleanouts and rollbacks - consistent read gets                   0        0        0        0        0        0        0        0        0        0

The 11.1.0.6 test run (executed after the 11.2.0.1 run completed without shutting down the 11.2.0.1 instance that was running on the same server) showed the following statistics:

STAT_NAME                                                        1        2        3        4
------------------------------------------------------    -------- -------- -------- --------
IDLE_TIME                                                   48,470   48,536   48,547   48,568
BUSY_TIME                                                      593      293      274      236
USER_TIME                                                      345       47       42       35
SYS_TIME                                                       117       87       83       59
IOWAIT_TIME                                                  6,394    5,878    5,887    5,460
NICE_TIME                                                        0        0        0        0
RSRC_MGR_CPU_WAIT_TIME                                           0        0        0        0

DB CPU                                                     669,899  862,868  845,872  671,898

CPU used when call started                                      68       86       82       67
CPU used by this session                                        68       86       82       68
parse time cpu                                                   2        0        0        0
recursive cpu usage                                              1        0        0        0
consistent gets                                             24,985   35,023   35,024   32,699
consistent gets from cache                                  24,985   35,023   35,024   32,699
no work - consistent read gets                              24,972   35,023   35,024   32,699
consistent gets from cache (fastpath)                       24,296   34,076   34,067   31,802
consistent gets - examination                                   13        0        0        0
consistent gets direct                                           0        0        0        0
consistent changes                                               0        0        0        0
transaction tables consistent reads - undo records applied       0        0        0        0
transaction tables consistent read rollbacks                     0        0        0        0
data blocks consistent reads - undo records applied              0        0        0        0
cleanouts only - consistent read gets                            0        0        0        0
rollbacks only - consistent read gets                            0        0        0        0
cleanouts and rollbacks - consistent read gets                   0        0        0        0

As evidenced by the above, the SQL statement completed in just less than 10 minutes in Oracle 11.2.0.1, while the SQL statement in 11.1.0.6 completed in just less than 4 minutes.  Why the significant time difference?  Maybe the above stats provide a clue?

The IDLE_TIME and BUSY_TIME statistics are server-wide with each of the server’s CPUs accumulating one second for every second of elapsed time, with the one second split between these two statistics.  USER_TIME is a server-wide statistic that indicates how many CPU centiseconds were spent executing application/Oracle level activity.  SYS_TIME is a server-wide statistic that indicates how many CPU centiseconds were spent executing operating system code (in kernel mode), typically on behalf of another program that usually accumulates CPU consumption time in the USER_TIME statistic.  In the above, DB CPU is accumulated at the session level and must be divided by 1,000,000 to convert the values to seconds.  The CPU used when call started and CPU used by this session statistics are at the session level and are reported in centiseconds.  If we look at the first time period for the Oracle 11.1.0.6 execution, we are able to derive the following statistics:

  • (48,470 idle centiseconds + 593 busy centiseconds) / (8 CPUs)  = 6132.875 centiseconds = 61.33 seconds in the time period
  • (593 / 100) = CPUs were used for 5.93 seconds of a total possible (61.33 * 8 ) CPU seconds in the time period
  • (593 busy centiseconds) / (48,470 idle centiseconds + 593 busy centiseconds) * 100 = CPUs were 1.21% busy
  • (345 user mode centiseconds) / (345 user mode centiseconds + 117 kernel mode centiseconds) * 100 = 74.68% of the CPU time was consumed in user mode
  • (669,899 / 1,000,000) = Monitored session consumed 0.6699 seconds of CPU time in the 61.33 second time period
  • (669,899 / 1,000,000) / (345 / 100) * 100 = Monitored session consumed 19.41% of the user mode CPU time on the server
  • (68 / 100) = Session of interest consumed 0.68 seconds of CPU time in the 61.33 second time period
  • (2 / 100) = 0.02 seconds session spent parsing the SQL statement
  • (1 / 100) = 0.01 seconds session spent performing recursive calls to optimize the SQL statement
  • (24,296 / 24,985) * 100 = 97.24% of the consistent gets were performed by fastpath (the statistic definition is apparently not documented – I suspect that it might mean that the blocks were read by direct path reads, thus bypassing the buffer cache – but I suppose that it might mean instead that multiple single-block read requests were combined into a single parallel read request)

The DB CPU statistic in the V$SESS_TIME_MODEL view and the various statistics in the V$OSSTAT view are updated in near real-time – the monitored session only consumed about 0.6 seconds of CPU time for every 60 seconds of execution time.  You will notice that for both Oracle Database 11.2.0.1 and 11.1.0.6 the delta values for the CPU used when call started statistic and the CPU used by this session statistic appear as something other than the number 0 throughout the execution time, so the statistics must be updated at least once a minute in this test run, but that appears to contradict the results shown in Table 8-1 on page 201 of the “Expert Oracle Practices” book where those statistics were only updated 5 minutes and 35 seconds after a SQL statement was submitted (in the book, the CPU used by this session statistic was updated to show only the parse CPU time, until 5 minutes and 35 seconds elapsed).  Is the book wrong, or are the above statistic deltas wrong, or maybe there is another explanation?

Fortunately, while the SQL statement executed, I also captured several statistic delta values from V$SQL for the SQL statement using my Toy Project for Performance Tuning, but a script like that found on page 216 of the “Expert Oracle Practices” book will also collect similar statistics.  The captured deltas for the SQL statement on Oracle 11.2.0.1 are as follows (captured roughly every 60 seconds):

CPU Time  Elapsed Time  Parse Calls  Executions  Fetches  Users Opening  Users Executing  Invalidations  Loads  Disk Reads  Buffer Gets
    0.47         56.84            1           1       13              1                1              0      1      12,148       12,840
    0.50         63.62            0           0       15              0                1              0      0      13,648       14,251
    0.52         64.14            0           0       14              0                1              0      0      13,801       14,514
    0.54         61.81            0           0       15              0                1              0      0      13,922       14,572
    0.45         61.15            0           0       11              0                1              0      0      10,598       11,060
    0.44         69.25            0           0       12              0                1              0      0      11,559       12,036
    0.47         64.16            0           0       14              0                1              0      0      13,318       14,041
    0.48         62.28            0           0       14              0                1              0      0      13,439       14,071
    0.47         64.97            0           0       14              0                1              0      0      13,823       14,386
    0.20         26.24            0           0        7              0                0              0      0       5,730        6,023

There were roughly 13 fetches in the first minute, roughly 15 fetches in the second minute, 14 fetches in the third minute, etc.  The statistics in V$SQL are updated at least once every 6 seconds, so the displayed CPU time values (shown in seconds) are reasonably close to actual CPU consumption times.  You might recognize that the above CPU Time delta values are very close to the delta values shown for the CPU used by this session statistic, so does that mean that the CPU used by this session statistic is updated every 6 seconds or less?  That brings us back to the question, what is considered by the documentation as a “user call” in regard to these two statistics?  Given the above evidence, a “user call” in this context actually includes fetch calls.

Because of the inconsistency with which the CPU used by this session statistic is updated (it might not be updated for more than an hour while executing certain SQL statements), it is a better idea to use the DB CPU statistic found in the V$SESS_TIME_MODEL view (available starting with the Oracle 10.1 release) when trying to determine how much CPU time a particular session is consuming.  Of course operating system tools such as ps and top or my VBS script for Windows may also be used to determine which sessions are consuming CPU time.  The V$SESSION and V$SESSION_WAIT views are also able to help you determine which sessions are currently consuming CPU time and roughly how much CPU time has been consumed since the last wait event ended.

But a parallel question remains, why is a particular session consuming CPU time (the above test case is not necessarily the best example for showing uncontrolled CPU usage, try this test case with the OPTIMIZER_FEATURES_ENABLE parameter set to 10.2.0.4 or lower at the session level)?  The  V$SESSION_LONGOPS and V$SQL_MONITOR (introduced in Oracle 11.1, requires a Tuning Pack license to access) views might provide enough information to determine why a session is consuming CPU time.  A 10046 extended SQL trace could also potentially show why a session is consuming an excessive amount of CPU time, unless, of course, nothing is written to the trace files because the session is performing a CPU intensive operation completely in memory.  What then?  Multiple ORADEBUG SHORT_STACK traces (there is a small chance that creating one of these dumps will crash the session) for the session with the help of Metalink Doc ID 175982.1 could provide a clue of what is happening in the session.  If that does not help find the cause of the CPU usage, you could just declare that the Oracle Wait Interface is Useless (Sometimes) and move along to something more productive (I suggest reviewing that blog series).

Added March 7, 2010:
I thought that the laptop was running a bit sluggish today.  It has been in and out of hibernation on a daily basis for the last two weeks.  I thought that I would check the CPU speed just for fun – this is what I received on 2 checks of the performance with the Power Plan set to High Performance and with the laptop plugged into the electrical power:

I guess that explains why this darn laptop feels so slow – as I write this update to the blog article I see that the core frequency has again dropped:

Is Intel trying to strong arm me into buying one of those Core i7 920QX laptops, or is Microsoft trying to get me to upgrade to Windows 7 on this laptop?

—————-

Update August 10, 2010:

The CPU speed has dipped to 0.210GHz several times, possibly caused by overheating:

——

I think that I finally have a workable solution:





Transitive Closure Causing an Execution Plan Short-Circuit

20 02 2010

February 20, 2010

A recent thread on OTN asked the following question:

Sometimes predicates can be generated due to Transitive Closure, but is it possible to see below mentioned behaviour.  Is it possible to create test case for observing this behaviour?

You can even have cases like this one: if n1 > 10 and n1 < 0, then 0 > 10, which is always false, and therefore can short-circuit an entire branch of an execution plan.

In the OTN thread, Jonathan Lewis directly answered the question posted by the original poster, and I provided an answer to a similar question that involved two columns in a test table.  The setup for the test case included the following SQL statements:

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  ROWNUM C2
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

(Note that submitting the test case using my VBS tool makes short work of temporarily changing the STATISTICS_LEVEL and enabling a 10046 trace while generating a DBMS_XPLAN for a SQL statement: Automated DBMS_XPLAN, Trace, and Send to Excel )

The SQL to generate the test case follows:

ALTER SESSION SET STATISTICS_LEVEL='ALL';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SQL_SHORT_CIRCUIT';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

SELECT
  *
FROM
  T1
WHERE
  C1<=100
  AND C2>=10000
  AND C1>C2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SQL_ID  3uk9dajd5cn74, child number 0
-------------------------------------
SELECT T1.* FROM T1 WHERE     C1<=100    AND C2>=10000    AND C1>C2

Plan hash value: 3332582666

---------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   |
---------------------------------------------------------------------------
|*  1 |  FILTER            |      |      1 |        |      0 |00:00:00.01 |
|*  2 |   TABLE ACCESS FULL| T1   |      0 |      1 |      0 |00:00:00.01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NULL IS NOT NULL)
   2 - filter(("C1">"C2" AND "C2"<100 AND "C1"<=100 AND "C1">10000 AND
              "C2">=10000))

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

Notice in the execution plan that the Starts column for the TABLE ACCESS FULL operation is 0 – that line in the execution plan was never executed.  By reviewing the 10046 trace file, you could further confirm that there were no physical reads when the SQL statement executed, which confirms that the full table scan operation was never performed:

PARSING IN CURSOR #2 len=68 dep=0 uid=31 oct=3 lid=31 tim=3008717127 hv=1515606244 ad='a775f1c8'
SELECT T1.* FROM T1 WHERE 
  C1<=100
  AND C2>=10000
  AND C1>C2
END OF STMT
PARSE #2:c=0,e=2320,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=3008717120

EXEC #2:c=0,e=123,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=3008717834
WAIT #2: nam='SQL*Net message to client' ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=3008717930
FETCH #2:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=3008718038
WAIT #2: nam='SQL*Net message from client' ela= 1777 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=3008719953
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='FILTER  (cr=0 pr=0 pw=0 time=5 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=114196 op='TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 time=0 us)'




What is the Meaning of the %CPU Column in an Explain Plan?

19 02 2010

February 19, 2010

(Forward to the Next Post in the Series)

A question recently appeared on the OTN forums asking what %CPU means in an explain plan output.  I did not see a clear definition of that column in the documentation, so I set up a test case.  We will use the test table from this blog article.  Let’s try creating an explain plan on Oracle 11.2.0.1 for a query:

EXPLAIN PLAN FOR
SELECT
  T1.C1,
  T1.C2,
  T1.C3
FROM
  T1,
  (SELECT
    C1,
    C2
  FROM
    T1
  WHERE
    MOD(C1,3)=0) V
WHERE
  T1.C1=V.C1(+)
  AND V.C1 IS NULL
ORDER BY
  T1.C1 DESC;

The above command wrote a couple of rows into the PLAN_TABLE table.  At this point, we should probably consult the documentation to understand the columns in the PLAN_TABLE table.

COST: Cost of the operation as estimated by the optimizer’s query approach. Cost is not determined for table access operations. The value of this column does not have any particular unit of measurement; it is merely a weighted value used to compare costs of execution plans. The value of this column is a function of the CPU_COST and IO_COST columns.

IO_COST: I/O cost of the operation as estimated by the query optimizer’s approach. The value of this column is proportional to the number of data blocks read by the operation. For statements that use the rule-based approach, this column is null.

CPU_COST: CPU cost of the operation as estimated by the query optimizer’s approach. The value of this column is proportional to the number of machine cycles required for the operation. For statements that use the rule-based approach, this column is null.

We found a couple of interesting columns in the PLAN_TABLE table, so let’s query the table

SELECT
  ID,
  COST,
  IO_COST,
  CPU_COST
FROM
  PLAN_TABLE;

 ID  COST  IO_COST   CPU_COST
--- ----- -------- ----------
  0  1482     1467  364928495
  1  1482     1467  364928495
  2   898      887  257272866
  3   889      887   42272866
  4     0        0       2150

Now let’s display the execution plan:

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY);

Plan hash value: 1923834833

--------------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              | 99000 |  1836K|       |  1482   (2)| 00:00:18 |
|   1 |  SORT ORDER BY      |              | 99000 |  1836K|  2736K|  1482   (2)| 00:00:18 |
|   2 |   NESTED LOOPS ANTI |              | 99000 |  1836K|       |   898   (2)| 00:00:11 |
|   3 |    TABLE ACCESS FULL| T1           |   100K|  1367K|       |   889   (1)| 00:00:11 |
|*  4 |    INDEX UNIQUE SCAN| SYS_C0018049 |    10 |    50 |       |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."C1"="C1")
       filter(MOD("C1",3)=0)

The %CPU is 2 for ID 0, 1, and 2, and the %CPU is 1 for ID 3.  Let’s return to the query of the PLAN_TABLE table and perform a couple of calculations:

SELECT
  ID,
  COST,
  IO_COST,
  COST-IO_COST DIFF,
  CEIL(DECODE(COST,0,0,(COST-IO_COST)/COST)*100) PER_CPU,
  CPU_COST
FROM
  PLAN_TABLE;

 ID  COST  IO_COST  DIFF  PER_CPU   CPU_COST
--- ----- -------- ----- -------- ----------
  0  1482     1467    15        2  364928495
  1  1482     1467    15        2  364928495
  2   898      887    11        2  257272866
  3   889      887     2        1   42272866
  4     0        0     0        0       2150

In the above, I subtracted the IO_COST column from the COST column to derive the DIFF column.  I then divided the value in the DIFF column by the COST column, multiplied the result by 100 to convert the number to a percent, and then rounded up the result to derive the PER_CPU column.  The PER_CPU column seems to match the %CPU column in the DBMS_XPLAN output.  Let’s try another SQL statement:

DELETE FROM PLAN_TABLE;

EXPLAIN PLAN FOR
SELECT
  C1
FROM
  T1
WHERE
  'A'||C1 LIKE 'A%';

Now let’s run the query against the PLAN_TABLE table to see if we are able to predict the values that will appear in the %CPU column of the DBMS_XPLAN output:

SELECT
  ID,
  COST,
  IO_COST,
  COST-IO_COST DIFF,
  CEIL(DECODE(COST,0,0,(COST-IO_COST)/COST)*100) PER_CPU,
  CPU_COST
FROM
  PLAN_TABLE;

 ID  COST  IO_COST  DIFF  PER_CPU   CPU_COST
--- ----- -------- ----- -------- ----------
  0    54       52     2        4   43331709
  1    54       52     2        4   43331709

The above indicates that the %CPU column should show the number 4 on both rows of the execution plan.

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY);

Plan hash value: 2950179127

-------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes | Cost (%CPU)|  Time    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |  5000 | 25000 |    54   (4)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| SYS_C0018049 |  5000 | 25000 |    54   (4)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter('A'||TO_CHAR("C1") LIKE 'A%')

One of my previous blog articles showed the following execution plan – this was the actual plan displayed by DBMS_XPLAN.DISPLAY_CURSOR after the SQL statement executed:

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   247 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access(:Z>=:Z AND :Z<=:Z)
       filter(("C1"<=10000 AND "C1">=1)) 

Is there anything strange about the %CPU column in the above plan?

Incidentally, a query of SYS.AUX_STATS$ shows the following output (values are used to determine the impact of the CPU_COST column that is displayed in the PLAN_TABLE table):

SELECT
  PNAME,
  PVAL1
FROM
  SYS.AUX_STATS$
WHERE
  PNAME IN ('CPUSPEED','CPUSPEEDNW');

PNAME           PVAL1
---------- ----------
CPUSPEEDNW   2031.271
CPUSPEED 







Follow

Get every new post delivered to your Inbox.

Join 143 other followers