CPU Wait? LAG to the Rescue

26 02 2010

February 26, 2010

A question in a recent OTN thread appeared as follows:

I’m in RAC database (10gR2/Redhat4). I need to store the real CPU wait every 1 minute in a table for a month. For that I thing to query the GV$SYS_TIME_MODEL [for the "DB CPU" statistic].

This is a very difficult question to answer.  Why (additional reference)?  Technically, in Oracle when a session is “on the CPU”, that session is not in a wait event, and therefore not officially waiting for anything. The “DB CPU” statistic captures the accumulated time spent on the CPU by foreground processes – the user sessions. This CPU time does not include the time that the background processes (DBWR, LGWR, PMON, etc.) spend consuming CPU time. Additionally, the “DB CPU” statistic does not consider/accumulate CPU time consumed by processes that are not related to the database instance.  It could also be said that the “DB CPU” time does not account for time that the session spends waiting for its turn to execute on the CPUs.

With the above in mind, let’s see if we are able to calculate the amount of CPU time consumed by the sessions and the background processes in one minute intervals.  First, we need a logging table.  The following SQL statement builds the SYS_TIME_MODEL_CPU table using a couple of the column definitions from the GV$SYS_TIME_MODEL view so that I do not need to explicitly state the column data types (notice that the SQL statement is collapsing data from two source rows into a single row):

CREATE TABLE SYS_TIME_MODEL_CPU AS
SELECT
  SYSDATE CHK_ID,
  INST_ID,
  SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
  SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
FROM
  GV$SYS_TIME_MODEL
WHERE
  0=1
GROUP BY
  INST_ID;

If we are able to find a way to schedule the following SQL statement to execute once a minute, we will be able to store the current values of the “DB CPU” and “background cpu time” statistics with the following SQL statement (note that executing this SQL statement will also consume CPU time, the very thing we are trying to measure):

INSERT INTO SYS_TIME_MODEL_CPU
SELECT
  SYSDATE CHK_ID,
  INST_ID,
  SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
  SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
FROM
  GV$SYS_TIME_MODEL
WHERE
  STAT_NAME IN ('DB CPU','background cpu time')
GROUP BY
  INST_ID;

One way to schedule the SQL statement to execute once a minute is to use the DBMS_LOCK.SLEEP function in a loop.  Unfortunely, on some platforms the function may not wait exactly the specified number of seconds (it may wait slightly longer), and may cause the “PL/SQL lock timer” wait event to steal a position in the top 5 wait events list in a Statspack or AWR report.  For testing purposes, the following anonymous PL/SQL script might be used:

DECLARE
  STime DATE := SYSDATE;
BEGIN
  WHILE (SYSDATE - STime) < 32 LOOP
    INSERT INTO SYS_TIME_MODEL_CPU
      SELECT
        SYSDATE CHK_ID,
        INST_ID,
        SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
        SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
      FROM
        GV$SYS_TIME_MODEL
      WHERE
        STAT_NAME IN ('DB CPU','background cpu time')
      GROUP BY
        INST_ID;

      COMMIT;
      DBMS_LOCK.SLEEP(60);
  End Loop;
End;
/

If we allow the script to run for a couple of minutes (rather than 31 days), we are able to determine how much CPU time was consumed every minute by using the LAG analytic function, as shown below:

SELECT
  TO_CHAR(CHK_ID,'YYYY-MM-DD HH24:MI') CHK_ID,
  INST_ID,
  DB_CPU-LAG(DB_CPU,1) OVER (PARTITION BY INST_ID ORDER BY CHK_ID) DB_CPU,
  BACKGROUND_CPU-LAG(BACKGROUND_CPU,1) OVER (PARTITION BY INST_ID ORDER BY CHK_ID) BACKGROUND_CPU
FROM
  SYS_TIME_MODEL_CPU
ORDER BY
  CHK_ID;

CHK_ID              INST_ID     DB_CPU BACKGROUND_CPU
---------------- ---------- ---------- --------------
2010-02-24 07:18          1
2010-02-24 07:19          1   59990544          66070
2010-02-24 07:20          1   59951475          66724
2010-02-24 07:21          1   59985268          71768
2010-02-24 07:22          1   60000569          63694
2010-02-24 07:23          1   60002938          71639
2010-02-24 07:24          1   59978651          63770
2010-02-24 07:25          1   61487141          62785
2010-02-24 07:26          1      24194          76990

To determine the number of seconds of CPU time consumed, the values shown in the DB_CPU and BACKGROUND_CPU columns should be divided by 1,000,000.

Why not just use AWR data to obtain this information?  Just because AWR is built-in does not mean that it is free to use the features of AWR (as I attempted to argue in this OTN thread) – this is a fact that is often glossed over by various books, blog articles, “news” articles, and even the Oracle Database documentation when it states that AWR reports are the replacement for Statspack reports.


Actions

Information

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 140 other followers

%d bloggers like this: