Consistently Inconsistent Consistent Gets

16 03 2010

March 16, 2010

You might be aware that the number of consistent gets performed during the execution of a SQL statement will vary as the execution plan for a SQL statement changes.  You might be aware that the number of consistent gets for a SQL statement will vary depending on the block size used by the database.  You might be aware that the number of consistent gets will vary depending on the clustering factor of the indexes used by the SQL statement change (assuming that the table rows are actually accessed).  You might be aware that moving a table and rebuilding  its indexes could cause the number of consistent gets for a SQL statement to change.  But were you aware that there are other reasons for the number of consistent gets to change?  Let’s set up a simple test table in the database that has 1,000,000 rows and an index on the primary key column:

CREATE TABLE T5 (
  C1 VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (C1));

INSERT INTO T5 NOLOGGING
SELECT
  'A'||TO_CHAR(ROWNUM,'0000000'),
  RPAD('A',100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T5',CASCADE=>TRUE)

In Oracle Database, for instance release 11.1.0.6 on 64 bit Linux, we craft a silly SQL statement that uses the test table.  The silly SQL statement’s goal is just to drive up the number of consistent gets to allow us to peg the server’s CPU.  This is the silly SQL statement that essentially joins table T5 to itself multiple times:

SELECT /*+ USE_NL(T51 T52 T53 T54 T55 T56 T57 T58) */
  T51.C1,
  T51.C2,
  T52.C1,
  T52.C2,
  T53.C1,
  T53.C2,
  T54.C1,
  T54.C2,
  T55.C1,
  T55.C2,
  T56.C1,
  T56.C2,
  T57.C1,
  T57.C2,
  T58.C1,
  T58.C2
FROM
  T5 T51,
  T5 T52,
  T5 T53,
  T5 T54,
  T5 T55,
  T5 T56,
  T5 T57,
  T5 T58
WHERE
  T51.C1=T52.C1
  AND T51.C1=T53.C1
  AND T51.C1=T54.C1
  AND T51.C1=T55.C1
  AND T51.C1=T56.C1
  AND T51.C1=T57.C1
  AND T51.C1=T58.C1
  AND T51.C1 BETWEEN 'A 0000000' AND 'A 1000000';

In the above SQL statement I have forced through a hint a nested loop join between the various aliases for the table T5 to force the use of the index, rather than letting the optimizer use full table scans and hash joins.  Let’s see how the array fetch size affects the number of consistent gets for the above SQL statement.  Articles on other blogs have shown demonstrations of the same effect of changing the array fetch size, but stay with me, there is a twist.  The script:

SPOOL consistent_gets.txt

SET AUTOTRACE TRACEONLY STATISTICS
SET TIMING ON

SET ARRAYSIZE 1

SELECT /*+ USE_NL(T51 T52 T53 T54 T55 T56 T57 T58) */
  T51.C1,
  T51.C2,
  T52.C1,
  T52.C2,
  T53.C1,
  T53.C2,
  T54.C1,
  T54.C2,
  T55.C1,
  T55.C2,
  T56.C1,
  T56.C2,
  T57.C1,
  T57.C2,
  T58.C1,
  T58.C2
FROM
  T5 T51,
  T5 T52,
  T5 T53,
  T5 T54,
  T5 T55,
  T5 T56,
  T5 T57,
  T5 T58
WHERE
  T51.C1=T52.C1
  AND T51.C1=T53.C1
  AND T51.C1=T54.C1
  AND T51.C1=T55.C1
  AND T51.C1=T56.C1
  AND T51.C1=T57.C1
  AND T51.C1=T58.C1
  AND T51.C1 BETWEEN 'A 0000000' AND 'A 1000000';

SET ARRAYSIZE 1

/

SET ARRAYSIZE 10

/

SET ARRAYSIZE 50

/

SET ARRAYSIZE 100

/

SET ARRAYSIZE 200

/

SET ARRAYSIZE 500

/

SET ARRAYSIZE 1000

/

SET ARRAYSIZE 2000

/

SET ARRAYSIZE 5000

/

SPOOL OFF

We will throw out the first execution with the array fetch size set at 1 (the client computer is using the 11.1.0.7 client) so that the recursive call does not throw off the timing.  The summarized output of the above script follows:

           Fetch Array:  1            10           50          100          200          500          1000         2000         5000
Elapsed             00:05:36.83  00:02:30.86  00:01:47.15  00:01:38.21  00:01:29.87  00:01:24.80  00:01:23.08  00:01:22.25  00:01:22.49
recursive calls               0            0            0            0            0            0            0            0            0
db block gets                 0            0            0            0            0            0            0            0            0
consistent gets      12,554,387    9,453,633    8,822,789    8,696,435    8,625,467    8,582,840    8,568,603    8,561,488    8,559,626
physical reads                0            0            0            0            0            0            0            0            0
redo size                     0            0            0            0            0            0            0            0            0
bytes sent via SQL  129,001,789   94,601,789   87,721,789   86,861,789   86,431,789   86,173,789   86,087,789   86,044,789   86,018,989
bytes received via    5,500,349    1,100,349      220,349      110,349       55,349       22,349       11,349        5,849        2,549
SQL*Net roundtrips      500,001      100,001       20,001       10,001        5,001        2,001        1,001          501          201
sorts (memory)                0            0            0            0            0            0            0            0            0
sorts (disk)                  0            0            0            0            0            0            0            0            0
rows processed        1,000,000    1,000,000    1,000,000    1,000,000    1,000,000    1,000,000    1,000,000    1,000,000    1,000,000

As you can see from the above, when the fetch array size was set to 1, SQL*Plus actually operated as if the fetch array size was set to 2.  As the fetch array size increased, the execution time for the query decreased until the fetch array size of 5,000 was attempted.  Accompanying the decrease in the execution time is a decrease in the number of consistent gets, bytes sent across the network, and the number of round trips.  Nice, so where is the twist, is it just that the time increased when the fetch array size was increased to 5,000?

Maybe we should repeat the test on Oracle Database 11.2.0.1, which also used the default 8KB block size and runs on 64 bit Linux.  These are the summarized results, excluding the output that did not change significantly from the test run on Oracle 11.1.0.6:

           Fetch Array:  1            10           50          100          200          500          1000         2000         5000
Elapsed             00:05:39.97  00:02:16.71  00:01:36.36  00:01:27.57  00:01:24.48  00:01:22.77  00:01:22.15  00:01:21.96  00:01:22.24
consistent gets         516,378      116,378       36,378       26,378       21,378       18,378       17,378       16,878       16,578

When the fetch array size was specified as 1, the number of consistent gets dropped from 12,554,387 to 516,378 when run on Oracle 11.2.0.1, yet the execution time is almost identical to that achieved on 11.1.0.6.  When the fetch array size was specified as 5,000, the number of consistent gets dropped from 8,559,626 to 16,578, yet the execution time was almost identical to that achieved on 11.1.0.6.  Quite a substantial decrease in the number of consistent gets from one release to another.

Anyone want to take an educated guess as to what caused the decrease in the number of consistent gets?

Take a guess before scrolling down.

Here is the DBMS_XPLAN output for Oracle 11.1.0.6 for the SQL statement:

SQL_ID  7v7q3k01y4r0z, child number 0
-------------------------------------
SELECT /*+ USE_NL(T51 T52 T53 T54 T55 T56 T57 T58) */    T51.C1,   
T51.C2,    T52.C1,    T52.C2,    T53.C1,    T53.C2,    T54.C1,   
T54.C2,    T55.C1,    T55.C2,    T56.C1,    T56.C2,    T57.C1,   
T57.C2,    T58.C1,    T58.C2  FROM    T5 T51,    T5 T52,    T5 T53,   
T5 T54,    T5 T55,    T5 T56,    T5 T57,    T5 T58  WHERE   
T51.C1=T52.C1    AND T51.C1=T53.C1    AND T51.C1=T54.C1    AND
T51.C1=T55.C1    AND T51.C1=T56.C1    AND T51.C1=T57.C1    AND
T51.C1=T58.C1    AND T51.C1 BETWEEN 'A 0000000' AND 'A 1000000'

Plan hash value: 2422939766

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS                       |              |      1 |        |   1000K|00:00:37.09 |    8696K|
|   2 |   NESTED LOOPS                      |              |      1 |   1000K|   1000K|00:00:34.08 |    7696K|
|   3 |    NESTED LOOPS                     |              |      1 |   1000K|   1000K|00:00:31.07 |    7457K|
|   4 |     NESTED LOOPS                    |              |      1 |   1000K|   1000K|00:00:26.06 |    6218K|
|   5 |      NESTED LOOPS                   |              |      1 |   1000K|   1000K|00:00:21.05 |    4980K|
|   6 |       NESTED LOOPS                  |              |      1 |   1000K|   1000K|00:00:16.04 |    3741K|
|   7 |        NESTED LOOPS                 |              |      1 |   1000K|   1000K|00:00:11.03 |    2503K|
|   8 |         NESTED LOOPS                |              |      1 |   1000K|   1000K|00:00:06.02 |    1264K|
|*  9 |          TABLE ACCESS FULL          | T5           |      1 |   1000K|   1000K|00:00:00.01 |   25906 |
|  10 |          TABLE ACCESS BY INDEX ROWID| T5           |   1000K|      1 |   1000K|00:00:04.28 |    1238K|
|* 11 |           INDEX UNIQUE SCAN         | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.96 |     238K|
|  12 |         TABLE ACCESS BY INDEX ROWID | T5           |   1000K|      1 |   1000K|00:00:03.80 |    1238K|
|* 13 |          INDEX UNIQUE SCAN          | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.63 |     238K|
|  14 |        TABLE ACCESS BY INDEX ROWID  | T5           |   1000K|      1 |   1000K|00:00:03.79 |    1238K|
|* 15 |         INDEX UNIQUE SCAN           | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.62 |     238K|
|  16 |       TABLE ACCESS BY INDEX ROWID   | T5           |   1000K|      1 |   1000K|00:00:03.78 |    1238K|
|* 17 |        INDEX UNIQUE SCAN            | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.60 |     238K|
|  18 |      TABLE ACCESS BY INDEX ROWID    | T5           |   1000K|      1 |   1000K|00:00:03.79 |    1238K|
|* 19 |       INDEX UNIQUE SCAN             | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.60 |     238K|
|  20 |     TABLE ACCESS BY INDEX ROWID     | T5           |   1000K|      1 |   1000K|00:00:03.79 |    1238K|
|* 21 |      INDEX UNIQUE SCAN              | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.62 |     238K|
|* 22 |    INDEX UNIQUE SCAN                | SYS_C0015042 |   1000K|      1 |   1000K|00:00:01.61 |     238K|
|  23 |   TABLE ACCESS BY INDEX ROWID       | T5           |   1000K|      1 |   1000K|00:00:01.54 |    1000K|
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   9 - filter(("T51"."C1">='A 0000000' AND "T51"."C1"<='A 1000000'))
  11 - access("T51"."C1"="T52"."C1")
       filter(("T52"."C1">='A 0000000' AND "T52"."C1"<='A 1000000'))
  13 - access("T51"."C1"="T53"."C1")
       filter(("T53"."C1">='A 0000000' AND "T53"."C1"<='A 1000000'))
  15 - access("T51"."C1"="T54"."C1")
       filter(("T54"."C1">='A 0000000' AND "T54"."C1"<='A 1000000'))
  17 - access("T51"."C1"="T55"."C1")
       filter(("T55"."C1">='A 0000000' AND "T55"."C1"<='A 1000000'))
  19 - access("T51"."C1"="T56"."C1")
       filter(("T56"."C1">='A 0000000' AND "T56"."C1"<='A 1000000'))
  21 - access("T51"."C1"="T57"."C1")
       filter(("T57"."C1">='A 0000000' AND "T57"."C1"<='A 1000000'))
  22 - access("T51"."C1"="T58"."C1")
       filter(("T58"."C1">='A 0000000' AND "T58"."C1"<='A 1000000'))

Here is the DBMS_XPLAN output for Oracle 11.2.0.1 for the SQL statement:

SQL_ID  abk386qc1xwkg, child number 0
-------------------------------------
SELECT /*+ USE_NL(T51 T52 T53 T54 T55 T56 T57 T58) */    T51.C1,   
T51.C2,    T52.C1,    T52.C2,    T53.C1,    T53.C2,    T54.C1,   
T54.C2,    T55.C1,    T55.C2,    T56.C1,    T56.C2,    T57.C1,   
T57.C2,    T58.C1,    T58.C2  FROM    T5 T51,    T5 T52,    T5 T53,   
T5 T54,    T5 T55,    T5 T56,    T5 T57,    T5 T58  WHERE   
T51.C1=T52.C1    AND T51.C1=T53.C1    AND T51.C1=T54.C1    AND
T51.C1=T55.C1    AND T51.C1=T56.C1    AND T51.C1=T57.C1    AND
T51.C1=T58.C1    AND T51.C1 BETWEEN 'A 0000000' AND 'A 1000000'

Plan hash value: 2002323537

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |   1000K|00:00:00.50 |   26055 |
|*  1 |  TABLE ACCESS FULL| T5   |      1 |   1000K|   1000K|00:00:00.50 |   26055 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("T51"."C1">='A 0000000' AND "T51"."C1"<='A 1000000'))

Oracle ignored my hint – darn bugs.  Is that the twist?  :-)


Actions

Information

9 responses

16 03 2010
Gary

Revisiting blocks would be my guess. If Oracle can get all the information it needs in a single fetch, it only needs to grab the block once. With a small ARRAYSIZE it will need to go back to the same block several times.

17 03 2010
Charles Hooper

Hi Gary,

Thanks for stopping by and providing a comment. I was thinking the same thing regarding the changes to the ARRAYSIZE setting. My Faulty Quotes blog article about the buffer cache hit ratio metric references this document and this OTN thread where Jonathan Lewis described how Oracle is able to grab a block and pin it when the fetch array size is increased, thus reducing the consistent gets statistic.

When I first saw Oracle 11.2.0.1’s consistent gets statistic for this query, I immediately thought that maybe Oracle 11.2.0.1 had greatly improved the block pinning behavior. I was surprise, shocked, maybe even a little impressed when I saw that the 8 way self-join was collapsed into a single full table scan.

18 03 2010
Gary

It seems to fit case#2 in Christian Antognini’s post

http://antognini.ch/2010/01/join-elimination/

Maybe I’d get used to it if I was using 11gR2 as a primary platform.

28 04 2010
coskan

I think it can also be nice to add the single table FTS access consistent gets formula in terms of relationship of arraysize number of blocks and rows

consistent gets are more or less equal to the formula below for a single full table scan (depends on the tablespace management is auto or manual )

(NUM_ROWS / ARRAYSIZE)+NUM_BLOCKS.

Like Tom Kyte and “>Vivek Sharma explained

I am not sure if it can be formulated for joins but it is nice to know the formula

28 04 2010
Charles Hooper

Coskan,

Thank you for sharing the formula. That formula is either very close or exactly right – I cannot tell at the moment because I am showing that the table in this database contains 16217 blocks, which based on the formula you provided, suggests that the actual number of blocks in the original test table should be about 16,378 (this is possible, but I cannot check right now).

For those that decide to test the values shown for the fetch array size of 1, keep in mind that the article states: “As you can see from the above, when the fetch array size was set to 1, SQL*Plus actually operated as if the fetch array size was set to 2.”

Thank you for contributing to the article.

19 03 2010
Charles Hooper

Hi Gary,

Thanks for providing the link – I agree, his case #2 explains very clearly what happened during my test run on 11.2.0.1. I thought that the feature was called table elimination (where a parent table could be eliminated if a foreign key constraint is present and no columns from the parent are returned), but the 10053 trace shows that it is correctly referred to as join elimination.

It is impressive to see that this join elimination works with self joins, but I wonder if it would still work if there were different WHERE clause predicates for each of the aliases – for instance an EMPLOYEES table that lists the employee’s supervisor in one column, so a self join would be needed to determine for all of the employees, which employees they supervise. It might be something interesting to explore.

23 03 2010
Fetch as Much as You Can « I'm just a simple DBA on a complex production system

[...] Hooper blogged last week with a new reason to use large fetch sizes: Turns out that it allows Oracle to pin the block for the length of the fetch. He references an OTN [...]

9 04 2010
Donatello Settembrino

Hi Charles,
I read your interesting test that I came immediately to mind
a post by Christian Antognini (where to take part on and also mentioned by Gary)
that the “join elimination”. I ran your test
my DB version 11.2.0.1.0, getting your own results

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000K|   105M|  2717   (2)| 00:00:39 |
|*  1 |  TABLE ACCESS FULL| T5   |  1000K|   105M|  2717   (2)| 00:00:39 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T51"."C1">='A 0000000' AND "T51"."C1"
 
alter table t5 drop constraint SYS_C00807645;

obtain the following execution plan .. hint applies when
the constraint no longer exists ….identical to what you got with version 11.1.0.6 ….

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |  1000K|   846M|    19G  (2)|999:59:59 |
|   1 |  NESTED LOOPS            |      |  1000K|   846M|    19G  (2)|999:59:59 |
|   2 |   NESTED LOOPS           |      |  1000K|   741M|    16G  (2)|999:59:59 |
|   3 |    NESTED LOOPS          |      |  1000K|   635M|    13G  (2)|999:59:59 |
|   4 |     NESTED LOOPS         |      |  1000K|   529M|    10G  (2)|999:59:59 |
|   5 |      NESTED LOOPS        |      |  1000K|   423M|  8145M  (2)|999:59:59 |
|   6 |       NESTED LOOPS       |      |  1000K|   317M|  5430M  (2)|999:59:59 |
|   7 |        NESTED LOOPS      |      |  1000K|   211M|  2715M  (2)|999:59:59 |
|*  8 |         TABLE ACCESS FULL| T5   |  1000K|   105M|  2717   (2)| 00:00:39 |
|*  9 |         TABLE ACCESS FULL| T5   |     1 |   111 |  2715   (2)| 00:00:39 |
|* 10 |        TABLE ACCESS FULL | T5   |     1 |   111 |  2715   (2)| 00:00:39 |
|* 11 |       TABLE ACCESS FULL  | T5   |     1 |   111 |  2715   (2)| 00:00:39 |
|* 12 |      TABLE ACCESS FULL   | T5   |     1 |   111 |  2715   (2)| 00:00:39 |
|* 13 |     TABLE ACCESS FULL    | T5   |     1 |   111 |  2715   (2)| 00:00:39 |
|* 14 |    TABLE ACCESS FULL     | T5   |     1 |   111 |  2715   (2)| 00:00:39 |
|* 15 |   TABLE ACCESS FULL      | T5   |     1 |   111 |  2715   (2)| 00:00:39 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   8 - filter("T51"."C1">='A 0000000' AND "T51"."C1"='A 0000000' AND "T52"."C1"='A 0000000' AND "T53"."C1"='A 0000000' AND "T54"."C1"='A 0000000' AND "T55"."C1"='A 0000000' AND "T56"."C1"='A 0000000' AND "T57"."C1"='A 0000000' AND "T58"."C1"<='A 1000000' AND
              "T51"."C1"="T58"."C1")

I would like to ask a question instead on logical reads.
I re-run your tests twice, the first with ArraySize default (if I remember correctly is equal to 15) and the second ArraySize 200.

select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for IBM/AIX RISC System/6000: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

With arraysize 15 (default)


Elapsed: 00:14:41.71

Statistiche
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      74776  consistent gets
          0  physical reads
          0  redo size
   89664222  bytes sent via SQL*Net to client
     466899  bytes received via SQL*Net from client
      66668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000000  rows processed

With arraysize 200

Elapsed: 00:02:52.76

Statistiche
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      13118  consistent gets
          0  physical reads
          0  redo size
   86271849  bytes sent via SQL*Net to client
      35230  bytes received via SQL*Net from client
       5001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000000  rows processed

whereas in tests carried my table has the following features:….

select num_rows, blocks, trunc(num_rows/blocks) row_for_block 
from user_tables
where table_name = 'T5';

  NUM_ROWS     BLOCKS     ROW_FOR_BLOCK
---------- ---------- -------------
   1000000       8107           123

I do not get significant benefits by increasing further ArraySize(in my case 200), in fact if I set arraysize at a number greater than the number of rows stored in Each single table blocks, the number of logical reads is close to the number of the table's
blocks… right?

Thanks a lot

Regards

Donatello Settembrino

9 04 2010
Charles Hooper

Donatello,

Thanks for contributing your test case in this blog article. It is interesting that removing the unique contraint affects the execution plan.

My test showed that there was only about a 7 second time difference between the fetch array size of 200 and the fetch array size of 2000 (01:29.87 – 01:22.25) on Oracle 11.2.0.1. On Oracle 11.1.0.6 the difference was only about 2.5 seconds (01:24.48 – 01:21.96). I noticed that my times were a bit faster than were your times – I wonder if network performance might be a limiting factor that causes your execution time to level off with a fetch array size of 200. If the client computer was also the server computer (no network involved) it could be that you reached the server’s memory throughput/transfers per second threshold.

I am not sure what the correct answer is for your question. The answer might depend on the execution plan – whether there are nested loop joins, hash joins, etc. By reviewing my summarized test case output, the number of consistent gets perform continued to decrease as the array fetch size increased.

You might take a look at the comment made in this thread by Jonathan Lewis:

http://forums.oracle.com/forums/thread.jspa?threadID=973560&start=0&tstart=0

“If the query really is nothing but a UNION ALL of five tablescans, then what it the array fetch size ? If the array fetch size is large enough the number of block visits would be similar to the number of physical block reads. If the arraysize is one then the number of block visits could be close to the number of rows fetched as a block has to be released at the end of each fetch and the reacquired for the next fetch.”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 143 other followers

%d bloggers like this: