Will Enabling a 10046 Trace Make My Query Slower (or Faster)?

13 02 2012

February 13, 2012 (Modified February 14, 2012)

It seems that I often wonder about the things that I read… is it really true?  Several years ago, one of the items that I wondered about is whether or not an execution plan for a SQL statement could change magically simply because the Oracle RDBMS knows that it is being watched (a bit short-sighted, but stay with me for a couple of minutes).  When you first learn to read raw 10046 extended SQL trace files, you might also start to wonder whether or not enabling a 10046 trace could make the Oracle RDBMS suddenly start to behave differently.

Let’s take a quick look at a small portion of a 11.2.0.2 trace file:

PARSING IN CURSOR #371254768 len=53 dep=0 uid=62 oct=3 lid=62 tim=184435767875 hv=1334288504 ad='7ffb8c1be40' sqlid='a75xc757sg83s'
SELECT /*+ INDEX(T3) */
  *
FROM
  T3
WHERE
  C1<=200
END OF STMT
PARSE #371254768:c=0,e=699,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=216260220,tim=184435767874
EXEC #371254768:c=0,e=25,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=216260220,tim=184435767947
WAIT #371254768: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=0 tim=184435767976
WAIT #371254768: nam='db file scattered read' ela= 320 file#=4 block#=4145624 blocks=8 obj#=73036 tim=184435768358
WAIT #371254768: nam='db file scattered read' ela= 276 file#=4 block#=4145616 blocks=8 obj#=73035 tim=184435768704
FETCH #371254768:c=0,e=764,p=16,cr=2,cu=0,mis=0,r=1,dep=0,og=1,plh=216260220,tim=184435768757 

In the above, notice the mis=1 entry on the PARSE #371254768 line – that entry indicates a hard parse, a re-optimization of the SQL statement by the query optimizer.  If this SQL statement used bind variables, a feature known as bind variable peeking (introduced in Oracle 9.0.1) would take place during re-optimization, which might provide the optimizer with more information (with the help of a histogram, for instance) and potentially result in a change to the SQL statement’s execution plan.

(A partially related note: I just recently placed an order for the book “Oracle Database 11gR2 Performance Tuning Cookbook“, and while taking a quick peek at two or three pages in the book’s preview on Amazon, I found an error related to the above mentioned feature.)

Let’s take a look at a quick example of the above situation.  We will start by creating a table, inserting 100,000 rows where 20% of the rows have the value 0 in column C2, creating an index on column C2, and gathering statistics with histograms:

CREATE TABLE T5(
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 VARCHAR2(300) NOT NULL);

INSERT INTO
  T5
SELECT
  ROWNUM C1,
  DECODE(MOD(ROWNUM,5),0,0,ROWNUM) C2,
  RPAD('A',300,'A') C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

CREATE INDEX IND_T5_C2 ON T5(C2);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>NULL,TABNAME=>'T5',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL COLUMNS SIZE 254') 

Now for the first segment of the test script, where a single row will be selected from the table.  To save space, I will not show the output from the select of the table T5, but will show the DBMS_XPLAN output:

ALTER SESSION SET TRACEFILE_IDENTIFIER = '10046TraceFindMe';

SET LINESIZE 120
SET PAGESIZE 1000
VAR V2 NUMBER
EXEC :V2:=4

SELECT
  C1
FROM
  T5
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  cmm43s6usjbk1, child number 0
-------------------------------------
SELECT   C1 FROM   T5 WHERE   C2=:V2

Plan hash value: 3443248787

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2) 

An index range scan, that is reasonable.  Let’s change the value of the bind variable so that 20% of the rows from the table will be selected, and take a look at the execution plan:

EXEC :V2:=0

SELECT
  C1
FROM
  T5
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  cmm43s6usjbk1, child number 0
-------------------------------------
SELECT   C1 FROM   T5 WHERE   C2=:V2

Plan hash value: 3443248787

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2) 

Notice in the above that the child number is still 0 and that the execution plan remained the same, even though we requested the retrieval of 20% of the rows from table T5.  In Oracle Database 11.1 and above, adaptive cursor sharing might eventually step in to change the execution plan for this bind variable value to use a full table scan, but that change could not happen on the first execution.

Now, we will enable a 10046 trace, set the bind variable value to the second value listed above, and return the execution plan:

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

EXEC :V2:=0

SELECT
  C1
FROM
  T5
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  cmm43s6usjbk1, child number 1
-------------------------------------
SELECT   C1 FROM   T5 WHERE   C2=:V2

Plan hash value: 2002323537

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   358 (100)|          |
|*  1 |  TABLE ACCESS FULL| T5   | 19685 |   192K|   358   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2) 

Notice in the above output that while the SQL_ID has remained constant, the child number is now 1, and the execution plan changed to a full table scan.  Bind variable peeking and the histogram on column C2 provided the optimizer with enough information to result in an execution plan change.

Let’s switch back to the original bind variable value, which will result in a single row being returned:

SELECT
  C1
FROM
  T5
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  cmm43s6usjbk1, child number 1
-------------------------------------
SELECT   C1 FROM   T5 WHERE   C2=:V2

Plan hash value: 2002323537

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   358 (100)|          |
|*  1 |  TABLE ACCESS FULL| T5   | 19685 |   192K|   358   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2) 

Note that child cursor 1 is still used – a full table scan where an index range scan was used previous to enabling a 10046 trace.  So, while we have not examined execution times, we have now seen cases where a query could potentially execute faster or slower due to a change in the execution plan after a 10046 extended SQL trace is enabled.

Let’s try one more time with the 10046 trace still enabled:

SELECT
  C1
FROM
  T5
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL')); 

SQL_ID  cmm43s6usjbk1, child number 2
-------------------------------------
SELECT   C1 FROM   T5 WHERE   C2=:V2

Plan hash value: 3443248787

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Now the child cursor number is displayed as 2, and the execution plan returned to an index range scan.  Adaptive cursor sharing kicked in and caused a re-optimization of the SQL statement.

Let’s disable the 10046 extended SQL trace:

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

A quick look at a small (marked up) portion of the raw 10046 extended SQL trace file – the first execution with the 10046 trace enabled:

PARSING IN CURSOR #429376080 len=20 dep=0 uid=62 oct=47 lid=62 tim=11461702777 hv=1269726367 ad='7ffb76adba8' sqlid='5dxda7t5uwz4z'
BEGIN :V2:=0; END;
END OF STMT
PARSE #429376080:c=0,e=56,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=11461702777
WAIT #429376080: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11461702853
EXEC #429376080:c=0,e=53,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=11461702867
WAIT #429376080: nam='SQL*Net message from client' ela= 1401 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11461704296
CLOSE #429376080:c=0,e=12,dep=0,type=0,tim=11461704325
=====================
PARSING IN CURSOR #429376080 len=36 dep=0 uid=62 oct=3 lid=62 tim=11461704495 hv=3045633601 ad='7ffb77363b0' sqlid='cmm43s6usjbk1'
SELECT
  C1
FROM
  T5
WHERE
  C2=:V2
END OF STMT
PARSE #429376080:c=0,e=152,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=11461704494   <--- Hard Parse
EXEC #429376080:c=0,e=784,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=2002323537,tim=11461705325
WAIT #429376080: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11461705357
FETCH #429376080:c=0,e=35,p=0,cr=3,cu=0,mis=0,r=1,dep=0,og=1,plh=2002323537,tim=11461705408
...
WAIT #429376080: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11463556616
FETCH #429376080:c=0,e=21,p=0,cr=1,cu=0,mis=0,r=4,dep=0,og=1,plh=2002323537,tim=11463556630
STAT #429376080 id=1 cnt=20000 pid=0 pos=1 obj=73243 op='TABLE ACCESS FULL T5 (cr=5813 pr=0 pw=0 time=15413 us cost=358 size=196850 card=19685)'
... 

In the above, notice the mis=1 entry on the PARSE #429376080 line – the optimizer hard parsed the SQL statement.  Moving down a bit in the trace file, the second execution with the changed bind variable value:

PARSING IN CURSOR #429376080 len=20 dep=0 uid=62 oct=47 lid=62 tim=11505336665 hv=1039981032 ad='7ffb873a408' sqlid='857u964yztqg8'
BEGIN :V2:=4; END;
END OF STMT
PARSE #429376080:c=0,e=93,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=11505336664
WAIT #429376080: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11505336754
EXEC #429376080:c=0,e=63,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=11505336769
WAIT #429376080: nam='SQL*Net message from client' ela= 2382 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11505339191
CLOSE #429376080:c=0,e=13,dep=0,type=0,tim=11505339229
=====================
PARSING IN CURSOR #429376080 len=36 dep=0 uid=62 oct=3 lid=62 tim=11505339288 hv=3045633601 ad='7ffb77363b0' sqlid='cmm43s6usjbk1'
SELECT
  C1
FROM
  T5
WHERE
  C2=:V2
END OF STMT
PARSE #429376080:c=0,e=39,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2002323537,tim=11505339288   <--- NO Hard Parse
EXEC #429376080:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2002323537,tim=11505339355
WAIT #429376080: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=11505339381
FETCH #429376080:c=0,e=296,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2002323537,tim=11505339693
... 

In the above, notice the mis=0 entry on the PARSE #429376080 line – the optimizer did NOT re-optimize this SQL statement, so the execution plan remained a full table scan.  Now the final section that was influenced by adaptive cursor sharing:

PARSING IN CURSOR #429376080 len=36 dep=0 uid=62 oct=3 lid=62 tim=12566538414 hv=3045633601 ad='7ffb77363b0' sqlid='cmm43s6usjbk1'
SELECT
  C1
FROM
  T5
WHERE
  C2=:V2
END OF STMT
PARSE #429376080:c=0,e=55,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2002323537,tim=12566538414   <--- NO Hard Parse
EXEC #429376080:c=0,e=1594,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=3443248787,tim=12566540065  <--- Hard Parse
WAIT #429376080: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=532 tim=12566540198
FETCH #429376080:c=0,e=29,p=0,cr=3,cu=0,mis=0,r=1,dep=0,og=1,plh=3443248787,tim=12566540245 

In the above, notice the mis=0 entry on the PARSE #429376080 line – the optimizer did NOT re-optimize this SQL statement during the parse phase.  However, remember that the cursor number changed from the previous value of 1 to 2; the above EXEC #429376080 contains mis=1, so a hard parse was performed during the execution phase.

You might be thinking that maybe the execution plans can only change if we use bind variables when histograms are present.  Let’s try another experiment.  First, we will create a table, create an index on that table, insert 100,000 rows with the same characteristics as those found in table T5, and NOT collect statistics:

CREATE TABLE T6 (
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 VARCHAR2(300) NOT NULL);

CREATE INDEX IND_T6_C2 ON T6(C2);

INSERT INTO
  T6
SELECT
  ROWNUM C1,
  DECODE(MOD(ROWNUM,5),0,0,ROWNUM) C2,
  RPAD('A',300,'A') C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT; 

Let’s repeat the previous test case:

SET LINESIZE 120
SET PAGESIZE 1000
VAR V2 NUMBER
EXEC :V2:=4

SELECT
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  9krhnbpjpvpwh, child number 0
-------------------------------------
SELECT   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 4242288932

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     1 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2) 

The above shows that dynamic sampling was used during the hard parse, and as expected an index range scan was performed.  Part 2 of the previous test script:

EXEC :V2:=0

SELECT
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  9krhnbpjpvpwh, child number 0
-------------------------------------
SELECT   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 4242288932

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     1 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2) 

Again, nothing new – no hard parse when the bind variable was changed, and so the index was still used.  Part 3 of the previous test script:

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

EXEC :V2:=0

SELECT
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  9krhnbpjpvpwh, child number 1
-------------------------------------
SELECT   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 4242288932

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |   142 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        | 34140 |   866K|   142   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 | 34140 |       |   142   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2) 

There was a hard parse, as is noted by the change in the child cursor number, yet the optimizer still decided that it was most efficient to perform an index range scan.  The optimizer estimated that 34,140 rows would be returned with an estimated size of 866KB.  You might recall that earlier the same test for table T5 showed the following execution plan:

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   358 (100)|          |
|*  1 |  TABLE ACCESS FULL| T5   | 19685 |   192K|   358   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2) 

With a histogram (and without dynamic sampling), the optimizer estimated that 19,685 rows would be returned with an estimated size of 192KB – and in that case a full table scan was performed.  But why was a full table scan not performed when experimenting with dynamic sampling?  Is it possible that if the query optimizer uses dynamic sampling, that the execution plan is NOT subject to change when a 10046 extended SQL trace is enabled, or is there a simplier cause?

Let’s try executing the query with a FULL hint:

SELECT /*+ FULL(T6) */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  bhm179jq4vvca, child number 0
-------------------------------------
SELECT /*+ FULL(T6) */   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 1930642322

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   357 (100)|          |
|*  1 |  TABLE ACCESS FULL| T6   | 34140 |   866K|   357   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2) 

Well, that explains part of the issue.  While the calculated cost of the full table scan access path is 357 (very close to the 358 calculated for the same SQL statement that accessed table T5), the calculated cost of the index access path was only 142 – the lowest calculated cost access path wins.

What if we force an index access path for table T5 with the same value of 0 for the bind variable.  Will the calculated cost also be roughly 142?

SELECT /*+ INDEX(T5) */
  C1
FROM
  T5
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  5y34jm13up8xh, child number 0
-------------------------------------
SELECT /*+ INDEX(T5) */   C1 FROM   T5 WHERE   C2=:V2

Plan hash value: 3443248787

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |  1826 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        | 19685 |   192K|  1826   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 | 19685 |       |    43   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)  

Obviously, no – otherwise the index access path would have been selected for table T5 also – as shown above, the cost is actually 1826.

—-

The challenge (drifting a bit from the topic of this article): If dynamic sampling is used, and the optimizer knows that a many (possibly 34%) of the rows will be returned when the query is executed, why is an index access path used?  In other words, what causes the index access path to be calculated so cheaply when compared to the calculated cost for the same access path with table T5?  Is the above simply a bug found in Oracle Database 11.2.0.2, or is there a better explanation?

-

Added February 14, 2012:

One person (Narendra) noticed that there appeared to be a bug in the optimizer’s calculations.  If you closely examine the execution plans above, you just might notice that the Cost displayed on the INDEX RANGE SCAN operation’s line of the execution plan (142) is identical to the Cost displayed on its the parent operation TABLE ACCESS BY INDEX ROWID (142).  As is probably well known, a parent operation’s Cost, Buffers, Reads, and various other columns display their child statistics summed with the statistics that are directly associated with the parent operation.  Take another look at a portion of one of the earlier execution plans:

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |   142 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        | 34140 |   866K|   142   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 | 34140 |       |   142   (0)| 00:00:01 |
----------------------------------------------------------------------------------------- 

The cost displayed for the TABLE ACCESS BY INDEX ROWID operation should be greater than the cost displayed for the index access – unless the optimizer forgot that the table must be accessed when dynamic sampling is performed.  Oracle Database 11.2.0.1 and 11.2.0.2 appear to be affected by this particular bug, while Oracle Database 10.2.0.5 and 11.1.0.7 seem to under-estimate the cost of the table access when dynamic sampling is performed.  As such, if statistics on some of the tables (for instance temp tables) are locked without any statistics set on the tables (to force dynamic sampling), there could be a minor issue that becomes a more significant issue when upgrading to 11.2.0.1 or 11.2.0.2.

Recreating the same test table as earlier:

DROP TABLE T6 PURGE;

CREATE TABLE T6 (
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 VARCHAR2(300) NOT NULL);

CREATE INDEX IND_T6_C2 ON T6(C2);

INSERT INTO
  T6
SELECT
  ROWNUM C1,
  DECODE(MOD(ROWNUM,5),0,0,ROWNUM) C2,
  RPAD('A',300,'A') C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT; 

The first portion of the test:

ALTER SESSION SET TRACEFILE_IDENTIFIER = '10053NO10046TraceFindMe';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SET LINESIZE 120
SET PAGESIZE 1000
VAR V2 NUMBER
EXEC :V2:=4

SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 4242288932

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     1 (100)|      1 |00:00:00.01 |       4 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |      1 |     1   (0)|      1 |00:00:00.01 |       4 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |      1 |     1   (0)|      1 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 4

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2)

The second portion of the test:

EXEC :V2:=0

SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 4242288932

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     1 (100)|  20000 |00:00:00.06 |    6981 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |      1 |     1   (0)|  20000 |00:00:00.06 |    6981 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |      1 |     1   (0)|  20000 |00:00:00.03 |    1410 |
----------------------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 4

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2) 

Notice in the above that the execution plan has not changed, and the peeked bind variable value is still 4 (this is expected).

The third portion of the test:

ALTER SESSION SET TRACEFILE_IDENTIFIER = '1005310046TraceFindMe';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

EXEC :V2:=0

SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

SQL_ID  bxnurhc1nmq2g, child number 1
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 4242288932

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |    49 (100)|  20000 |00:00:00.06 |    6944 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  18399 |    49   (0)|  20000 |00:00:00.06 |    6944 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |  18399 |    49   (0)|  20000 |00:00:00.03 |    1373 |
----------------------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2) 

Notice in the above that the Cost displayed for the INDEX RANGE SCAN operation is identical to the Cost displayed for the TABLE ACCESS BY INDEX ROWID – also notice the low estimated Cost for the index access (this might be a reasonable value, as will be shown below).

The fourth portion of the test, checking the cost of the full table scan access path:

SELECT /*+ FULL(T6) GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

Plan hash value: 1930642322

-------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |  1579 (100)|  20000 |00:00:00.05 |    5815 |
|*  1 |  TABLE ACCESS FULL| T6   |      1 |  18399 |  1579   (5)|  20000 |00:00:00.05 |    5815 |
-------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2)

Note
-----
   - dynamic sampling used for this statement (level=2)

So, the calculated cost of the full table scan access path is quite high compared to the cost of the index access path.

Let’s try an undocumented OPT_ESTIMATE hint to see what happens:

SELECT /*+ OPT_ESTIMATE(TABLE, T6, ROWS=20000) GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

Plan hash value: 4242288932

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     1 (100)|  20000 |00:00:00.06 |    6944 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  20000 |     1   (0)|  20000 |00:00:00.06 |    6944 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |      1 |     1   (0)|  20000 |00:00:00.02 |    1373 |
----------------------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Now the calculated Cost of the index access path is just 1 – bad idea for a solution.

Let’s collect statistics on the table with a histogram on column C2 to see what happens:

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>NULL,TABNAME=>'T6',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR COLUMNS C2 SIZE 254',NO_INVALIDATE=>FALSE)

SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2

Plan hash value: 1930642322

-------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |  1582 (100)|  20000 |00:00:00.05 |    5814 |
|*  1 |  TABLE ACCESS FULL| T6   |      1 |  19685 |  1582   (5)|  20000 |00:00:00.05 |    5814 |
-------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2)

The optimizer now selected to use a full table scan, rather than an index access path.

Let’s check the cost of the index access path:

SELECT /*+ INDEX(T6) GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));

SQL_ID  4nqwgus0ghxmc, child number 0
-------------------------------------
SELECT /*+ INDEX(T6) GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE
C2=:V2

Plan hash value: 4242288932

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |  1847 (100)|  20000 |00:00:00.07 |    6944 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  19685 |  1847   (2)|  20000 |00:00:00.07 |    6944 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |  19685 |    44  (12)|  20000 |00:00:00.02 |    1373 |
----------------------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

-

The results of the test from 10.2.0.5:

The first portion of the test:

ALTER SESSION SET TRACEFILE_IDENTIFIER = '10053NO10046TraceFindMe';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
 
SET LINESIZE 120
SET PAGESIZE 1000
VAR V2 NUMBER
EXEC :V2:=4
 
SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     2 (100)|      1 |00:00:00.01 |       4 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |      1 |     2   (0)|      1 |00:00:00.01 |       4 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |      1 |     1   (0)|      1 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 4
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)
 
Note
-----
   - dynamic sampling used for this statement

Notice in the above output that the Cost displayed for the TABLE ACCESS BY INDEX ROWID operation is 2, while on 11.2.0.2 it was 1.

The second portion of the test script on 10.2.0.5:

EXEC :V2:=0
  
SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     2 (100)|  20000 |00:00:00.04 |    6969 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |      1 |     2   (0)|  20000 |00:00:00.04 |    6969 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |      1 |     1   (0)|  20000 |00:00:00.01 |    1407 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 4
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)
 
Note
-----
   - dynamic sampling used for this statement

No change in the execution plan, as expected.

The third portion of the script on 10.2.0.5:

ALTER SESSION SET TRACEFILE_IDENTIFIER = '1005310046TraceFindMe';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
  
EXEC :V2:=0
 
SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
SQL_ID  bxnurhc1nmq2g, child number 1
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |   209 (100)|  20000 |00:00:00.02 |    6932 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  20556 |   209   (0)|  20000 |00:00:00.02 |    6932 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |  20556 |    49   (0)|  20000 |00:00:00.01 |    1370 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)
 
Note
-----
   - dynamic sampling used for this statement

Notice in the output above the that while the cost of the TABLE ACCESS BY INDEX ROWID operation is significantly higher than the INDEX RANGE SCAN child operation, the Cost was not high enough to result in a full table scan.

Checking the calculated cost of a full table scan:

SELECT /*+ FULL(T6) GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
SQL_ID  bk8rwkf0ab8h5, child number 0
-------------------------------------
SELECT /*+ FULL(T6) GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 1930642322
 
-------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |   358 (100)|  20000 |00:00:00.02 |    5815 |
|*  1 |  TABLE ACCESS FULL| T6   |      1 |  20556 |   358   (1)|  20000 |00:00:00.02 |    5815 |
-------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2)
 
Note
-----
   - dynamic sampling used for this statement 

358 is greater than 209, so no full table scan without a hint.

OPT_ESTIMATE hint test:

SELECT /*+ OPT_ESTIMATE(TABLE, T6, ROWS=20000) GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST')); 
 
SQL_ID  bud5avdhnmssf, child number 0
-------------------------------------
SELECT /*+ OPT_ESTIMATE(TABLE, T6, ROWS=20000) GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     1 (100)|  20000 |00:00:00.04 |    6932 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  20000 |     1   (0)|  20000 |00:00:00.04 |    6932 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |      1 |     1   (0)|  20000 |00:00:00.01 |    1370 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

Gather statistics on 10.2.0.5 with a histogram on column C2, then repeat the test:

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>NULL,TABNAME=>'T6',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR COLUMNS C2 SIZE 254',NO_INVALIDATE=>FALSE) 
 
SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 1930642322
 
-------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |   358 (100)|  20000 |00:00:00.02 |    5815 |
|*  1 |  TABLE ACCESS FULL| T6   |      1 |  19685 |   358   (1)|  20000 |00:00:00.02 |    5815 |
-------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"=:V2)

The optimizer determined that a full table scan would be most efficient.

Let’s check the cost of the index access path on 10.2.0.5:

SELECT /*+ INDEX(T6) GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
SQL_ID  4nqwgus0ghxmc, child number 0
-------------------------------------
SELECT /*+ INDEX(T6) GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |  1822 (100)|  20000 |00:00:00.02 |    6932 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  19685 |  1822   (1)|  20000 |00:00:00.02 |    6932 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |  19685 |    39   (0)|  20000 |00:00:00.01 |    1370 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)

The Cost of 1822 is greater than the Cost of 358 for the full table scan access path, so that explains why the full table scan access path was selected.


Actions

Information

11 responses

14 02 2012
Narendra

Charles,

I must admit my initial (knee-jerk) response was this looks like a bug (or limitation) in optimizer calculation. Then I decided to try it out. I generated a 10053 trace to see what is going on. Well, there were clues there but I (again) jumped to incorrect conclusion. I thought maybe this is because CREATE INDEX statement has generated statistics for the index (although those would be stale after the data population) and hence dynamic sampling (at level 2) did not kick-in for the index and was used only for table access. Then, just to be sure, I delete index statistics and re-run the test. But, to my surprise, the results did not change. At this stage, I had another look at the 10053 trace generated earlier (when index stats were present) and I found that CBO actually executed 2 queries as a result of dynamic sampling, one forcing the full table scan and another forcing the index scan. I am not sure about how it derived the index selectivity (in my case 0.203927) but I guess that is the reason index access is being costed as cheaper compared to full table scan.
So I am back to my first response….it is the bug (or limitation) of the optimizer.

p.s. BTW, I ran those tests on “Oracle Developer Days” virtual machine DB and my “row” and “bytes” figures are quite different to your figures for index access path.

14 02 2012
Charles Hooper

Narendra,

I added some additional information at the end of the blog article. The problem appears to be two-fold:
* The optimizer is not calculating the cost of the table access as part of the index access path.
* While the optimizer’s estimates are close, it is calculating a very small index access cost when dynamic sampling is used.

The second dynamic sampling query appears to be attempting to determine the number of NULL values, even though all of the columns are declared as NOT NULL:
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS opt_param(‘parallel_execution_enabled’, ‘false’) NO_PARALLEL(SAMPLESUB) NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0) FROM (SELECT /*+ NO_PARALLEL(“T6″) INDEX(“T6″ IND_T6_C2) NO_PARALLEL_INDEX(“T6″) */ 1 AS C1, 1 AS C2, 1 AS C3 FROM “TESTUSER”.”T6″ “T6″ WHERE “T6″.”C2″=:B1 AND ROWNUM <= 2500) SAMPLESUB

The full dynamic sampling section from my 11.2.0.2 (most recent) 10053 trace:

** Performing dynamic sampling initial checks. **
  Column (#2): C2(  NO STATISTICS (using defaults)
    AvgLen: 13 NDV: 3 Nulls: 0 Density: 0.390244
** Dynamic sampling initial checks returning TRUE (level = 2).
** Dynamic sampling updated index stats.: IND_T6_C2, blocks=244
** Dynamic sampling index access candidate : IND_T6_C2
** Dynamic sampling updated table stats.: blocks=4528

*** 2012-02-14 17:41:26.512
** Generated dynamic sampling query:
    query text : 
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("T6") FULL("T6") NO_PARALLEL_INDEX("T6") */ 1 AS C1, CASE WHEN "T6"."C2"=:B1 THEN 1 ELSE 0 END AS C2 FROM "TESTUSER"."T6" SAMPLE BLOCK (1.391343 , 1) SEED (1) "T6") SAMPLESUB
=====================
PARSING IN CURSOR #388723216 len=429 dep=1 uid=64 oct=3 lid=64 tim=189525484065 hv=2741378272 ad='3ead44e70' sqlid='2m4nm7fjqc770'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("T6") FULL("T6") NO_PARALLEL_INDEX("T6") */ 1 AS C1, CASE WHEN "T6"."C2"=:B1 THEN 1 ELSE 0 END AS C2 FROM "TESTUSER"."T6" SAMPLE BLOCK (1.391343 , 1) SEED (1) "T6") SAMPLESUB
END OF STMT
PARSE #388723216:c=0,e=421,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=189525484064
EXEC #388723216:c=0,e=712,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=2403212413,tim=189525484863
FETCH #388723216:c=0,e=476,p=0,cr=66,cu=0,mis=0,r=1,dep=1,og=1,plh=2403212413,tim=189525485360
STAT #388723216 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=66 pr=0 pw=0 time=475 us)'
STAT #388723216 id=2 cnt=1282 pid=1 pos=1 obj=102954 op='TABLE ACCESS SAMPLE T6 (cr=66 pr=0 pw=0 time=612 us cost=2 size=25 card=1)'

*** 2012-02-14 17:41:26.512
** Executed dynamic sampling query:
    level : 2
    sample pct. : 1.391343
    actual sample size : 1282
    filtered sample card. : 256
    orig. card. : 82
    block cnt. table stat. : 4528
    block cnt. for sampling: 4528
    max. sample block cnt. : 64
    sample block cnt. : 63
    min. sel. est. : 0.01000000
CLOSE #388723216:c=0,e=8,dep=1,type=0,tim=189525485505
** Using recursive dynamic sampling card. est. : 92141.206349

*** 2012-02-14 17:41:26.512
** Generated dynamic sampling query:
    query text : 
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS opt_param('parallel_execution_enabled', 'false') NO_PARALLEL(SAMPLESUB) NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0) FROM (SELECT /*+ NO_PARALLEL("T6") INDEX("T6" IND_T6_C2) NO_PARALLEL_INDEX("T6") */ 1 AS C1, 1 AS C2, 1 AS C3  FROM "TESTUSER"."T6" "T6" WHERE "T6"."C2"=:B1 AND ROWNUM <= 2500) SAMPLESUB
=====================
PARSING IN CURSOR #388723216 len=388 dep=1 uid=64 oct=3 lid=64 tim=189525485910 hv=1703431641 ad='3ed6b8ed0' sqlid='5c3hsnpkshmft'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS opt_param('parallel_execution_enabled', 'false') NO_PARALLEL(SAMPLESUB) NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0) FROM (SELECT /*+ NO_PARALLEL("T6") INDEX("T6" IND_T6_C2) NO_PARALLEL_INDEX("T6") */ 1 AS C1, 1 AS C2, 1 AS C3  FROM "TESTUSER"."T6" "T6" WHERE "T6"."C2"=:B1 AND ROWNUM <= 2500) SAMPLESUB
END OF STMT
PARSE #388723216:c=0,e=340,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=189525485909
EXEC #388723216:c=0,e=714,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=3369105306,tim=189525486686
FETCH #388723216:c=0,e=777,p=0,cr=7,cu=0,mis=0,r=1,dep=1,og=1,plh=3369105306,tim=189525487482
STAT #388723216 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=7 pr=0 pw=0 time=781 us)'
STAT #388723216 id=2 cnt=2500 pid=1 pos=1 obj=0 op='VIEW  (cr=7 pr=0 pw=0 time=517 us cost=1 size=9 card=1)'
STAT #388723216 id=3 cnt=2500 pid=2 pos=1 obj=0 op='COUNT STOPKEY (cr=7 pr=0 pw=0 time=516 us)'
STAT #388723216 id=4 cnt=2500 pid=3 pos=1 obj=102955 op='INDEX RANGE SCAN IND_T6_C2 (cr=7 pr=0 pw=0 time=390 us cost=1 size=13 card=1)'

*** 2012-02-14 17:41:26.512
** Executed dynamic sampling query:
    level : 2
    sample pct. : 100.000000
    actual sample size : 92141
    filtered sample card. : 2500
    filtered sample card. (index IND_T6_C2): 2500
    orig. card. : 92141
    block cnt. table stat. : 4528
    block cnt. for sampling: 4528
    max. sample block cnt. : 4294967295
    sample block cnt. : 4528
    min. sel. est. : 0.01000000
** Increasing dynamic sampling selectivity
   for predicate 0 from 0.027132 to 0.199688.
** Increasing dynamic sampling selectivity
   for predicate 1 from 0.027132 to 0.199688.
    index IND_T6_C2 selectivity est.: 0.19968799
CLOSE #388723216:c=0,e=7,dep=1,type=0,tim=189525487659
** Using dynamic sampling card. : 92141
** Dynamic sampling updated table card.
** Using single table dynamic sel. est. : 0.19968799
  Table: T6  Alias: T6
    Card: Original: 92141.206349  Rounded: 18399  Computed: 18399.49  Non Adjusted: 18399.49
  Access Path: TableScan
    Cost:  1579.31  Resp: 1579.31  Degree: 0
      Cost_io: 1511.00  Cost_cpu: 52517081
      Resp_io: 1511.00  Resp_cpu: 52517081
  Access Path: index (AllEqRange)
    Index: IND_T6_C2
    resc_io: 49.00  resc_cpu: 349151
    ix_sel: 0.199688  ix_sel_with_filters: 0.199688 
    Cost: 49.45  Resp: 49.45  Degree: 1
  Best:: AccessPath: IndexRange
  Index: IND_T6_C2
         Cost: 49.45  Degree: 1  Resp: 49.45  Card: 18399.49  Bytes: 0

I am still trying to work through the above information.
The second STAT line for the first dynamic sampling query begins like this:

STAT #388723216 id=2 cnt=1282

That probably explains this entry:

actual sample size : 1282

In the lines at the start of the dynamic sampling section is this entry:

** Dynamic sampling updated table stats.: blocks=4528

It appears that the above is carried through to these entries:

block cnt. table stat. : 4528
block cnt. for sampling: 4528

It is getting late here (and I am not making too much sense out of this 10053 trace), and I am having a bit of difficulty seeing where the 256 number (filtered sample card. : 256) is obtained, but:
256 / 1282 = 0.19968799
That is the same index selectivity that appears later:

index IND_T6_C2 selectivity est.: 0.19968799
16 02 2012
Randolf Geist

Hi Narendra,

> Then, just to be sure, I delete index statistics and re-run the test. But, to my surprise, the results did not change

I couldn’t reproduce this: When I delete the index stats I get an execution plan where the table access is costed. You still might end up with an index range scan since the optimizer doesn’t have a clue about the clustering factor of the index (this information is not determined by Dynamic Sampling), but the plan is definitely showing more realistic cost estimates and considers the table access.

So the optimizer seems to be confused by the existing index statistics and in this case simply omits the table access costs from the calculation, clearly a bug I think.

Reproduced on 11.2.0.2 and 11.2.0.3. Interestingly with index statistics deleted the table aceess is costed much higher when using 11.2.0.3 optimizer features which can lead to a plan change – and it did in my case switch to a full table scan whereas 11.2.0.2 still preferred the index range scan plan.

By the way: I believe the purpose of the additional query performed by Dynamic Sampling in case a suitable index is taking advantage of a very cheap way to get improved cardinality estimates for rare values. Therefore the query is hard-coded to be limited to 2,500 rows – if the query returns 2,500 rows, its result will be ignored. If it returns less than 2,500 rows then Oracle has a very precise cardinality / selectivity estimate by using the index. Without a suitable index the Dynamic Sampling code will refuse the result when the sample query returns 0 or only a few hits for the predicate assuming that the sample is not representative (too small sample size). Only with very high levels / sample block counts these results will not be rejected. Having an index in place therefore improves the Dynamic Sampling results when using (default) small samples.

Randolf

16 02 2012
Randolf Geist

> So the optimizer seems to be confused by the existing index statistics and in this case simply omits the table access costs from the calculation, clearly a bug I think.

I just thought about this again, and the clustering factor of the index is 0 in this case, so according to the formula for index range scan costs the table access part will be calculated as 0… So probably not a bug, but simply applying the general arithmetic rules.

Randolf

16 02 2012
Charles Hooper

Randolf,

Thanks for the comments – always helpful. I still on occasion have a bit of difficulty making sense of portions of 10053 trace files (and working with dynamic sampling).

The comment about the CLUSTERING_FACTOR seems to be relevant. This is from the 11.2.0.2 10053 trace file:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T6  Alias: T6  (NOT ANALYZED)
    #Rows: 82  #Blks:  1  AvgRowLen:  100.00  ChainCnt:  0.00
Index Stats::
  Index: IND_T6_C2  Col#: 2
    LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
Access path analysis for T6

This is the same section from the 10.2.0.5 trace file when executing exactly the same script:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T6  Alias: T6  (NOT ANALYZED)
    #Rows: 82  #Blks:  1  AvgRowLen:  100.00
Index Stats::
  Index: IND_T6_C2  Col#: 2    (NOT ANALYZED)
    LVLS: 1  #LB: 25  #DK: 100  LB/K: 1.00  DB/K: 1.00  CLUF: 800.00
***************************************

As I mentioned in the added section at the end of the blog article, 10.2.0.5 and 11.1.0.7 seem to be less negatively affected in this test case due to dynamic sampling. As shown above, 11.2.0.2 shows a CLUSTERING_FACTOR of 0 for the index, while 10.2.0.5 shows a CLUSTING_FACTOR of 800. The other index statistics on 11.2.0.2 are also displayed as 0, while that is not the case on 10.2.0.5.

After statistics collection on 11.2.0.2:

***********************
Table Stats::
  Table: T6  Alias: T6
    #Rows: 100000  #Blks:  4528  AvgRowLen:  310.00  ChainCnt:  0.00
Index Stats::
  Index: IND_T6_C2  Col#: 2
    LVLS: 1  #LB: 198  #DK: 80001  LB/K: 1.00  DB/K: 1.00  CLUF: 9052.00
Access path analysis for T6
***************************************

After statistics collection on 10.2.0.5:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T6  Alias: T6
    #Rows: 100000  #Blks:  4528  AvgRowLen:  310.00
Index Stats::
  Index: IND_T6_C2  Col#: 2
    LVLS: 1  #LB: 198  #DK: 80001  LB/K: 1.00  DB/K: 1.00  CLUF: 9052.00
***************************************

For completeness, this is the dynamic sampling section from the 10.2.0.5 10053 trace file:

** Performing dynamic sampling initial checks. **
  Column (#2): C2(NUMBER)  NO STATISTICS (using defaults)
    AvgLen: 13.00 NDV: 3 Nulls: 0 Density: 0.39024
** Dynamic sampling initial checks returning TRUE (level = 2).
** Dynamic sampling updated index stats.: IND_T6_C2, blocks=244
** Dynamic sampling index access candidate : IND_T6_C2
** Dynamic sampling updated table stats.: blocks=4528
*** 2012-02-14 18:19:16.531
** Generated dynamic sampling query:
    query text : 
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("T6") FULL("T6") NO_PARALLEL_INDEX("T6") */ 1 AS C1, CASE WHEN "T6"."C2"=:B1 THEN 1 ELSE 0 END AS C2 FROM "T6" SAMPLE BLOCK (1.391343 , 1) SEED (1) "T6") SAMPLESUB
=====================
PARSING IN CURSOR #19 len=418 dep=1 uid=164 oct=3 lid=164 tim=368717503 hv=4142090939 ad='fc89f0a0'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("T6") FULL("T6") NO_PARALLEL_INDEX("T6") */ 1 AS C1, CASE WHEN "T6"."C2"=:B1 THEN 1 ELSE 0 END AS C2 FROM "T6" SAMPLE BLOCK (1.391343 , 1) SEED (1) "T6") SAMPLESUB
END OF STMT
PARSE #19:c=0,e=340,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,tim=368717502
EXEC #19:c=0,e=845,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,tim=368718399
FETCH #19:c=0,e=560,p=0,cr=73,cu=0,mis=0,r=1,dep=1,og=1,tim=368718979
*** 2012-02-14 18:19:16.533
** Executed dynamic sampling query:
    level : 2
    sample pct. : 1.391343
    actual sample size : 1438
    filtered sample card. : 286
    orig. card. : 82
    block cnt. table stat. : 4528
    block cnt. for sampling: 4528
    max. sample block cnt. : 64
    sample block cnt. : 63
    min. sel. est. : 0.01000000
STAT #19 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=73 pr=0 pw=0 time=562 us)'
STAT #19 id=2 cnt=1438 pid=1 pos=1 obj=52702 op='TABLE ACCESS SAMPLE T6 (cr=73 pr=0 pw=0 time=33 us)'
** Using recursive dynamic sampling card. est. : 103353.396825
*** 2012-02-14 18:19:16.549
** Generated dynamic sampling query:
    query text : 
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS opt_param('parallel_execution_enabled', 'false') NO_PARALLEL(SAMPLESUB) NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0) FROM (SELECT /*+ NO_PARALLEL("T6") INDEX("T6" IND_T6_C2) NO_PARALLEL_INDEX("T6") */ 1 AS C1, 1 AS C2, 1 AS C3  FROM "T6" "T6" WHERE "T6"."C2"=:B1 AND ROWNUM <= 2500) SAMPLESUB
=====================
PARSING IN CURSOR #18 len=377 dep=1 uid=164 oct=3 lid=164 tim=368735909 hv=102564777 ad='fc89e868'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS opt_param('parallel_execution_enabled', 'false') NO_PARALLEL(SAMPLESUB) NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0), NVL(SUM(C3),0) FROM (SELECT /*+ NO_PARALLEL("T6") INDEX("T6" IND_T6_C2) NO_PARALLEL_INDEX("T6") */ 1 AS C1, 1 AS C2, 1 AS C3  FROM "T6" "T6" WHERE "T6"."C2"=:B1 AND ROWNUM <= 2500) SAMPLESUB
END OF STMT
PARSE #18:c=0,e=273,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,tim=368735908
EXEC #18:c=0,e=548,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,tim=368736505
FETCH #18:c=0,e=794,p=0,cr=7,cu=0,mis=0,r=1,dep=1,og=1,tim=368737317
*** 2012-02-14 18:19:16.551
** Executed dynamic sampling query:
    level : 2
    sample pct. : 100.000000
    actual sample size : 103353
    filtered sample card. : 2500
    filtered sample card. (index IND_T6_C2): 2500
    orig. card. : 103353
    block cnt. table stat. : 4528
    block cnt. for sampling: 4528
    max. sample block cnt. : 4294967295
    sample block cnt. : 4528
    min. sel. est. : 0.01000000
** Increasing dynamic sampling selectivity
   for predicate 0 from 0.024189 to 0.198887.
** Increasing dynamic sampling selectivity
   for predicate 1 from 0.024189 to 0.198887.
    index IND_T6_C2 selectivity est.: 0.19888734
STAT #18 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=7 pr=0 pw=0 time=797 us)'
STAT #18 id=2 cnt=2500 pid=1 pos=1 obj=0 op='VIEW  (cr=7 pr=0 pw=0 time=15 us)'
STAT #18 id=3 cnt=2500 pid=2 pos=1 obj=0 op='COUNT STOPKEY (cr=7 pr=0 pw=0 time=15 us)'
STAT #18 id=4 cnt=2500 pid=3 pos=1 obj=52703 op='INDEX RANGE SCAN IND_T6_C2 (cr=7 pr=0 pw=0 time=14 us)'
** Using dynamic sampling card. : 103353
** Dynamic sampling updated table card.
** Using single table dynamic sel. est. : 0.19888734
  Table: T6  Alias: T6     
    Card: Original: 103353  Rounded: 20556  Computed: 20555.68  Non Adjusted: 20555.68
  -----------------------------------------
  END   Single Table Cardinality Estimation
  -----------------------------------------
  Access Path: TableScan
    Cost:  357.58  Resp: 357.58  Degree: 0
      Cost_io: 355.00  Cost_cpu: 54983730
      Resp_io: 355.00  Resp_cpu: 54983730
  Access Path: index (AllEqRange)
    Index: IND_T6_C2
    resc_io: 209.00  resc_cpu: 1673491
    ix_sel: 0.19889  ix_sel_with_filters: 0.19889
    Cost: 209.08  Resp: 209.08  Degree: 1
  Best:: AccessPath: IndexRange  Index: IND_T6_C2
         Cost: 209.08  Degree: 1  Resp: 209.08  Card: 20555.68  Bytes: 0
17 02 2012
Randolf Geist

Hi Charles,

thanks for all the experiments you’ve done. It looks like you’ve identified an interesting change in behaviour between 11.2 and previous releases:

– Whereas previous releases do not generate index statistics when the CREATE INDEX is done on an empty table, 11.2 does so – as consequence the index statistics show 0 for all relevant values, whereas in pre-11.2 no statistics for the index are there

This can be seen from your snippets above:

10.2.0.5:

Index Stats::
Index: IND_T6_C2 Col#: 2 (NOT ANALYZED)
LVLS: 1 #LB: 25 #DK: 100 LB/K: 1.00 DB/K: 1.00 CLUF: 800.00

11.2.0.2:

Index Stats::
Index: IND_T6_C2 Col#: 2
LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00

The 11.2.0.2 version shows 0 for all values, but the 10.2.0.5 version shows “(NOT ANALYZED)” along with the hard coded defaults assumed in such a case.

– This has severe consequences on the costing: Without index statistics some clustering factor (I assume the default of 800, possibly depending in the default block size) gets used for calculating the cost of the table access, whereas with index statistics in place the optimizer unfortunately uses an inconsistent approach: For the index access part the leaf blocks statistics get actually overridden by Dynamic Sampling even in the presence of the “0” statistics gathered in 11.2.0.2. You’ll find something similar to the following in the 11.2.0.2 10053 trace file for the scenario where table stats are missing but index stats have been gathered:

** Dynamic sampling updated index stats.: IND_T6_C2, blocks=244

This is clearly an inconsistent behaviour because by default Dynamic Sampling is not supposed to override existing statistics. This is only to be done when the DYNAMIC_SAMPLING_EST_CDN hint is explicitly added for a particular table.

But the clustering factor is not overridden in the same way and therefore for the table access part of the plan the clustering factor is taken from the existing index statistics, in this case 0, which means that the table access is costed as 0 (formula is clustering_factor * table_selectivity)

This explains the odd results. By deleting the index statistics some clustering factor is assumed and the costing for the table access changes to something more reasonable.

– Just to add the final wrinkle: Enter 11.2.0.3 and check V$SYSTEM_FIX_CONTROL for bugno 12399886. In 11.2.0.3 the Dynamic Sampling code has been modified and a different clustering factor (probably derived from the estimated cardinality or blocks) will be used when the index stats are missing (11.2.0.3 still shows the same inconsistent behaviour in case table stats are missing but index stats are there).

This means potential plan changes for anyone who is using indexes and Dynamic Sampling…

Randolf

17 02 2012
Charles Hooper

Randolf,

Bug 12399886 is an interesting find. The 11.2.0.3 patchset information found in MOS list several dynamic sampling related fixes. The 11.2.0.2 patch for Bug 12399886 seems to be Solaris specific (if someone running 11.2.0.2 on Solaris is interested in performing some tests):

https://updates.oracle.com/Orion/PatchDetails/process_form?aru=13818718

While looking through MOS I found a couple of articles that might be of interest for one of your future dynamic sampling articles. Some of these problems have been fixed in (or before)11.2.0.3, while some of the problems are listed in MOS as still existing in 11.2.0.3:
Bug 13682550 (cardinality feedback)
Bug 12828479 (multi-column join)
Bug 13104618 (user functions)
Bug 12942119 (index statistics, no table statistics)
Bug 12569713 (sub-partition)
Bug 9362218 (literals replaced with bind variables when CURSOR_SHARING=EXACT)
Bug 12569713 (partition pruning)

Thank you for taking the time to look through the test case results, and summarizing the results into a meaningful explanation of what is happening.

16 02 2012
Charles Hooper

Randolf,

I started to wonder if maybe deferred segment creation on 11.2.0.1/11.2.0.2 combined with creating the index immediately after creating the table structure might be contributing to the LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00 index statistics that are found in the 11.2.0.2 trace file, while LVLS: 1 #LB: 25 #DK: 100 LB/K: 1.00 DB/K: 1.00 CLUF: 800.00 appears in the 10.2.0.5 trace file. 10.2.0.5, of course, did not support deferred segment creation (link to the documentation http://docs.oracle.com/cd/E11882_01/server.112/e17120/tables002.htm#CHDGJAGB ). I tested that theory with the following test script:

/* Deferred segment creation version */
DROP TABLE T6 PURGE;
 
CREATE TABLE T6 (
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 VARCHAR2(300) NOT NULL);
 
CREATE INDEX IND_T6_C2 ON T6(C2);
 
INSERT INTO
  T6
SELECT
  ROWNUM C1,
  DECODE(MOD(ROWNUM,5),0,0,ROWNUM) C2,
  RPAD('A',300,'A') C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;
 
COMMIT; 
 
ALTER SESSION SET TRACEFILE_IDENTIFIER = '1005310046TraceDef';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
 
SET LINESIZE 120
SET PAGESIZE 1000
VAR V2 NUMBER
 
EXEC :V2:=0
 
SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';
/* Non-deferred segment creation version */
DROP TABLE T6 PURGE;
 
CREATE TABLE T6 (
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 VARCHAR2(300) NOT NULL)
 SEGMENT CREATION IMMEDIATE;
 
CREATE INDEX IND_T6_C2 ON T6(C2);
 
INSERT INTO
  T6
SELECT
  ROWNUM C1,
  DECODE(MOD(ROWNUM,5),0,0,ROWNUM) C2,
  RPAD('A',300,'A') C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;
 
COMMIT; 

ALTER SESSION SET TRACEFILE_IDENTIFIER = '1005310046TraceNotDef';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
 
SET LINESIZE 120
SET PAGESIZE 1000
VAR V2 NUMBER
 
EXEC :V2:=0
 
SELECT /*+ GATHER_PLAN_STATISTICS */
  C1
FROM
  T6
WHERE
  C2=:V2;
 
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST +PEEKED_BINDS +COST'));
 
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

Disabling deferred segment creation made no difference. This is the execution plan when deferred segment creation is disabled for the table:

SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |    49 (100)|  20000 |00:00:00.02 |    6974 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  17609 |    49   (0)|  20000 |00:00:00.02 |    6974 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |  17609 |    49   (0)|  20000 |00:00:00.01 |    1404 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)
 
Note
-----
   - dynamic sampling used for this statement (level=2)

Standard Edition, of course, does not support deffered segment creation, so I decided to repeat the test on 11.2.0.2 Standard Edition – the results certainly are different, but not better (the Cost column value for the index access is shown as 1):

SQL_ID  bxnurhc1nmq2g, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */   C1 FROM   T6 WHERE   C2=:V2
 
Plan hash value: 4242288932
 
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     1 (100)|  20000 |00:00:00.05 |    6977 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T6        |      1 |  16675 |     1   (0)|  20000 |00:00:00.05 |    6977 |
|*  2 |   INDEX RANGE SCAN          | IND_T6_C2 |      1 |    336 |     1   (0)|  20000 |00:00:00.02 |    1412 |
----------------------------------------------------------------------------------------------------------------
 
Peeked Binds (identified by position):
--------------------------------------
   1 - (NUMBER): 0
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=:V2)
 
Note
-----
   - dynamic sampling used for this statement (level=2)
17 02 2012
Narendra

Randolf,

I am not sure about how much one can expect dynamic sampling (as a concept) to do. Since Dynamic Sampling is expected to “help” in the event of absence of statistics or stale statistics, one could argue that, in theory, dynamic sampling should try to generate accurate data for table as well as all indexes relevant to the query. But I also think that might make parse step so expensive and/or time-consuming that the original benifit will be lost. I must admit I am still not sure I understand how it (dynamic sampling) works actually, especially when it comes to indexes (B*Tree or Bitmap).
As for the “results did not change” comment, I was only focussing on the plan changes and not costing, in particular.

18 02 2012
Randolf Geist

Charles,

I can’t reply to your reply … probably too nested already. Anyway, thanks a lot for the list of pointers to the MOS notes. Some of them I already knew but I’ll certainly go through them when preparing the remaining Dynamic Sampling articles.

Thanks!

Randolf

18 02 2012
Randolf Geist

And by the way, Bug 12942119 (index statistics, no table statistics) seems to fit quite nicely what you’ve observed here. Since it’s an unpublished bug I can’t tell from the description whether they’ve changed the way the existing index stats get treated / overridden or if they changed the way Dynamic Sampling works when index stats for an empty index (all numbers 0) exist.

The fix hasn’t been included in any official patch sets yet, but there are a lot of one-off patches already available and the special patch sets for the Windows platform include them already in 11.2.0.2 releases.

It would be interesting to see if the patch changes the result of the test case here.

Randolf

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 141 other followers

%d bloggers like this: