Using Histograms to Fix Bind Peeking Problems?

30 11 2009

November 30, 2009

The following question appeared on the OTN forums (http://forums.oracle.com/forums/thread.jspa?threadID=993929):

What is the solution to [bind] variable peeking without going to 11g?
i got answer before as stored outline , but i don’t think this will fix it as stored outlines will stablise the plan which we don’t want , i think histogram is a better solution?

Consider the following test case, which might leave you wondering if creating a histogram on a column used by bind variables is a good idea.
The set up:

SHOW PARAMETER OPTIMIZER

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
optimizer_dynamic_sampling           integer     2
optimizer_features_enable            string      10.2.0.4
optimizer_index_caching              integer     0
optimizer_index_cost_adj             integer     100
optimizer_mode                       string      ALL_ROWS
optimizer_secure_view_merging        boolean     TRUE

CREATE TABLE T10 AS
SELECT
  ROWNUM COL1,
  DECODE(MOD(ROWNUM,1000),1,1,2,2,3,3,DECODE(MOD(ROWNUM,25),10,10,11,11,25)) COL2,
  LPAD('A',255,'A') COL3
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=10000) V1,
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000) V2;

CREATE INDEX IND_T10_1 ON T10(COL1);
CREATE INDEX IND_T10_2 ON T10(COL2);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T10',CASCADE=>TRUE,METHOD_OPT=>'FOR COLUMNS SIZE 254 COL2')

SELECT
  COL2,
  COUNT(*) NUM,
  COUNT(*)/10000000*100 PERCENT
FROM
  T10
GROUP BY
  COL2
ORDER BY
  COL2;

COL2        NUM    PERCENT
---- ---------- ----------
   1      10000         .1
   2      10000         .1
   3      10000         .1
  10     400000          4
  11     400000          4
  25    9170000       91.7 

The above created a 10,000,000 row table with 6 distinct values in COL2. 0.1% of the rows have a value of 1 in COL2, and 91.7% of the rows have a value of 25 in COL2. There is an index with a histogram on COL2. Obviously (or not) if we have only COL2=1 in the WHERE clause, we probably would want to use the index on the COL2 column to retrieve rows. Obviously (or not) if we have only COL2=25 in the WHERE clause, we probably would want to use a full table scan to retrieve rows. So, what happens when bind variable peeking takes place when a histogram is present on COL2? Ignore for a moment the elapsed time that is output in the following (note that I flush the buffer cache to force physical reads for consistency – direct I/O is enabled):

VARIABLE N1 NUMBER
EXEC :N1:=1

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SET TIMING ON

SELECT /*+ GATHER_PLAN_STATISTICS */
  COL2,
  COUNT(COL1) C1
FROM
  T10
WHERE
  COL2= :N1
GROUP BY
  COL2;

      COL2         C1
---------- ----------
         1      10000

Elapsed: 00:00:42.72

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |A-Time      | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   1 |  SORT GROUP BY NOSORT        |           |      1 |      1 |      1 |00:00:42.29 |   10022 |  10022 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T10       |      1 |   8856 |  10000 |00:00:39.03 |   10022 |  10022 |
|*  3 |    INDEX RANGE SCAN          | IND_T10_2 |      1 |   8856 |  10000 |00:00:00.06 |      22 |     22 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("COL2"=:N1)

EXEC :N1:=25

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ GATHER_PLAN_STATISTICS */
  COL2,
  COUNT(COL1) C1
FROM
  T10
WHERE
  COL2= :N1
GROUP BY
  COL2;

      COL2         C1
---------- ----------
        25    9170000

Elapsed: 00:00:32.37

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |A-Time      | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   1 |  SORT GROUP BY NOSORT        |           |      1 |      1 |      1 |00:00:32.35 |     402K|    402K|
|   2 |   TABLE ACCESS BY INDEX ROWID| T10       |      1 |   8856 |   9170K|00:00:27.57 |     402K|    402K|
|*  3 |    INDEX RANGE SCAN          | IND_T10_2 |      1 |   8856 |   9170K|00:00:09.22 |   17879 |  17879 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("COL2"=:N1)

EXEC :N1:=25

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ GATHER_PLAN_STATISTICS */
  COL2,
  COUNT(COL1) C1
FROM
  T10
WHERE
  COL2= :N1
GROUP BY
  COL2;

      COL2         C1
---------- ----------
        25    9170000

Elapsed: 00:00:20.76

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------
|   1 |  SORT GROUP BY NOSORT|      |      1 |      1 |      1 |00:00:20.57 |     384K|    384K|
|*  2 |   TABLE ACCESS FULL  | T10  |      1 |   9234K|   9170K|00:00:27.54 |     384K|    384K|
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("COL2"=:N1)

EXEC :N1:=1

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ GATHER_PLAN_STATISTICS */
  COL2,
  COUNT(COL1) C1
FROM
  T10
WHERE
  COL2= :N1
GROUP BY
  COL2;

      COL2         C1
---------- ----------
         1      10000

Elapsed: 00:00:20.20

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------
|   1 |  SORT GROUP BY NOSORT|      |      1 |      1 |      1 |00:00:20.19 |     384K|    384K|
|*  2 |   TABLE ACCESS FULL  | T10  |      1 |   9234K|  10000 |00:00:28.73 |     384K|    384K|
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("COL2"=:N1)
 

The above shows that the first time the SQL statement is hard parsed, a bind variable value of 1 is set, which causes an index range scan regardless if the query will select 0.1% of the rows or 91.7% of the rows. OK, if we then flush the shared pool and first set the bind variable value to 25, a full table scan is used regardless if we select 91.7% of the rows or 0.1% of the rows. You will note that when the full table scan is used when the bind variable was set to 25 the query completed in 20.76 seconds, and when an index range scan was used with the same bind variable value the query completed in 32.37 seconds.

OK so far, now the potentially confusing part. When an index range scan was used for both bind variable values, Oracle counted the 0.1% of the matching rows (10000) in 42.72 seconds, while counting 91.7% of the rows (9,170,000) in just 32.37 seconds. You might be wondering why Oracle is able to return the result of counting 91.7% of the rows by the index range scan faster than it is able to count 0.1% of the rows – I will leave that for your investigation.

Now, reviewing the above, what is better?:
* Allow the bind variable values submitted during the hard parse to determine the execution plan.
* Use a stored outline to lock the execution plan to always use an index range scan.
* Use a stored outline to lock the execution plan to always use a full table scan.
* Disable bind variable peeking.
* Not enough information is available.


Actions

Information

Leave a comment