September 22, 2010
In a recent OTN thread a person asked an interesting question: why isn’t my index being used? A query of a table with 8,000,000 rows should quickly return exactly 3 rows when an available index is used, and that index is used when the WHERE clause is simply:
WHERE C200000020 LIKE 'BOSS' || '%'
However, the application is submitting a WHERE clause that includes an impossible condition in an OR clause, like the following, which is not much different from stating OR 1=2:
WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ''
That constant1 = constant2 predicate, at least on Oracle Database 10.1 and above, is sufficient to keep the index from being used, thus the query performs a full table scan. But why?
I think that we need a test case to see what is happening. First, we will create a simple table with our column of interest and a large column that should help to discourage full table scans:
CREATE TABLE T1 ( C200000020 VARCHAR2(20), PADDING VARCHAR2(250));
Next, we will insert 10,000,000 rows into the table such that an index built on the column C200000020 will have a very high clustering factor, and 3 rows will have a value that begins with BOSS (as a result of the DECODE statement):
INSERT INTO T1 SELECT DECODE(MOD(ROWNUM,3000000),0,'BOSS'||ROWNUM, CHR(90-MOD(ROWNUM-1,26))|| CHR(75+MOD(ROWNUM,10))|| CHR(80+MOD(ROWNUM,5))|| 'S'||ROWNUM) C200000020, RPAD('A',200,'A') PADDING FROM (SELECT ROWNUM RN FROM DUAL CONNECT BY LEVEL<=10000) V1, (SELECT ROWNUM RN FROM DUAL CONNECT BY LEVEL<=1000) V2;
Now to create the index and collect statistics:
CREATE INDEX IND_T1 ON T1(C200000020); EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
Let’s take a look at the execution plans:
SET AUTOTRACE TRACEONLY EXPLAIN SELECT * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%'; Plan hash value: 634656657 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 213 | 5 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 213 | 5 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_T1 | 1 | | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("C200000020" LIKE 'BOSS%') filter("C200000020" LIKE 'BOSS%')
An index access, just like we had hoped. The optimizer is predicting a single row to be retrieved. Let’s try the other query:
SELECT * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ''; Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100K| 20M| 82353 (1)| 00:16:29 | |* 1 | TABLE ACCESS FULL| T1 | 100K| 20M| 82353 (1)| 00:16:29 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter('BOSS'='' OR "C200000020" LIKE 'BOSS%')
A full table scan, just like the original poster in the OTN thread experienced. Notice that the optimizer is now predicting that 100,000 rows (1% of the rows) will be retrieved. Repeating, 1% of the rows and a full table scan. Let’s generate a 10053 trace for the SQL statement:
ALTER SYSTEM FLUSH SHARED_POOL; SET AUTOTRACE OFF ALTER SESSION SET TRACEFILE_IDENTIFIER = 'T1_10053'; ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1'; SELECT * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ''; ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';
Inside the 10053 trace, my 11.2.0.1 test database produced the following:
*************************************** BASE STATISTICAL INFORMATION *********************** Table Stats:: Table: T1 Alias: T1 #Rows: 10000000 #Blks: 303031 AvgRowLen: 213.00 Index Stats:: Index: IND_T1 Col#: 1 LVLS: 2 #LB: 32323 #DK: 9939968 LB/K: 1.00 DB/K: 1.00 CLUF: 10120176.00 Access path analysis for T1 *************************************** SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for T1[T1] Table: T1 Alias: T1 Card: Original: 10000000.000000 Rounded: 100001 Computed: 100001.02 Non Adjusted: 100001.02 Access Path: TableScan Cost: 82352.89 Resp: 82352.89 Degree: 0 Cost_io: 82073.00 Cost_cpu: 7150017105 Resp_io: 82073.00 Resp_cpu: 7150017105 ****** trying bitmap/domain indexes ****** ****** finished trying bitmap/domain indexes ****** Best:: AccessPath: TableScan Cost: 82352.89 Degree: 1 Resp: 82352.89 Card: 100001.02 Bytes: 0 *************************************** OPTIMIZER STATISTICS AND COMPUTATIONS *************************************** GENERAL PLANS *************************************** Considering cardinality-based initial join order. Permutations for Starting Table :0 Join order[1]: T1[T1]#0 *********************** Best so far: Table#: 0 cost: 82352.8851 card: 100001.0195 bytes: 21300213 *********************** (newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:2000 ********************************* Number of join permutations tried: 1 *********************************
The unknown result of the constant in the WHERE clause (‘BOSS’ = ”) caused Oracle to predict that the cardinality will be (1 row) + (1% of the rows) = 100,001. With a clustering factor of 10,120,176 the optimizer is (possibly) convinced that it will need to perform single block physical reads of a large number of table blocks to read the 100,001 rows that it expects to retrieve, so it decided that a full table scan would complete faster. But the situation is worse than that – it did not even consider an index access path. As a demonstration, I will manually set the index’s clustering factor to a low value and check the execution plan again:
EXEC DBMS_STATS.SET_INDEX_STATS(OWNNAME=>USER,INDNAME=>'IND_T1',CLSTFCT=>10000) SET AUTOTRACE TRACEONLY EXPLAIN ALTER SYSTEM FLUSH SHARED_POOL; SELECT * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ''; Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100K| 20M| 82353 (1)| 00:16:29 | |* 1 | TABLE ACCESS FULL| T1 | 100K| 20M| 82353 (1)| 00:16:29 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter('BOSS'='' OR "C200000020" LIKE 'BOSS%')
Still a full table scan. If we had generated a 10053 trace, we would see that the clustering factor for the index was indeed adjusted from what we saw earlier:
*************************************** BASE STATISTICAL INFORMATION *********************** Table Stats:: Table: T1 Alias: T1 #Rows: 10000000 #Blks: 303031 AvgRowLen: 213.00 Index Stats:: Index: IND_T1 Col#: 1 LVLS: 2 #LB: 32323 #DK: 9939968 LB/K: 1.00 DB/K: 1.00 CLUF: 10000.00 Access path analysis for T1 *************************************** SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for T1[T1] Table: T1 Alias: T1 Card: Original: 10000000.000000 Rounded: 100001 Computed: 100001.02 Non Adjusted: 100001.02 Access Path: TableScan Cost: 82352.89 Resp: 82352.89 Degree: 0 Cost_io: 82073.00 Cost_cpu: 7150017105 Resp_io: 82073.00 Resp_cpu: 7150017105 ****** trying bitmap/domain indexes ****** ****** finished trying bitmap/domain indexes ****** Best:: AccessPath: TableScan Cost: 82352.89 Degree: 1 Resp: 82352.89 Card: 100001.02 Bytes: 0
Let’s force the execution plan with an index hint to see what happens:
SELECT /*+ INDEX(T1 IND_T1) */ * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ''; Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100K| 20M| 82353 (1)| 00:16:29 | |* 1 | TABLE ACCESS FULL| T1 | 100K| 20M| 82353 (1)| 00:16:29 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter('BOSS'='' OR "C200000020" LIKE 'BOSS%')
Note that the optimizer did not (or could not) obey the hint. It decided to apply the ‘BOSS’=” predicate first, so maybe that is the problem. Let’s try a hint to force the optimizer to apply the predicates in order, rather than based on calculated cost:
SELECT /*+ ORDERED_PREDICATES INDEX(T1 IND_T1) */ * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ''; Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100K| 20M| 82353 (1)| 00:16:29 | |* 1 | TABLE ACCESS FULL| T1 | 100K| 20M| 82353 (1)| 00:16:29 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("C200000020" LIKE 'BOSS%' OR 'BOSS'='')
The predicate section of the execution plan changed, but the optimizer still will not consider an index access path for the SQL statement. There is a chance that the OP could do something to force the index access path by hacking a stored outline for the query, but my guess is that the restriction on the C200000020 column changes from time to time, so an outline likely will not work. The OP could try to file an Oracle bug report because the optimizer completely ignored the index access paths (as shown in the 10053 trace file), but a better course of action would be to have the application submitting the SQL statement fixed.
Let’s try a small variation on the original test SQL statement. Let’s see what happens when we add a space between the two ‘ characters:
SET AUTOTRACE TRACEONLY EXPLAIN ALTER SESSION SET TRACEFILE_IDENTIFIER = 'T1_10053-3'; ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1'; SELECT * FROM T1 WHERE C200000020 LIKE 'BOSS' || '%' OR 'BOSS' = ' '; Execution Plan ---------------------------------------------------------- Plan hash value: 634656657 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 213 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 213 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_T1 | 1 | | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("C200000020" LIKE 'BOSS%') filter("C200000020" LIKE 'BOSS%') ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';
Notice by looking at the Predicate Information section of the plan that Oracle removed the nonsensical OR clause. The 10053 trace file showed this:
*************************************** BASE STATISTICAL INFORMATION *********************** Table Stats:: Table: T1 Alias: T1 #Rows: 10000000 #Blks: 303031 AvgRowLen: 213.00 Index Stats:: Index: IND_T1 Col#: 1 LVLS: 2 #LB: 32323 #DK: 9939968 LB/K: 1.00 DB/K: 1.00 CLUF: 10000.00 Access path analysis for T1 *************************************** SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for T1[T1] Table: T1 Alias: T1 Card: Original: 10000000.000000 Rounded: 1 Computed: 1.03 Non Adjusted: 1.03 Access Path: TableScan Cost: 82255.34 Resp: 82255.34 Degree: 0 Cost_io: 82073.00 Cost_cpu: 4658017105 Resp_io: 82073.00 Resp_cpu: 4658017105 kkofmx: index filter:"T1"."C200000020" LIKE 'BOSS%' Access Path: index (RangeScan) Index: IND_T1 resc_io: 4.00 resc_cpu: 29226 ix_sel: 0.000000 ix_sel_with_filters: 0.000000 Cost: 4.00 Resp: 4.00 Degree: 1 Best:: AccessPath: IndexRange Index: IND_T1 Cost: 4.00 Degree: 1 Resp: 4.00 Card: 1.03 Bytes: 0
So, it appears that if the optimizer is presented with a zero length VARCHAR being compared with another VARCHAR in the WHERE clause, there could be unexpected cases were index access paths will not be used even when hinted.
Toon Koppelaars mentioned in the OTN thread that the WHERE clause should be using bind variables, and suggested the following for the WHERE clause:
( (T131.C200000020 LIKE (:B0 || '%')) OR (:B0 IS NULL))
I agree with Toon regarding the use of bind variables. Unfortunately, it does not look like bind variables improve the situation, at least in my test case.
I cannot use AUTOTRACE due to the risk that it will display an incorrect execution plan due to the bind variables, so I will use DBMS_XPLAN.DISPLAY_CURSOR along with a GATHER_PLAN_STATISTICS hint in the SQL statement. First the statistics collection (to correct any manual adjustment to the index’s clustering factor that was performed earlier) and bind variable setup:
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE) VARIABLE B0 VARCHAR2(50) VARIABLE B1 VARCHAR2(50) VARIABLE B2 VARCHAR2(50) EXEC :B0 := 'BOSS' EXEC :B1 := 'BOSS' EXEC :B2 := '' SET AUTOTRACE OFF
Since we do not know the intention of the developer, I will try a couple of combinations to see what happens:
SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B1 = :B2; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); SQL_ID b47qzqbb6wymu, child number 0 ------------------------------------- SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B1 = :B2 Plan hash value: 3617692013 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 3 |00:00:00.34 | 303K| |* 1 | TABLE ACCESS FULL| T1 | 1 | 100K| 3 |00:00:00.34 | 303K| ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter((:B1=:B2 OR "C200000020" LIKE :B0||'%'))
The above resulted in the same full table scan that we saw earlier.
SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B0 = :B2; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); SQL_ID 1vdyc7t6wazhz, child number 0 ------------------------------------- SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B0 = :B2 Plan hash value: 3617692013 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 3 |00:00:00.34 | 303K| |* 1 | TABLE ACCESS FULL| T1 | 1 | 100K| 3 |00:00:00.34 | 303K| ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter((:B0=:B2 OR "C200000020" LIKE :B0||'%'))
Specifying the B0 bind variable twice did not help.
SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B0 IS NULL; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); SQL_ID 8n1bg0z9j0001, child number 0 ------------------------------------- SELECT /*+ GATHER_PLAN_STATISTICS */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B0 IS NULL Plan hash value: 3617692013 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 3 |00:00:00.34 | 303K| |* 1 | TABLE ACCESS FULL| T1 | 1 | 500K| 3 |00:00:00.34 | 303K| ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter((:B0 IS NULL OR "C200000020" LIKE :B0||'%'))
Specifying that the B0 bind variable IS NULL did not help either, but notice the change in the predicted cardinality (the 100,000 rows increased to 500,000 rows).
Let’s try an index hint:
SELECT /*+ GATHER_PLAN_STATISTICS INDEX(T1 IND_T1) */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B0 IS NULL; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); SQL_ID a241dy7mvudtk, child number 0 ------------------------------------- SELECT /*+ GATHER_PLAN_STATISTICS INDEX(T1 IND_T1) */ * FROM T1 WHERE C200000020 LIKE :B0 || '%' OR :B0 IS NULL Plan hash value: 3617692013 ------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 3 |00:00:00.34 | 303K| |* 1 | TABLE ACCESS FULL| T1 | 1 | 500K| 3 |00:00:00.34 | 303K| ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter((:B0 IS NULL OR "C200000020" LIKE :B0||'%'))
The optimizer still did not use the index.
—————–
Cases where Oracle’s optimizer ignores index hints are typically indications of bugs in the optimizer – as we saw, the optimizer did not even consider (generate a calculated cost for) an index access path when no space appeared between the two ‘ characters in the original SQL statement. Other cases of Oracle’s optimizer ignoring hints may be found here: Demonstration of Oracle “Ignoring” an Index Hint.
Recent Comments