True or False – Why Isn’t My Index Getting Used?

25 05 2010

May 25, 2010

(Back to the Previous Post in the Series

The book “Expert One-On-One Oracle” has been on the edge of my desk for the last couple of days.  I cannot recall why I originally pulled it off the bookshelf, but possibly it was needed to satisfy a question about how things used to work in Oracle Database.  I opened the book to a random page, and ended up somewhere around page 303.  Page 303 includes a section title of “Why isn’t my Index Getting Used?”.  This still seems to be a frequent question on various Oracle forums, and volumes have been written on the subject.  A quick Google search finds the following on the topic:

The “Expert One-On-One Oracle” book lists 6 of the cases why an index would not be used.  Some things in Oracle stay the same from one release to the next, while others change.  Which of the following are true or false for Oracle Database 11.1.0.6 or higher, and indicate why the answer is true or false, and if relevant, the Oracle release when the answer changed.  Note that I might intentionally state only part of the case identified in the book, in part to make things more interesting.  Maybe the question of when is it true and when is it not true would be a more realistic request?

Reasons why an index would not be used:

1. The predicate in the WHERE clause does not specify the column at the leading edge of the index.  For example, an index exists on T1(C1, C2) and the WHERE clause in the query is simply WHERE C2=1.

2. The query is attempting to count the number of rows in the table, for example: SELECT COUNT(*) FROM T1;.

3. The query is attempting to apply a function on the indexed columns, for example: SELECT * FROM T1 WHERE TRUNC(DATE_COL) = TRUNC(SYSDATE);.

4. The query is attempting to treat a VARCHAR2/CHAR column that contains only numbers as a numeric column, for example: SELECT * FROM T1 WHERE CHAR_COL = 10;.

5. The index, if used, would be slower, as determined by the calculated cost of the access path.

6. The tables have not been analyzed in a while.  The last time the tables referenced in the WHERE clause were analyzed, the tables were quite small.  Now the tables are significantly larger.  Without up-to-date statistics, the CBO cannot make the correct decisions.

Extra credit: list three other reasons why an index might not be used.

(10053 trace for second test of table T1 in the comments section: OR11_ora_4400_WhyFullTableScanTest – Edit: this is a PDF file, WordPress is not allowing the double extention trick nor Word formatted files for some reason.)


Actions

Information

14 responses

25 05 2010
Taral

1. Can Use Index Skip Scan
2. Non Of the column are not null
3. Not a Function base index Or predicates conversion
4. Internal conversion of predicates
5. May be FTS will be more beneficiary or may be hash join is more efficient then nested loops.
6. This depends like it calculate cost of FTS is better

For Extra credit
1. Foreign key index are not available
2. Clustering Factor

25 05 2010
Timur Akhmadeev

>list three other reasons why an index might not be used.
1. there is no such index
2. it is unusable
3. it is nosegment

25 05 2010
Charles Hooper

I think that the earlier comments are headed in the right direction.

I will post something to help with #4 (I will not explain the meaning of this right now).

Oracle Database 11.1.0.7:

CREATE TABLE T1 (
  CHAR_COL VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (CHAR_COL));
 
INSERT INTO
  T1
SELECT
  TO_CHAR(ROWNUM),
  RPAD('A',100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL <=1000000;
 
COMMIT;
 
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
 
SET AUTOTRACE TRACEONLY EXPLAIN
 
SELECT
  *
FROM
  T1
WHERE
  CHAR_COL = 10;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   107 |  1960   (2)| 00:00:08 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   107 |  1960   (2)| 00:00:08 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(TO_NUMBER("CHAR_COL")=10)
 
---
 
SELECT /*+ INDEX(T1) */
  *
FROM
  T1
WHERE
  CHAR_COL = 10;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2986104799
 
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   107 |  3944   (1)| 00:00:16 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |     1 |   107 |  3944   (1)| 00:00:16 |
|*  2 |   INDEX FULL SCAN           | SYS_C0016078 |     1 |       |  3943   (1)| 00:00:16 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TO_NUMBER("CHAR_COL")=10)
25 05 2010
Charles Hooper

Let’s see if I can cause a little bit of temporary confusion (or at least a stop and think moment). I re-ran the above test on another box with Oracle Database 11.1.0.7 installed and receive some unexpected results. Look! Oracle’s optimizer did not automatically select the cheapest plan until an index hint was supplied… the obvious conclusion?

CREATE TABLE T1 (
  CHAR_COL VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (CHAR_COL));
 
INSERT INTO
  T1
SELECT
  TO_CHAR(ROWNUM),
  RPAD('A',100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL <=1000000;
 
COMMIT;
 
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
 
SET AUTOTRACE TRACEONLY EXPLAIN
 
SELECT
  *
FROM
  T1
WHERE
  CHAR_COL = 10;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   107 |  4416   (1)| 00:00:53 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   107 |  4416   (1)| 00:00:53 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(TO_NUMBER("CHAR_COL")=10)

Notice the cost in the above is 4,416 with a predicted time of 53 seconds. Now look what happens when the access path is hinted (force, if possible) to use an index access path:

 
SELECT /*+ INDEX(T1) */
  *
FROM
  T1
WHERE
  CHAR_COL = 10;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2890641821
 
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   107 |  3930   (1)| 00:00:48 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |     1 |   107 |  3930   (1)| 00:00:48 |
|*  2 |   INDEX FULL SCAN           | SYS_C0030633 |     1 |       |  3929   (1)| 00:00:48 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TO_NUMBER("CHAR_COL")=10)

A lower calculated cost and a lower calculated execution time. Must be that…

The optimizer parameters, for the most part, are the Oracle Database defaults:

SET AUTOTRACE OFF
SHOW PARAMETER OPTIMIZER
  
NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
optimizer_capture_sql_plan_baselines boolean     FALSE
optimizer_dynamic_sampling           integer     2
optimizer_features_enable            string      11.1.0.7
optimizer_index_caching              integer     0
optimizer_index_cost_adj             integer     100
optimizer_mode                       string      ALL_ROWS
optimizer_secure_view_merging        boolean     TRUE
optimizer_use_invisible_indexes      boolean     FALSE
optimizer_use_pending_statistics     boolean     FALSE
optimizer_use_sql_plan_baselines     boolean     TRUE

(10053 trace file for test case OR11_ora_4400_WhyFullTableScanTest.trc edit: link to the trace file is fixed)
—-

Let’s try one more test. This time we will define a column as NUMBER and try essentially the same test, just with a WHERE clause with a string (VARCHAR2) specified as a predicate in the WHERE clause:

CREATE TABLE T2 (
  NUM_COL NUMBER,
  C2 VARCHAR2(100),
  PRIMARY KEY (NUM_COL));

INSERT INTO
  T2
SELECT
  ROWNUM,
  RPAD('A',100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL <=1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  *
FROM
  T2
WHERE
  NUM_COL = '10';
 
Execution Plan
----------------------------------------------------------
Plan hash value: 621649307

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   105 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2           |     1 |   105 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | SYS_C0030634 |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("NUM_COL"=10)

Fun bonus question. Why is there an implicit datatype conversion of the table’s column in the first test case (table T1), but instead an implicit datatype conversion of the constant in the second test case (table T2).

26 05 2010
Centinul

“Fun bonus question. Why is there an implicit datatype conversion of the table’s column in the first test case (table T1), but instead an implicit datatype conversion of the constant in the second test case (table T2).”

If I had to wager a guess I would say this is due to the implicit data conversion rules outlined in the SQL Reference Manual (11.1):

http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/sql_elements002.htm#i163326

It states:

“When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.”

In the first case since the column, CHAR_COL is a VARCHAR2 type and it is being compared to a constant of the NUMBER data type Oracle converts the column value to a NUMBER. This is also seen in the Predicate Information section.

In the second case the column value is a NUMBER data type and it is being compared to a constant of the VARCHAR2 data type. Therefore the constant is then converted to a NUMBER, leaving the NUM_COL unaffected from data type conversion.

Interestingly enough Oracle also states the following in the SQL Reference Manual:

“Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.”

26 05 2010
Charles Hooper

Centinul,

Your answer is better than the one that I had in mind. The next line after your first quote of the documentation reference explains why the conversion is performed in only one direction: CHARACTER–>NUMBER. Here is an example:

CREATE TABLE T3(
  CHAR_COL VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (CHAR_COL));
 
INSERT INTO T3 VALUES('1','A');
INSERT INTO T3 VALUES('1.0','A');
INSERT INTO T3 VALUES('1.00','A');
INSERT INTO T3 VALUES('1.000','A');
INSERT INTO T3 VALUES('1.0000','A');
INSERT INTO T3 VALUES('1.00000','A');
INSERT INTO T3 VALUES('1.000000','A');
INSERT INTO T3 VALUES('1.0000000','A');
INSERT INTO T3 VALUES('1.00000000','A');
 
COMMIT;

Now consider the following – should all three queries return the same set of rows:

SELECT
  *
FROM
  T3
WHERE
  CHAR_COL=1;
 
CHAR_COL   C2
---------- -----
1          A
1.0        A
1.00       A
1.000      A
1.0000     A
1.00000    A
1.000000   A
1.0000000  A
1.00000000 A
 
9 rows selected.
 
---
 
SELECT
  *
FROM
  T3
WHERE
  TO_NUMBER(CHAR_COL)=1;
 
CHAR_COL   C2
---------- -----
1          A
1.0        A
1.00       A
1.000      A
1.0000     A
1.00000    A
1.000000   A
1.0000000  A
1.00000000 A
 
9 rows selected.
 
---
 
SELECT
  *
FROM
  T3
WHERE
  CHAR_COL=TO_CHAR(1);
 
CHAR_COL   C2
---------- -----
1          A

If the number was automatically converted by Oracle into a character value, it might need to test a nearly infinite number of 0 characters appended to the end of the converted value after the decimal point (up to the number of characters of precision) for a matching result – this is avoided by converting the character value to a number.

I did not have that exact answer in mind until I saw the documentation reference that you provided.

25 05 2010
Jimmy

Well you defined char_col as a varchar2 data type. However, in the predicate you are treating as a number data type.

WHERE char_col = 10

I believe Oracle does implicit conversion here, and therefore cannot use the index because of the data type issue.

25 05 2010
Jimmy

In both scenarios you are asking Oracle to compare apples to oranges. Varchar2 to Number, and vice versa.

I saw Tom Kyte at OOW last year do a presentation on the importance of metadata. One of his examples showed why it is important to store numbers as number data type, dates as date data type and strings as varchar2 data type. And how not doing so really messes up the CBO.

26 05 2010
Charles Hooper

Jimmy,

Good point.

Immediately after I saw your post I realized that the statement to which you replied in my comment (Fun bonus question) could be read two different ways. After seeing your post I replaced “and” with “, but instead” so that I could clarify that I was seeking the reason why the two cases were treated differently, rather than suggesting that they were both the same because they both had a similar problem (apples to oranges comparison – an implicit data type conversion as you correctly stated). I am sorry if this change caused confusion.

The link to the 10053 trace file now works if someone would like to take a peek.

26 05 2010
Donatello Settembrino

There may be also the following reasons:

1) usage hint NO_INDEX

2) modification parameter OPTIMIZER_INDEX_COST_ADJ

3) Data Distribution

CREATE TABLE TAB1 AS
SELECT TO_CHAR(ROWNUM) AS C1 
     , trunc((ROWNUM-1) / 10) N1
     , rownum * 10 N2
     , ROWNUM      N3
     , sysdate     d1 
FROM ALL_OBJECTS
WHERE ROWNUM <= 10000;


CREATE TABLE TAB2 AS
SELECT TO_CHAR(ROWNUM) AS C1 
     , MOD(ROWNUM, 1000) N1
     , rownum * 10 N2
     , ROWNUM      N3
     , sysdate     d1 
FROM ALL_OBJECTS
WHERE ROWNUM <= 10000;
 
CREATE INDEX TAB1_IX1 ON TAB1(N1);
CREATE INDEX TAB2_IX1 ON TAB2(N1);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER, TABNAME=>'TAB1', CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER, TABNAME=>'TAB2', CASCADE=>TRUE)

SELECT * FROM TAB1 WHERE N1 = 100;



-----------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |    10 |   260 |     2   (0)|
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB1     |    10 |   260 |     2   (0)|
|   2 |   INDEX RANGE SCAN          | TAB1_IX1 |    10 |       |     1   (0)|
-----------------------------------------------------------------------------


SQL> SELECT * FROM TAB2 WHERE N1 = 100;


---------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |   260 |    10   (0)|
|   1 |  TABLE ACCESS FULL| TAB2 |    10 |   260 |    10   (0)|
---------------------------------------------------------------
26 05 2010
Charles Hooper

Donatello,

A portion of your post between “WHERE ROWNUM” and “SELECT * FROM TAB1 WHERE N1 = 100;” disappeared when you posted it. I tried to recreate/fix that section of your post. Thank you for posting a test case for the data distribution possibility.

I believe that this test case is also suitable for Taral’s “Clustering Factor” answer.

26 05 2010
Donatello Settembrino

yes,
missing piece

CREATE INDEX TAB1_IX1 ON TAB1(N1);
CREATE INDEX TAB2_IX1 ON TAB2(N1);

thanks

26 05 2010
Donatello Settembrino

Yes, my test case is also suitable response Taral

I had not seen the answer on the clustering factor of Taral

thanks

26 05 2010
joel garry

This has been bothering me about exadata since it was announced; if the db optimizer can’t get always get the best plan, why would we expect a hardware filter to do any better? This seems more fundamental to me than, say, people being sloppier with their programming because they can brute-force with hardware.

I’ve been thinking about this a lot since I have a situation where a report generator spits out sql that causes the db to create a view and doesn’t push predicates up into the view, leading to full table scans within nested loops. So adding the really obvious index has no effect. Every time I run into this kind of thing I slap my forehead, it’s going to be a lot of work one way or another, can’t just hack some hints. The thought of an outlines solution gives me nightmares like comment #8 on http://jonathanlewis.wordpress.com/2008/02/17/pushing-predicates-2/ . Wondering why this doesn’t happen on similar processes leads me to discover it does, just no one has complained yet, users leave orphaned processes running, thinking they’ve canceled them.

If it were just me, no one would care. But I’m pretty sure I’m not the only one working on enterprise software with these kinds of issues.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 141 other followers

%d bloggers like this: