May 25, 2010
(Back to the Previous Post in the Series)
The book “Expert One-On-One Oracle” has been on the edge of my desk for the last couple of days. I cannot recall why I originally pulled it off the bookshelf, but possibly it was needed to satisfy a question about how things used to work in Oracle Database. I opened the book to a random page, and ended up somewhere around page 303. Page 303 includes a section title of “Why isn’t my Index Getting Used?”. This still seems to be a frequent question on various Oracle forums, and volumes have been written on the subject. A quick Google search finds the following on the topic:
- http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:736825544526
- http://jonathanlewis.wordpress.com/2007/02/15/index-not-used-10g/
- http://dbaforums.org/oracle/index.php?showtopic=1939
- http://www.freelists.org/post/oracle-l/Index-not-being-used
- http://www.freelists.org/post/oracle-l/Curious-as-to-why-this-index-isnt-being-used
- http://forums.oracle.com/forums/thread.jspa?threadID=1038481
- http://forums.oracle.com/forums/thread.jspa?threadID=1071104
The “Expert One-On-One Oracle” book lists 6 of the cases why an index would not be used. Some things in Oracle stay the same from one release to the next, while others change. Which of the following are true or false for Oracle Database 11.1.0.6 or higher, and indicate why the answer is true or false, and if relevant, the Oracle release when the answer changed. Note that I might intentionally state only part of the case identified in the book, in part to make things more interesting. Maybe the question of when is it true and when is it not true would be a more realistic request?
Reasons why an index would not be used:
1. The predicate in the WHERE clause does not specify the column at the leading edge of the index. For example, an index exists on T1(C1, C2) and the WHERE clause in the query is simply WHERE C2=1.
2. The query is attempting to count the number of rows in the table, for example: SELECT COUNT(*) FROM T1;.
3. The query is attempting to apply a function on the indexed columns, for example: SELECT * FROM T1 WHERE TRUNC(DATE_COL) = TRUNC(SYSDATE);.
4. The query is attempting to treat a VARCHAR2/CHAR column that contains only numbers as a numeric column, for example: SELECT * FROM T1 WHERE CHAR_COL = 10;.
5. The index, if used, would be slower, as determined by the calculated cost of the access path.
6. The tables have not been analyzed in a while. The last time the tables referenced in the WHERE clause were analyzed, the tables were quite small. Now the tables are significantly larger. Without up-to-date statistics, the CBO cannot make the correct decisions.
—
Extra credit: list three other reasons why an index might not be used.
(10053 trace for second test of table T1 in the comments section: OR11_ora_4400_WhyFullTableScanTest – Edit: this is a PDF file, WordPress is not allowing the double extention trick nor Word formatted files for some reason.)
1. Can Use Index Skip Scan
2. Non Of the column are not null
3. Not a Function base index Or predicates conversion
4. Internal conversion of predicates
5. May be FTS will be more beneficiary or may be hash join is more efficient then nested loops.
6. This depends like it calculate cost of FTS is better
For Extra credit
1. Foreign key index are not available
2. Clustering Factor
>list three other reasons why an index might not be used.
1. there is no such index
2. it is unusable
3. it is nosegment
I think that the earlier comments are headed in the right direction.
I will post something to help with #4 (I will not explain the meaning of this right now).
Oracle Database 11.1.0.7:
Let’s see if I can cause a little bit of temporary confusion (or at least a stop and think moment). I re-ran the above test on another box with Oracle Database 11.1.0.7 installed and receive some unexpected results. Look! Oracle’s optimizer did not automatically select the cheapest plan until an index hint was supplied… the obvious conclusion?
Notice the cost in the above is 4,416 with a predicted time of 53 seconds. Now look what happens when the access path is hinted (force, if possible) to use an index access path:
A lower calculated cost and a lower calculated execution time. Must be that…
The optimizer parameters, for the most part, are the Oracle Database defaults:
(10053 trace file for test case OR11_ora_4400_WhyFullTableScanTest.trc edit: link to the trace file is fixed)
—-
Let’s try one more test. This time we will define a column as NUMBER and try essentially the same test, just with a WHERE clause with a string (VARCHAR2) specified as a predicate in the WHERE clause:
Fun bonus question. Why is there an implicit datatype conversion of the table’s column in the first test case (table T1), but instead an implicit datatype conversion of the constant in the second test case (table T2).
“Fun bonus question. Why is there an implicit datatype conversion of the table’s column in the first test case (table T1), but instead an implicit datatype conversion of the constant in the second test case (table T2).”
If I had to wager a guess I would say this is due to the implicit data conversion rules outlined in the SQL Reference Manual (11.1):
http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/sql_elements002.htm#i163326
It states:
“When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.”
In the first case since the column, CHAR_COL is a VARCHAR2 type and it is being compared to a constant of the NUMBER data type Oracle converts the column value to a NUMBER. This is also seen in the Predicate Information section.
In the second case the column value is a NUMBER data type and it is being compared to a constant of the VARCHAR2 data type. Therefore the constant is then converted to a NUMBER, leaving the NUM_COL unaffected from data type conversion.
Interestingly enough Oracle also states the following in the SQL Reference Manual:
“Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.”
Centinul,
Your answer is better than the one that I had in mind. The next line after your first quote of the documentation reference explains why the conversion is performed in only one direction: CHARACTER–>NUMBER. Here is an example:
Now consider the following – should all three queries return the same set of rows:
If the number was automatically converted by Oracle into a character value, it might need to test a nearly infinite number of 0 characters appended to the end of the converted value after the decimal point (up to the number of characters of precision) for a matching result – this is avoided by converting the character value to a number.
I did not have that exact answer in mind until I saw the documentation reference that you provided.
Well you defined char_col as a varchar2 data type. However, in the predicate you are treating as a number data type.
WHERE char_col = 10
I believe Oracle does implicit conversion here, and therefore cannot use the index because of the data type issue.
In both scenarios you are asking Oracle to compare apples to oranges. Varchar2 to Number, and vice versa.
I saw Tom Kyte at OOW last year do a presentation on the importance of metadata. One of his examples showed why it is important to store numbers as number data type, dates as date data type and strings as varchar2 data type. And how not doing so really messes up the CBO.
Jimmy,
Good point.
Immediately after I saw your post I realized that the statement to which you replied in my comment (Fun bonus question) could be read two different ways. After seeing your post I replaced “and” with “, but instead” so that I could clarify that I was seeking the reason why the two cases were treated differently, rather than suggesting that they were both the same because they both had a similar problem (apples to oranges comparison – an implicit data type conversion as you correctly stated). I am sorry if this change caused confusion.
The link to the 10053 trace file now works if someone would like to take a peek.
There may be also the following reasons:
1) usage hint NO_INDEX
2) modification parameter OPTIMIZER_INDEX_COST_ADJ
3) Data Distribution
Donatello,
A portion of your post between “WHERE ROWNUM” and “SELECT * FROM TAB1 WHERE N1 = 100;” disappeared when you posted it. I tried to recreate/fix that section of your post. Thank you for posting a test case for the data distribution possibility.
I believe that this test case is also suitable for Taral’s “Clustering Factor” answer.
yes,
missing piece
CREATE INDEX TAB1_IX1 ON TAB1(N1);
CREATE INDEX TAB2_IX1 ON TAB2(N1);
thanks
Yes, my test case is also suitable response Taral
I had not seen the answer on the clustering factor of Taral
thanks
This has been bothering me about exadata since it was announced; if the db optimizer can’t get always get the best plan, why would we expect a hardware filter to do any better? This seems more fundamental to me than, say, people being sloppier with their programming because they can brute-force with hardware.
I’ve been thinking about this a lot since I have a situation where a report generator spits out sql that causes the db to create a view and doesn’t push predicates up into the view, leading to full table scans within nested loops. So adding the really obvious index has no effect. Every time I run into this kind of thing I slap my forehead, it’s going to be a lot of work one way or another, can’t just hack some hints. The thought of an outlines solution gives me nightmares like comment #8 on http://jonathanlewis.wordpress.com/2008/02/17/pushing-predicates-2/ . Wondering why this doesn’t happen on similar processes leads me to discover it does, just no one has complained yet, users leave orphaned processes running, thinking they’ve canceled them.
If it were just me, no one would care. But I’m pretty sure I’m not the only one working on enterprise software with these kinds of issues.