February 22, 2012
There are some amazing (or just plain interesting) facts to be found about how indexes work in Oracle Database. One Internet site in particular that has been helpful in my understanding of Oracle indexes is Richard Foote’s blog. I occasionally see information on the Internet, in the Oracle documentation, in Metalink (My Oracle Support), and in books that conflict with one another – just how does one determine what information is correct, and what information is incorrect?
If you paid money to learn something, should you just assume that what you are learning is correct? What if test cases are included, should you just assume that the test case is showing what the document claims that the test case shows? What if, you never ask “What if?” … what if, you do not know when to ask “What if?”
As I mentioned in a previous article, I am in the process of reading and reviewing the book “Oracle Database 11gR2 Performance Tuning Cookbook“, and I seem to be spending quite a bit of time asking What if … or why does this statement not agree with what I believe to be true? For this blog article, I have pulled seven quotes from the book that will be mentioned in my review of the book – I thought that I would give the readers of this blog an opportunity to ask, “What if”. As always, I attempted to keep the length of the quotes to a minimum – just long enough so that the context (and meaning) of the quote is not lost.
#1 Page 88:
“Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.”
#2 Page 119:
“We create the IX3_CUSTOMERS index to show how we can create a descending index, that is, an index which stores the data for one or more fields in descending order…
We can also create a function-based descending index.”
#3 Page 119:
“… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN. Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.
#4 Page 120:
“The last type [descending indexes] is useful when we want to use queries in which we want the data sorted in a mixed ascending/descending way on the fields indexed. If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this… ORDER BY FIELD1 DESC, FIELD2“
#5 Page 127:
“However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.”
#6 Page 127:
“If the value for DEL_LF_ROWS/LF_ROWS is greater than 2, or LF_ROWS is lower than LF_BLKS, or HEIGHT is 4 then the index should be rebuilt.”
#7 Page 139:
“When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.”
What, if anything, is wrong (and/or right) with the above quotes from the book?
The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.
#3 Page 119 – is not accurate in my experience. When you have a composite index and do not include the leading column of that index; Oracle will either A) do a full table scan or, B) use an Index Skip Scan.
I have never seen Oracle do an Index Fast Full Scan when the leading column of an index is missing.
T. J. Kiernan followed up on this same point. I *think* that it could be strongly argued that an INDEX SKIP SCAN would happen much more frequently in this case than an INDEX FAST FULL SCAN. How about someone create a test case to demonstrate what it takes to trigger an INDEX SKIP SCAN and what it takes for an INDEX FAST FULL SCAN to appear in the same situation.
If you wish to post test case code here, please use a <pre> tag before the section and a </pre> tag after the section:
<pre>
</pre>
“When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.”
When rows are frequently inserted, deleted, and updated (in multi-user-concurrent application) bitmap indexes should not be used at all.
“We can also create a function-based descending index”
If an index is created in a descending order it will be automatically of a function based index type
“However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.”
The space will be used by the subsequent insert (excluding direct path insert wich inserts above the HWM). Skewed indexes depends on the patern of the delete. If we delete an index on its left side which means it will be growing only from its right side (sequence based indexes and timestamp based indexes), index could be smached or be what tom kyte call “sweeper indexes”. And even in it that case, I think that caolescing that index is what could be advised and not rebuilding the index. There are cases where index could benefit from rebuild (for that see jonathan lewis script) but in most cases, we don’t have to rebuild indexes
Mohamed,
Your answer for quote #2 exactly matches the comment that I wrote in my review.
—————
We are off to a very good start regarding the quotes.
I will attempt to withhold my responses in this article for a little while to give everyone some time to think about the quotes, experiment with some test cases, and share thoughts. There is something very wrong with the first quote, but I will not mention what is very wrong until I publish the full review of the book.
#1 – “Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.”
There are a number of considerations here (probably more than what I’ll mention)
AFAIK, using a larger block size can prevent row chaining, which can, in fact improve access times. Knowing how serious a row chaining problem is or will be before you head down that path is important. Also, are you creating the entire database with a larger block size, or creating a single tablespace with a larger block size? If it’s a single tablespace, then you’re looking at carving out SGA space for buffers of the different size, and these buffers won’t resize dynamically.
#3 – “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN. Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.”
It’s not quite so absolute. It could also be an INDEX SKIP SCAN, and perhaps others that aren’t coming to mind. To jimmyb’s point, I cant say I’ve seen an INDEX FAST FULL SCAN either.
#4 – “The last type [descending indexes] is useful when we want to use queries in which we want the data sorted in a mixed ascending/descending way on the fields indexed. If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this… ORDER BY FIELD1 DESC, FIELD2“
This feels wrong and I’d want to test it out. My guess is that the author is trying to save a sort operation or two, which may be possible if the index is ordered the same as the ORDER BY statement, but Oracle CAN sort some columns ascending and others descending. I’ve seen it.
#6 – “If the value for DEL_LF_ROWS/LF_ROWS is greater than 2, or LF_ROWS is lower than LF_BLKS, or HEIGHT is 4 then the index should be rebuilt.”
The times when an index SHOULD be rebuilt are so few and far between, that it seems unnecessary to even discuss it anymore. If the index is not monotonously increasing (sequence or date) and old data is not deleted from the table, I can’t see any reason for rebuilding an index (save for moving it from tablespace to tablespace).
Just few points
1) We have to read all data when doing the full scan.. I think that block size doesn’t matter here, or the difference is not significant.
what matters is value of db_filemultiblock_read_count (how many block are read by a multiple block read)
There are some disadvantages of using multiple block sizes. T. J. Kiernan mentioned some of them..
I think another disadvantage is that the single block read has to read more data. This could be a problem when using indexes (range scan, table access by rowid are single block operations)..
2&4) function index can be descending index, in fact every descending index is a FBI (see SYS_OP_DESCEND).
“Oracle is unable to do the sort in a mixed way, in a query like this… ORDER BY FIELD1 DESC, FIELD2”
Oracle can omit sorting when there is a “good” index. It don’t need to sort. And of course, it is able to sort in mixed way (without using index) :)..
e.g no sorting took place here:
from predicate section we can see that values are NOT sorted descending in the index.. there is a function SYS_OP_DESCEND in the section
SYS_OP_DESCEND is an internal function that takes a value and returns the form that would be stored for that value in a descending index. Essentially doing a one’s complement on the bytes and appending an 0xFF byte
3) I don’t see any problem here. index is sometimes treated like a slimmer version of corresponding table
5) is vague and not true in general..
Index structure cannot be skewed,. B*Tree is always a balanced tree – all leaf blocks have the same level, but distribution of not-deleted data across the leaf blocks can be skewed .. some leaf block could contain only few non-deleted items while the others could be fully filled with non-deleted items.
What could be a problem is an INSERT and a DELETE in the same transaction.
When we are doing INSERT + commit and DELETE + commit Oracle can use the space marked as deleted, but it cannot reuse it in the same transaction.
Index rebuild is another big topic… I’m not going to discuss it. Richard has written lot of posts about index rebuild’s myths.
6) after reading this I will throw the book to the waste bin.
” or HEIGHT is 4 then the index should be rebuilt”
what if we have so much data, that the index has to have 4 height???? Should we rebuild it every day??? 🙂
7) not true..
firstly, what is a bitmap segment ??
secondly,
bitmap index is a B*Tree index, but the row structure is like this:
value1, low-rowid, hi-rowid, bitmap of occurrence of value1 between low-rowid and hi-rowid
value1, low-rowid, hi-rowid, bitmap of occurrence of value1 between low-rowid and hi-rowid
value2, low-rowid, hi-rowid, bitmap of occurrence of value2 between low-rowid and hi-rowid
….
so value1 could have more entries, not only one.
1) insert of another value1, we have to modify only one bitmap, so all rows mapped by modified bitmap are locked.. rows mapped by the second bitmap are not locked
2) delete – same as the insert, we have to modify only one bitmap, so all rows mapped by this modified bitmap are locked,
3) update from value1 to value2, we have to modify 2 bitmaps, one bitmap for value1 and one bitmap for value2, so all rows mapped by these 2 bitmaps are locked.
3) when referencing leading column operation could be:
RANGE SCAN, UNIQUE SCAN, INDEX FAST FULL SCAN, INDEX FULL SCAN (MIN/MAX)
when not referencing leading column operation could be
INDEX FAST FULL SCAN, INDEX FULL SCAN (MIN/MAX), SKIP SCAN
Hmm.
#3 Page 119:
“… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN. Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.
Sure Now I remember this test back in 2006 on Oracle 10g bases on some stuff from Jonathan’s CBO book. I dug my email and it was as follows as I described in my email
“Oracle has a new feature called Skip Scan Index Access that allows the Oracle optimizer to use a composite index even when the index prefix column in a SQL statement has been omitted. This basically means that you will not have to create an additional index to provide faster access to your data.
However, this is only useful if the cardinality of the prefix column is fairly low.
Briefly Oracle uses the skip scan algorithm to determine the domain of distinct values for the index prefix column and then iterate through each distinct value in this domain. For each distinct value, Oracle will perform a regular index scan on that portion of the index. To put it another way, Oracle will treat the composite index search as a number of small sub-indexes to retrieve the ROWIDs – there will be one sub-index search performed for each distinct value within the index prefix column. The new skip scan index access feature will function as if you issued a sequence of SQL statements having each statement specifying a single value for the index prefix column. In general a skip scan index is faster than a table scan.
Ok a table T5
and the proc to populate it
Now populate it with 10,000 rows
Then back in 2006 I got this from trace
and the trace output on 10g was
Which did an index skip scan.
Now when I tyry the same code on 11g R2 I get (please note I call table T5 now)
Now it does a FTS and finishes quicker. What has changed!
Cheers,
Mich
(Edit: Fixed a couple of pre tags, a not equal evaluation where the less than/greater than combination was lost, and changed the select from T1 to select from T5 so that it was a little easier to execute the test case. CH Feb 23, 2012)
A lot of great answers in this article. I will wait a little longer before providing too many comments. Thanks everyone for keeping the comments positive and for helping others learn how Oracle behaves.
—
Mich,
Jonathan has a clear method approach of explaining index skip scans – thanks for including his statement here.
I wonder if there is a change between 10.2 and 11.2 in terms of costing index skip scans, or if maybe the optimizer parameters/system statistics were a little different in the two Oracle Database versions? I might experiment a bit a little later with the test case script that you provided.
With SET AUTOTRACE ON, the execution plan and statistics (11.2.0.2 mostly default initialization parameters in an ASSM 8KB tablespace):
So, the default execution plan is a full table scan with a calculated cost of 40, with 426 consistent gets.
Hinting an index access path to check the calculated cost:
The calculated cost is 485 with 560 consistent gets, so the table scan has a calculated cost that is about 8.25% of the cost of the index access path.
Let’s adjust the index access cost so that it is 5% of the original value:
We now have an index skip scan without having to hint an index access path.
I realize that the above experiment to obtain an index skip scan is probably obvious to all of the people who have commented in this article. Along the theme of this article, “What If” we had not tried the experiment, we would be perpetuating a “myth” that was actually true for Oracle Database 8.1, but no longer was true when Oracle Database 9.0.1 was released?
Mitch,
This is my result from 10.2.0.5:
When forcing an index access path on 10.2.0.5:
I am wondering if your result from the older database version was possibly impacted by non-default initialization parameters or system statistics that were not set to appropriate values?
Hi Charles,
I wish I could dig that old stuff from 10g but that was 6 years ago.
However, I was doing this test on SSD response recently and I recall the following tests
First increase the number of rows in t5 to 12 million
and then I ran the follwing SQL
I believe the reason the optimizer is using the index is due to 1) access via the leading column N1 and 2) filtering via N2. When I discussed the result of this test (dealing with SSD), with Tom Kyte he actually remarked calling it "a funky query plan ….. – Why introduce a relatively rare index skip scan into the mix?"
The important thing is the cardnilaity of leading column. So if we have a leading column with few distinct values (in this case n1 has 25 values), then the index may be used pretty efficiently by probing it 25 times. On the other hand, if the leading column has relatively high cardinality then an Index Skip Scan is not a viable option.
As a sideline Sybase calls it "intelligent index scan" with the trace output referring to it as "Intelligent Scan selectivity reduction".
Poor DBAs have to remember all these 🙂
Cheers,
Mich
I was busy, so everybody else has already commented on the good stuff. However, I would still like to comment on #6. The recommendation to rebuild index if the level is >=4 is ridiculous. I have a table in my DW database with 1.6G rows. Index level is 6. Do you really think that the index level will drop if I rebuild it? Big tables usually require big indexes. The index should only be rebuilt if there are anomalies in relation to it, like long waits for single block scans on it or suddenly increased duration of the range scans. B*Tree index is a mathematical structure and advising rebuild, based on the sheer height of the index is highly inaccurate.
Mladen,
Good points.
—
The book states, “HEIGHT is 4”, which seems to decrease the number of indexes that would need to be scheduled for rebuilding, and might suggest that your index with a blevel of 6 is OK. What about the other two specifications:
* DEL_LF_ROWS/LF_ROWS is greater than 2
* LF_ROWS is lower than LF_BLKS
Those specifications might be suspect considering what Radoslav stated about how indexes behave during deletes and inserts, if those operations happen in different transactions.
I am currently working on the review of page 127. This morning I submitted 21 errata reports to the publisher’s website, covering the problems that I found in the first 88 pages of the book, and I have 17 additional errata reports ready to be submitted for pages 89 through 127. I will share my review notes for the above quotes once I make it through page 137 (so far the commenters in this article have done a better job than I will be able to do in describing the problems with the quotes due to limited space in the review).
Let’s take a closer look at the rebuild criteria offered in the book.
First, create a test table, populate that table, generate an entry in INDEX_STATS for the index, and display some of the statistics from INDEX_STATS:
There are currently no deleted index entries, and because the index entries were added in sequential order, 90-10 leaf block splits were used. Currently, there are 10,000 rows in the index leafs, 0 rows are deleted, the the height is 2, there are 21 leaf blocks, and 0% of the rows in the index structure are marked as deleted. Based on what the book states, we do not need to rebuild this index.
Continuing the test, we will delete all rows from the index:
There are currently 10,000 rows in the index leafs, 10,000 of those rows are deleted, the height is 2, there are 21 leaf blocks, and 100% of the rows in the index structure are marked as deleted. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 1, which is not greater than 2 (this cannot ever be the case), LF_ROWS is not lower than LF_BLKS (this can never be the case, the block will be detached from the index structure when it is entirely empty), and HEIGHT is not 4 (why 4, does that mean that 2, 3, 5,7, etc. are OK?).
Continuing the test, we will insert a single row into the table, and repopulate INDEX_STATS for the index:
As shown above, inserting a single row into the table cleared out 485 deleted rows (plus 1 for the inserted row) from the index leafs. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 0.9999, which is not greater than 2.
Continuing the test, inserting a row with a value that is 1 greater than the previously deleted maximum value:
As shown above, inserting a single row into the table cleared out an additional 414 deleted rows (plus 1 for the inserted row) from the index leafs. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 0.9998, which is not greater than 2.
Continuing the test, inserting 100 rows into the table, with the values spaced 100 apart (100, 200, 300, etc.):
As shown above, inserting 100 rows into the table cleared out 9101 additional deleted rows (plus 100 for the inserted rowed) from the index leafs. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 0.0000, which is not greater than 2.
This index is now sparsely populated with on average less than 5 rows per leaf block, when it started out with an average of 476.19 rows per leaf block. Should this index be rebuilt? I suppose that depends on how the index will be used in the future, maybe a coalesce is a better choice, maybe doing nothing is a better choice.
The number of comments seems to have slowed for this blog article, so I thought that I would post my book review notes that are related to the above quotes. My book review notes will likely miss some of the points made in the comments above, however, this page will be referenced at the bottom of the completed book review.
#1: The book casually demonstrates setting up a 16KB block size tablespace in a database that has a default 8KB block size. The book provides several advantages for including smaller or larger than default block sizes in a single database including, “Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.” This justification is incorrect for several reasons including the fact that the DB_FILE_MULTIBLOCK_READ_COUNT parameter is scaled accordingly for tablespaces that use a smaller than database default block size, and scales the parameter down for tablespaces that use a larger than database default block size. All of the justifications found on page 88 (temporarily hidden, but the justifications remind me of a very long OTN thread from May/June 2008 that was on a related topic). The book does not discuss the bugs and unexpected optimizer cost changes that might result from using multiple block sizes in a single database
#2: The book states, “We can also create a function-based descending index.” This is a strange statement – all descending indexes in Oracle Database are function-based indexes.
#3: The book states, “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.” In this case, the author is incorrectly attempting to generalize a special case into a general rule. Firstly, there is no myth to dispel – Oracle’s query optimizer has had the ability to use INDEX SKIP SCAN operations when the leading column of an index is not specified in the WHERE clause, since the release of Oracle Database 9.0.1 a decade ago – but that access path is usually only advisable when there are few distinct values in the leading column of the index. The author’s test case is a special case because all of the columns selected from the table are present in the index structure.
#4: The book states, “If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this.” The author then shows a SQL statement with the first column in the ORDER BY clause sorted in descending order and the second column in the ORDER BY clause sorted in ascending order. At this point in the book, the author has not yet stated that Oracle Database is able to read index entries in an ascending or descending order through a normal (ascending sorted) b*tree index, so this sentence in the book is confusing – almost to say that Oracle Database is not able to sort one column in ascending sequence and a second column in descending sequence – that concept is obviously false. It would have been more accurate for the book to state that, “Oracle is unable to _avoid_ a sort operation when accessing the rows through a concatenated index if both of the columns in the index are sorted in ascending sequence, the ORDER BY clause of the SQL statement specifies that one and only one column contained in the index should be ordered in descending sequence, and the second column in the concatenated index is included in the WHERE clause.”
#5: • The book states, “However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.” The book should have defined what is meant by “skewed index structure” – does the book mean, for instance, that one portion of the index could have a BLEVEL of 2 while another portion of the index could have a BLEVEL of 3 – if that is the case, the book’s statement is incorrect. If the book’s definition of “skewed index structure” is that some leaf blocks of the index will be more densely packed than other leaf blocks in the same index structure, then that should be considered normal behavior for Oracle indexes – an occasional coalesce might be used to combine index entries in logically adjacent leaf blocks, but scheduling index rebuilds is neither required nor recommended. Depending on the order of the inserted values in relation to the order of the entries in the index leaf blocks, an index leaf block split operation could evenly divide the existing index entries between two leaf blocks (a 50-50 split, resulting in both index blocks being 50% utilized, if the inserted value is not the highest value that would be inserted into the leaf block), or all of the existing entries will remain in the existing leaf block and the new entry will be placed by itself into a new leaf block (a 90-10 split). A deleted index entry will remain in the block at least until that transaction is committed, but any post-transaction insert into the block will clear out all deleted index entries in the block. Deleting all table rows with index entries at the low end of the index (the values were populated by a sequence, for example, and are deleted in the same sequential order) could leave many blocks in the index structure with nothing but deleted index entries, but that situation should only result in a performance problem if SQL statements attempt to determine the minimum value for the indexed column, or to some extent, fast full index scans and full index scans (reference http://richardfoote.wordpress.com/2011/05/22/del_lf_rows-index-rebuild-criteria-codex/ https://hoopercharles.wordpress.com/2009/12/06/determining-why-a-query-using-mincolumn-in-the-where-clause-on-an-indexed-column-takes-a-long-time/ ). See the test case above.
#6.5 The book states, “Bitmap indexes offer very fast performance when we have a low cardinality field indexed on a table containing many rows.” This statement could have several different interpretations, but I believe that the author’s intended meaning is “Bitmap indexes offer significantly faster performance than b*tree indexes when columns with few distinct values are indexed in tables containing a significant number of rows.” This fixed statement still requires additional clarification – if the bitmap index does not help to further reduce the number of table rows that are accessed through the index, the end result may be performance that is roughly the same as that of b*tree indexes. One way to accomplish the task of further reducing the number of table rows accessed is through the utilization of multiple bitmap indexes with bitmap combine operations to significantly reduce the number of rowids that are used to fetch table rows (page 139).
#7: The book states, “When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.” This statement requires a bit of clarification. I do not believe that the author is stating that updating an entry in a bitmap index will lock all of the bitmap indexes in the database (a segment could be a table, table partition, index, etc.). Instead, I think that the author is intending to state that updating an entry in a bitmap index will lock all of the index entries in that index, effectively preventing any other session from inserting, updating (the column covered by the index), or deleting rows in the table. For very small bitmap indexes, this statement could very well be true. However, for larger bitmap indexes, built for tables with many rows, the number of index rows that will be locked during an update is determined by the number of rows covered by the index block(s) that update changed, possibly 20,000 to 50,000 rows per index block. (page 139 reference: http://www.juliandyke.com/Presentations/BitmapIndexInternals.ppt slide 46, http://www.jlcomp.demon.co.uk/03_bitmap_1.doc page 2, http://laurentschneider.com/wordpress/2007/03/why-is-bitmap-index-not-designed-for-oltp.html comments section)
Charles,
I’ve avoided commenting on the quotes because I had assumed you would get some good follow-up. Personally I thought that the quotes showed the type of woolly thinking that is so often displayed in books about Oracle – vague, ambiguous, generalizations based, to some extent, on a kernel of truth that needed a much clearer description.
One little detail, though: the comment about being able to to create descending function-based indexes may be worded badly, but it is describing a significant point. The following statements do have different effects:
Jonathan,
Thanks for the comments (and for delaying your response).
Regarding your one little point, that is a very good point. I spent little time writing up the review note on that particular item. I quickly came to the conclusion that there was better than 90% chance that the book author did not realized that all descending indexes are function based indexes and moved on to the next paragraph in the book. The review is currently 9 typewritten pages in length for the first 150 pages (lots of screen captures and whitespace between paragraphs, so that would only total about 70 pages in another book without all of the unnecessary screen captures). I am still waiting for my thank you note from the book publisher for the first 21 errata items that I submitted.
I will say that this book author took the time to set up test case scripts, and attempted to show the output of those test case scripts – that is a very big step in the right direction when compared to some of the other books on the market that are targeted at Oracle Database. The author’s written interpretations of the test case outputs, however, ranged from over-generalizations from a single test result, to vague descriptions of what the results show, to just plain misreading of the test case output. There are also a couple of test case scripts that test something other than what the author intended to test (the index rebuild test comes to mind).
I used 11.2.0.2 to test the INDEX SKIP SCAN or INDEX FAST FULL SCAN issue.
Jimmy,
Nice test case example.
I displayed the calculated cost column in the execution plan also, using the following syntax:
I experimented a bit with the first half of your test case script. The full table scan had a calculated cost of 17. Forcing an index access path with an INDEX(T) hint resulted in an execution plan that included and INDEX FULL SCAN operation with a calculated cost of 28. Forcing an index skip scan with an INDEX_SS(T) hint resulted in a calculated cost of 10006 (10005 from the index skip scan operation).
For the second half of your test case script, I obtained a cost of 11 for the INDEX SKIP SCAN access path. When I forced a full table scan with a FULL(T) hint, the calcuated cost was 17.
At least in the first 150 pages of the book, there is no mention of the calculated cost of an access path having an influence on the access path that is selected.
Hi Charles,
Your point
“#3: The book states, “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.” In this case, the author is incorrectly attempting to generalize a special case into a general rule. Firstly, there is no myth to dispel – Oracle’s query optimizer has had the ability to use INDEX SKIP SCAN operations when the leading column of an index is not specified in the WHERE clause, since the release of Oracle Database 9.0.1 a decade ago – but that access path is usually only advisable when there are few distinct values in the leading column of the index. The author’s test case is a special case because all of the columns selected from the table are present in the index structure. ”
Please someone correct me if I am wrong as if there seems to be a bit of confusion about “index skip scan”. Am I correct to state that “index skip scan” is only applicable to cases where the index itself covers the query and the base table need not be touched? In that case I am not sure there is other case where index skip scan can be deployed. If the index cannot cover the query alone then there will not be much gained from “index skip scan”?
Some examples of index skip scan in web are if I use the phrase are naive so to speak. For example many think that Oracle will choose s index skip scan if the leading column is binary type (say male or female). That is not the case. The case is vaild as long as the cardinality of leading column is pretty small.
Cheers,
Mich
Mich,
Jimmyb’s test case above shows an INDEX SKIP SCAN operation that had as a parent operation TABLE ACCESS BY INDEX ROWID
I do not like the phrase “cardinality of leading column” – think that I recently saw a similar phase in the Oracle documentation. To some people (for better or worse, that includes me), that phrase may mean “If I select all of the rows from the table with a particular value specified for the leading column of the index, the optimizer’s predicted number of rows (cardinality) will only be a few rows” – that would imply that there are many different values in the leading column of the index. To other people, that phrase may mean that there are only a couple of unique values in the leading column of the index. I think that it is a bit more clear to state that index skip scans are potentially useful if the number of distinct values in the leading column(s) is low.
The optimizer’s costing of skip scans provides a clue how skip scans are processed. If you just execute the first half of jimmyb’s test case script, you will have a table T with 10,000 distinct values in the leading column of the T_IDX index. If you attempt to force an index skip scan operation in this case, you will probably find that the optimizer calculated a cost for the INDEX SKIP SCAN of approximately 10,005 – which suggests costing of 10,000 sub-index accesses (I have not worked out exactly why the cost is 5 more than 10,000, but the access to the index root block and branch block(s) need to be considered and also the CPU component of the cost):
From the documentation:
http://docs.oracle.com/cd/E11882_01/server.112/e16638/optimops.htm#autoId22
Coming in on the tail end of this:
The behaviour changed from 10gR2 (shown here: http://oratips-ddf.blogspot.com/2008/11/magical-indexes.html) as any query not using the leading column in 11gR2 uses in index skip scan for concatenated indexes should other indexed columns be specified in the where clause (as shown above).
I haven’t the time to pursue the other comments at this point; I agree with Charles’ comments and had the same thoujghts as I read through the list of quotes.