Interesting Index “Facts” – What is Wrong with these Quotes?

22 02 2012

February 22, 2012

There are some amazing (or just plain interesting) facts to be found about how indexes work in Oracle Database.  One Internet site in particular that has been helpful in my understanding of Oracle indexes is Richard Foote’s blog.  I occasionally see information on the Internet, in the Oracle documentation, in Metalink (My Oracle Support), and in books that conflict with one another – just how does one determine what information is correct, and what information is incorrect? 

If you paid money to learn something, should you just assume that what you are learning is correct?  What if test cases are included, should you just assume that the test case is showing what the document claims that the test case shows?  What if, you never ask “What if?” … what if, you do not know when to ask “What if?”

As I mentioned in a previous article, I am in the process of reading and reviewing the book “Oracle Database 11gR2 Performance Tuning Cookbook“, and I seem to be spending quite a bit of time asking What if … or why does this statement not agree with what I believe to be true?  For this blog article, I have pulled seven quotes from the book that will be mentioned in my review of the book – I thought that I would give the readers of this blog an opportunity to ask, “What if”.  As always, I attempted to keep the length of the quotes to a minimum – just long enough so that the context (and meaning) of the quote is not lost.

#1 Page 88:

Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.”

#2 Page 119:

“We create the IX3_CUSTOMERS index to show how we can create a descending index, that is, an index which stores the data for one or more fields in descending order…

We can also create a function-based descending index.”

#3 Page 119:

“… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query.  We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.  Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.

#4 Page 120:

“The last type [descending indexes] is useful when we want to use queries in which we want the data sorted in a mixed ascending/descending way on the fields indexed. If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this…  ORDER BY FIELD1 DESC, FIELD2

#5 Page 127:

“However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.” 

#6 Page 127:

“If the value for DEL_LF_ROWS/LF_ROWS is greater than 2, or LF_ROWS is lower than LF_BLKS, or HEIGHT is 4 then the index should be rebuilt.”

#7 Page 139:

“When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.”

What, if anything, is wrong (and/or right) with the above quotes from the book?

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.


Actions

Information

22 responses

22 02 2012
jimmyb

#3 Page 119 – is not accurate in my experience. When you have a composite index and do not include the leading column of that index; Oracle will either A) do a full table scan or, B) use an Index Skip Scan.

I have never seen Oracle do an Index Fast Full Scan when the leading column of an index is missing.

22 02 2012
Charles Hooper

T. J. Kiernan followed up on this same point. I *think* that it could be strongly argued that an INDEX SKIP SCAN would happen much more frequently in this case than an INDEX FAST FULL SCAN. How about someone create a test case to demonstrate what it takes to trigger an INDEX SKIP SCAN and what it takes for an INDEX FAST FULL SCAN to appear in the same situation.

If you wish to post test case code here, please use a <pre> tag before the section and a </pre> tag after the section:
<pre>

SELECT
  1
FROM
  DUAL;

</pre>

22 02 2012
Houri Mohamed

“When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.”

When rows are frequently inserted, deleted, and updated (in multi-user-concurrent application) bitmap indexes should not be used at all.

“We can also create a function-based descending index”
If an index is created in a descending order it will be automatically of a function based index type

“However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.”

The space will be used by the subsequent insert (excluding direct path insert wich inserts above the HWM). Skewed indexes depends on the patern of the delete. If we delete an index on its left side which means it will be growing only from its right side (sequence based indexes and timestamp based indexes), index could be smached or be what tom kyte call “sweeper indexes”. And even in it that case, I think that caolescing that index is what could be advised and not rebuilding the index. There are cases where index could benefit from rebuild (for that see jonathan lewis script) but in most cases, we don’t have to rebuild indexes

22 02 2012
Charles Hooper

Mohamed,

Your answer for quote #2 exactly matches the comment that I wrote in my review.

—————

We are off to a very good start regarding the quotes.

I will attempt to withhold my responses in this article for a little while to give everyone some time to think about the quotes, experiment with some test cases, and share thoughts. There is something very wrong with the first quote, but I will not mention what is very wrong until I publish the full review of the book.

22 02 2012
T. J. Kiernan

#1 – “Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.”
There are a number of considerations here (probably more than what I’ll mention)
AFAIK, using a larger block size can prevent row chaining, which can, in fact improve access times. Knowing how serious a row chaining problem is or will be before you head down that path is important. Also, are you creating the entire database with a larger block size, or creating a single tablespace with a larger block size? If it’s a single tablespace, then you’re looking at carving out SGA space for buffers of the different size, and these buffers won’t resize dynamically.

#3 – “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN. Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.”

It’s not quite so absolute. It could also be an INDEX SKIP SCAN, and perhaps others that aren’t coming to mind. To jimmyb’s point, I cant say I’ve seen an INDEX FAST FULL SCAN either.

#4 – “The last type [descending indexes] is useful when we want to use queries in which we want the data sorted in a mixed ascending/descending way on the fields indexed. If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this… ORDER BY FIELD1 DESC, FIELD2“

This feels wrong and I’d want to test it out. My guess is that the author is trying to save a sort operation or two, which may be possible if the index is ordered the same as the ORDER BY statement, but Oracle CAN sort some columns ascending and others descending. I’ve seen it.

#6 – “If the value for DEL_LF_ROWS/LF_ROWS is greater than 2, or LF_ROWS is lower than LF_BLKS, or HEIGHT is 4 then the index should be rebuilt.”

The times when an index SHOULD be rebuilt are so few and far between, that it seems unnecessary to even discuss it anymore. If the index is not monotonously increasing (sequence or date) and old data is not deleted from the table, I can’t see any reason for rebuilding an index (save for moving it from tablespace to tablespace).

22 02 2012
Radoslav Golian

Just few points

1) We have to read all data when doing the full scan.. I think that block size doesn’t matter here, or the difference is not significant.
what matters is value of db_filemultiblock_read_count (how many block are read by a multiple block read)

There are some disadvantages of using multiple block sizes. T. J. Kiernan mentioned some of them..
I think another disadvantage is that the single block read has to read more data. This could be a problem when using indexes (range scan, table access by rowid are single block operations)..

2&4) function index can be descending index, in fact every descending index is a FBI (see SYS_OP_DESCEND).

“Oracle is unable to do the sort in a mixed way, in a query like this… ORDER BY FIELD1 DESC, FIELD2”
Oracle can omit sorting when there is a “good” index. It don’t need to sort. And of course, it is able to sort in mixed way (without using index) :)..

e.g no sorting took place here:

drop table t purge;
 
Table dropped
create table t (a number not null, b number not null, c char(100));
 
Table created
create index i_desc on t(a+b desc, b-a);
 
Index created
explain plan for select a+b from t where a+b = 2 order by a+b desc, b-a asc;
 
Explained
select * from table(dbms_xplan.display());
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1719408291
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |    39 |     0   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| I_DESC |     1 |    39 |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access(SYS_OP_DESCEND("A"+"B")=HEXTORAW('3EFCFF') )
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("A"+"B"))=2)
Note
-----
   - dynamic sampling used for this statement (level=2)
 
18 rows selected

from predicate section we can see that values are NOT sorted descending in the index.. there is a function SYS_OP_DESCEND in the section

SYS_OP_DESCEND is an internal function that takes a value and returns the form that would be stored for that value in a descending index. Essentially doing a one’s complement on the bytes and appending an 0xFF byte

3) I don’t see any problem here. index is sometimes treated like a slimmer version of corresponding table

drop table t purge;
 
Table dropped
create table t (a number not null, b number not null, c char(100));
 
Table created
create index i_a_b on t(a,b);
 
Index created
insert into t select level, level, 'x' from dual connect by level <= 10000;
 
10000 rows inserted
commit;
 
Commit complete
explain plan for select b from t where b > 0;
 
Explained
select * from table(dbms_xplan.display());
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2748957719
------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |  9276 |   117K|     9   (0)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| I_A_B |  9276 |   117K|     9   (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("B">0)
Note
-----
   - dynamic sampling used for this statement (level=2)
 
17 rows selected
rollback;
 
Rollback complete

5) is vague and not true in general..

Index structure cannot be skewed,. B*Tree is always a balanced tree – all leaf blocks have the same level, but distribution of not-deleted data across the leaf blocks can be skewed .. some leaf block could contain only few non-deleted items while the others could be fully filled with non-deleted items.

What could be a problem is an INSERT and a DELETE in the same transaction.

When we are doing INSERT + commit and DELETE + commit Oracle can use the space marked as deleted, but it cannot reuse it in the same transaction.

Index rebuild is another big topic… I’m not going to discuss it. Richard has written lot of posts about index rebuild’s myths.

6) after reading this I will throw the book to the waste bin.

” or HEIGHT is 4 then the index should be rebuilt”
what if we have so much data, that the index has to have 4 height???? Should we rebuild it every day??? 🙂

7) not true..
firstly, what is a bitmap segment ??

secondly,
bitmap index is a B*Tree index, but the row structure is like this:

value1, low-rowid, hi-rowid, bitmap of occurrence of value1 between low-rowid and hi-rowid
value1, low-rowid, hi-rowid, bitmap of occurrence of value1 between low-rowid and hi-rowid
value2, low-rowid, hi-rowid, bitmap of occurrence of value2 between low-rowid and hi-rowid
….

so value1 could have more entries, not only one.
1) insert of another value1, we have to modify only one bitmap, so all rows mapped by modified bitmap are locked.. rows mapped by the second bitmap are not locked
2) delete – same as the insert, we have to modify only one bitmap, so all rows mapped by this modified bitmap are locked,
3) update from value1 to value2, we have to modify 2 bitmaps, one bitmap for value1 and one bitmap for value2, so all rows mapped by these 2 bitmaps are locked.

22 02 2012
Radoslav Golian

3) when referencing leading column operation could be:
RANGE SCAN, UNIQUE SCAN, INDEX FAST FULL SCAN, INDEX FULL SCAN (MIN/MAX)
when not referencing leading column operation could be
INDEX FAST FULL SCAN, INDEX FULL SCAN (MIN/MAX), SKIP SCAN

23 02 2012
talebzadeh

Hmm.

#3 Page 119:

“… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN. Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.

Sure Now I remember this test back in 2006 on Oracle 10g bases on some stuff from Jonathan’s CBO book. I dug my email and it was as follows as I described in my email

“Oracle has a new feature called Skip Scan Index Access that allows the Oracle optimizer to use a composite index even when the index prefix column in a SQL statement has been omitted. This basically means that you will not have to create an additional index to provide faster access to your data.

However, this is only useful if the cardinality of the prefix column is fairly low.

Briefly Oracle uses the skip scan algorithm to determine the domain of distinct values for the index prefix column and then iterate through each distinct value in this domain. For each distinct value, Oracle will perform a regular index scan on that portion of the index. To put it another way, Oracle will treat the composite index search as a number of small sub-indexes to retrieve the ROWIDs – there will be one sub-index search performed for each distinct value within the index prefix column. The new skip scan index access feature will function as if you issued a sequence of SQL statements having each statement specifying a single value for the index prefix column. In general a skip scan index is faster than a table scan.

Ok a table T5

CREATE TABLE T5
                (
                        n1 NUMBER(5)    NOT NULL,
                        ind_pad VARCHAR2(40) NOT NULL,
                        n2 NUMBER(5)    NOT NULL,
                        small_vc VARCHAR2(10) NOT NULL,
                        padding  VARCHAR2(200) NOT NULL
                 );

create index t5_i1 on t5(n1, ind_pad, n2)
nologging
pctfree 91;

and the proc to populate it

CREATE OR REPLACE PROCEDURE populate_T5_sp
(
        i_rows  IN              NUMBER
)
AS
  v_rowcount                    NUMBER(12) := 0;
  v_T5Id                        NUMBER(37) := NULL;
  e_rows                        EXCEPTION;
  e_insert_T5                   EXCEPTION;
BEGIN
  BEGIN                 -- Input parameter validation
    -- Check for inputs
    IF i_rows <= 0
    THEN
      RAISE e_rows;
    END IF;
  EXCEPTION
    WHEN e_rows THEN
      DBMS_OUTPUT.PUT_LINE('Cannot have null value for no of records to generate!');
  END;
  BEGIN
    dbms_random.seed(0);
    SELECT NVL(MAX(small_vc),0)+1 INTO v_T5Id from T5;
    for v_records IN 1..i_rows LOOP
      INSERT INTO T5
      (
        n1,
        ind_pad,
        n2,
        small_vc,
        padding
      )
      VALUES
      (
        trunc(dbms_random.value(0,25)),
        rpad('x',39)||'x',
        trunc(dbms_random.value(0,20)),
        lpad(v_T5Id,10,'0'),
        rpad('x',199)||'x'
      );
      IF SQL%ROWCOUNT = 0 OR SQLCODE <> 0
      THEN
        RAISE e_insert_T5;
      END IF;
      v_rowcount := v_rowcount + 1;
      v_T5Id := v_T5Id + 1;
      --dbms_output.put_line(chr(10)||'v_T5Id = '|| v_T5Id);
      IF MOD(v_rowcount, 10000) = 0 THEN
        --dbms_output.put_line('v_rowcount = '|| v_rowcount ||' ,committing!');
        COMMIT;
      END IF;
    END LOOP;
    COMMIT;
  EXCEPTION
    WHEN e_insert_T5 THEN
      DBMS_OUTPUT.PUT_LINE('Could not insert into T5');
    WHEN OTHERS THEN
      RAISE_APPLICATION_ERROR(SQLCODE,SQLERRM,true);
  END;
END populate_T5_sp;
/

Now populate it with 10,000 rows

 exec populate_t5_sp(10000)

PL/SQL procedure successfully completed.

 EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname=>user,tabname=>'T5', cascade=>true);

PL/SQL procedure successfully completed.

Then back in 2006 I got this from trace

select  small_vc from T5 where n2 = 2;

and the trace output on 10g was

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3265611721
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   470 |  6580 |   537   (1)| 00:00:07 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |   470 |  6580 |   537   (1)| 00:00:07 |
|*  2 |   INDEX SKIP SCAN           | T1_I1 |   470 |       |    77   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=2)
       filter("N2"=2)

Which did an index skip scan.

Now when I tyry the same code on 11g R2 I get (please note I call table T5 now)

Execution Plan
----------------------------------------------------------
Plan hash value: 2002323537

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500 |  7000 |   136   (0)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T5   |   500 |  7000 |   136   (0)| 00:00:02 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("N2"=2)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        426  consistent gets
          0  physical reads
          0  redo size
      12034  bytes sent via SQL*Net to client
        864  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        470  rows processed

Now it does a FTS and finishes quicker. What has changed!

Cheers,

Mich

(Edit: Fixed a couple of pre tags, a not equal evaluation where the less than/greater than combination was lost, and changed the select from T1 to select from T5 so that it was a little easier to execute the test case. CH Feb 23, 2012)

23 02 2012
Charles Hooper

A lot of great answers in this article. I will wait a little longer before providing too many comments. Thanks everyone for keeping the comments positive and for helping others learn how Oracle behaves.

Mich,
Jonathan has a clear method approach of explaining index skip scans – thanks for including his statement here.

I wonder if there is a change between 10.2 and 11.2 in terms of costing index skip scans, or if maybe the optimizer parameters/system statistics were a little different in the two Oracle Database versions? I might experiment a bit a little later with the test case script that you provided.

With SET AUTOTRACE ON, the execution plan and statistics (11.2.0.2 mostly default initialization parameters in an ASSM 8KB tablespace):

select  small_vc from T5 where n2 = 2;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2002323537
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   470 |  6580 |    40   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T5   |   470 |  6580 |    40   (0)| 00:00:01 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=2)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        426  consistent gets
          0  physical reads
          0  redo size
       8788  bytes sent via SQL*Net to client
        701  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        470  rows processed

So, the default execution plan is a full table scan with a calculated cost of 40, with 426 consistent gets.

Hinting an index access path to check the calculated cost:

select /*+ INDEX(T5) */  small_vc from T5 where n2 = 2;
  
Execution Plan
----------------------------------------------------------
Plan hash value: 3362028299
 
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   470 |  6580 |   485   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5    |   470 |  6580 |   485   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | T5_I1 |   470 |       |    26   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=2)
       filter("N2"=2)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        560  consistent gets
          0  physical reads
          0  redo size
       8788  bytes sent via SQL*Net to client
        701  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        470  rows processed

The calculated cost is 485 with 560 consistent gets, so the table scan has a calculated cost that is about 8.25% of the cost of the index access path.

Let’s adjust the index access cost so that it is 5% of the original value:

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;
 
select  small_vc from T5 where n2 = 2;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3362028299
 
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   470 |  6580 |    24   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5    |   470 |  6580 |    24   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | T5_I1 |   470 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=2)
       filter("N2"=2)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        560  consistent gets
          0  physical reads
          0  redo size
       8788  bytes sent via SQL*Net to client
        701  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        470  rows processed

We now have an index skip scan without having to hint an index access path.

I realize that the above experiment to obtain an index skip scan is probably obvious to all of the people who have commented in this article. Along the theme of this article, “What If” we had not tried the experiment, we would be perpetuating a “myth” that was actually true for Oracle Database 8.1, but no longer was true when Oracle Database 9.0.1 was released?

23 02 2012
Charles Hooper

Mitch,

This is my result from 10.2.0.5:

Execution Plan
----------------------------------------------------------
Plan hash value: 2002323537
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500 |  7000 |    40   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T5   |   500 |  7000 |    40   (0)| 00:00:01 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=2)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        425  consistent gets
          0  physical reads
          0  redo size
      12023  bytes sent via SQL*Net to client
        813  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        470  rows processed

When forcing an index access path on 10.2.0.5:

Execution Plan
----------------------------------------------------------
Plan hash value: 3362028299
 
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   500 |  7000 |   514   (0)| 00:00:05 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5    |   500 |  7000 |   514   (0)| 00:00:05 |
|*  2 |   INDEX SKIP SCAN           | T5_I1 |   500 |       |    26   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N2"=2)
       filter("N2"=2)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        569  consistent gets
          0  physical reads
          0  redo size
      12023  bytes sent via SQL*Net to client
        813  bytes received via SQL*Net from client
         33  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        470  rows processed

I am wondering if your result from the older database version was possibly impacted by non-default initialization parameters or system statistics that were not set to appropriate values?

24 02 2012
Mich talebzadeh

Hi Charles,

I wish I could dig that old stuff from 10g but that was 6 years ago.

However, I was doing this test on SSD response recently and I recall the following tests

First increase the number of rows in t5 to 12 million

exec populate_t5_sp(12000000)

PL/SQL procedure successfully completed.

and then I ran the follwing SQL

  1  SELECT count(1)
  2  FROM T5
  3  where   n1 <= 8
  4  and     ind_pad <= rpad('x',39)||'x'
  5* and     n2       /


Execution Plan
----------------------------------------------------------
Plan hash value: 2226148900

--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |     1 |    47 | 35558   (1)| 00:07:07 |
|   1 |  SORT AGGREGATE  |       |     1 |    47 |            |          |
|*  2 |   INDEX SKIP SCAN| T5_I1 |  3064K|   137M| 35558   (1)| 00:07:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N1"<=8 AND "IND_PAD"<='x
                   x' AND "N2"<15)
       filter("N2"<15 AND "IND_PAD"<='x
                   x')


Statistics
----------------------------------------------------------
        384  recursive calls
          0  db block gets
      38124  consistent gets
          0  physical reads
          0  redo size
        529  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          9  sorts (memory)
          0  sorts (disk)
          1  rows processed

I believe the reason the optimizer is using the index is due to 1) access via the leading column N1 and 2) filtering via N2. When I discussed the result of this test (dealing with SSD), with Tom Kyte he actually remarked calling it "a funky query plan ….. – Why introduce a relatively rare index skip scan into the mix?"

The important thing is the cardnilaity of leading column. So if we have a leading column with few distinct values (in this case n1 has 25 values), then the index may be used pretty efficiently by probing it 25 times. On the other hand, if the leading column has relatively high cardinality then an Index Skip Scan is not a viable option.

As a sideline Sybase calls it "intelligent index scan" with the trace output referring to it as "Intelligent Scan selectivity reduction".

Poor DBAs have to remember all these 🙂

Cheers,

Mich

23 02 2012
Mladen Gogala

I was busy, so everybody else has already commented on the good stuff. However, I would still like to comment on #6. The recommendation to rebuild index if the level is >=4 is ridiculous. I have a table in my DW database with 1.6G rows. Index level is 6. Do you really think that the index level will drop if I rebuild it? Big tables usually require big indexes. The index should only be rebuilt if there are anomalies in relation to it, like long waits for single block scans on it or suddenly increased duration of the range scans. B*Tree index is a mathematical structure and advising rebuild, based on the sheer height of the index is highly inaccurate.

23 02 2012
Charles Hooper

Mladen,

Good points.

The book states, “HEIGHT is 4”, which seems to decrease the number of indexes that would need to be scheduled for rebuilding, and might suggest that your index with a blevel of 6 is OK. What about the other two specifications:
* DEL_LF_ROWS/LF_ROWS is greater than 2
* LF_ROWS is lower than LF_BLKS

Those specifications might be suspect considering what Radoslav stated about how indexes behave during deletes and inserts, if those operations happen in different transactions.

I am currently working on the review of page 127. This morning I submitted 21 errata reports to the publisher’s website, covering the problems that I found in the first 88 pages of the book, and I have 17 additional errata reports ready to be submitted for pages 89 through 127. I will share my review notes for the above quotes once I make it through page 137 (so far the commenters in this article have done a better job than I will be able to do in describing the problems with the quotes due to limited space in the review).

24 02 2012
Charles Hooper

Let’s take a closer look at the rebuild criteria offered in the book.

First, create a test table, populate that table, generate an entry in INDEX_STATS for the index, and display some of the statistics from INDEX_STATS:

DROP TABLE T1 PURGE;
 
CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  RPAD('A',200,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;
 
CREATE INDEX IND_T1_C1 ON T1(C1);
 
ANALYZE INDEX IND_T1_C1 VALIDATE STRUCTURE;
 
SELECT
  LF_ROWS,
  DEL_LF_ROWS,
  HEIGHT,
  LF_BLKS,
  ROUND(DEL_LF_ROWS/LF_ROWS*100,2) PERCENT
FROM
  INDEX_STATS;
 
LF_ROWS DEL_LF_ROWS     HEIGHT    LF_BLKS    PERCENT
------- ----------- ---------- ---------- ----------
  10000           0          2         21          0

There are currently no deleted index entries, and because the index entries were added in sequential order, 90-10 leaf block splits were used. Currently, there are 10,000 rows in the index leafs, 0 rows are deleted, the the height is 2, there are 21 leaf blocks, and 0% of the rows in the index structure are marked as deleted. Based on what the book states, we do not need to rebuild this index.

Continuing the test, we will delete all rows from the index:

DELETE FROM T1;
 
ANALYZE INDEX IND_T1_C1 VALIDATE STRUCTURE;
 
SELECT
  LF_ROWS,
  DEL_LF_ROWS,
  HEIGHT,
  LF_BLKS,
  ROUND(DEL_LF_ROWS/LF_ROWS*100,2) PERCENT
FROM
  INDEX_STATS;
 
LF_ROWS DEL_LF_ROWS     HEIGHT    LF_BLKS    PERCENT
------- ----------- ---------- ---------- ----------
  10000       10000          2         21        100

There are currently 10,000 rows in the index leafs, 10,000 of those rows are deleted, the height is 2, there are 21 leaf blocks, and 100% of the rows in the index structure are marked as deleted. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 1, which is not greater than 2 (this cannot ever be the case), LF_ROWS is not lower than LF_BLKS (this can never be the case, the block will be detached from the index structure when it is entirely empty), and HEIGHT is not 4 (why 4, does that mean that 2, 3, 5,7, etc. are OK?).

Continuing the test, we will insert a single row into the table, and repopulate INDEX_STATS for the index:

INSERT INTO T1 VALUES (1,'A');
 
COMMIT;
 
ANALYZE INDEX IND_T1_C1 VALIDATE STRUCTURE;
 
SELECT
  LF_ROWS,
  DEL_LF_ROWS,
  HEIGHT,
  LF_BLKS,
  ROUND(DEL_LF_ROWS/LF_ROWS*100,2) PERCENT
FROM
  INDEX_STATS;
 
LF_ROWS DEL_LF_ROWS     HEIGHT    LF_BLKS    PERCENT
------- ----------- ---------- ---------- ----------
   9516        9515          2         21      99.99

As shown above, inserting a single row into the table cleared out 485 deleted rows (plus 1 for the inserted row) from the index leafs. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 0.9999, which is not greater than 2.

Continuing the test, inserting a row with a value that is 1 greater than the previously deleted maximum value:

INSERT INTO T1 VALUES (10001,'A');
 
COMMIT;
 
ANALYZE INDEX IND_T1_C1 VALIDATE STRUCTURE;
 
SELECT
  LF_ROWS,
  DEL_LF_ROWS,
  HEIGHT,
  LF_BLKS,
  ROUND(DEL_LF_ROWS/LF_ROWS*100,2) PERCENT
FROM
  INDEX_STATS;
 
LF_ROWS DEL_LF_ROWS     HEIGHT    LF_BLKS    PERCENT
------- ----------- ---------- ---------- ----------
   9103        9101          2         21      99.98

As shown above, inserting a single row into the table cleared out an additional 414 deleted rows (plus 1 for the inserted row) from the index leafs. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 0.9998, which is not greater than 2.

Continuing the test, inserting 100 rows into the table, with the values spaced 100 apart (100, 200, 300, etc.):

INSERT INTO
  T1 
SELECT
  ROWNUM*100 C1,
  RPAD('A',200,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=100;
 
COMMIT;
 
ANALYZE INDEX IND_T1_C1 VALIDATE STRUCTURE;
 
SELECT
  LF_ROWS,
  DEL_LF_ROWS,
  HEIGHT,
  LF_BLKS,
  ROUND(DEL_LF_ROWS/LF_ROWS*100,2) PERCENT
FROM
  INDEX_STATS;
 
LF_ROWS DEL_LF_ROWS     HEIGHT    LF_BLKS    PERCENT
------- ----------- ---------- ---------- ----------
    102           0          2         21          0

As shown above, inserting 100 rows into the table cleared out 9101 additional deleted rows (plus 100 for the inserted rowed) from the index leafs. Based on what the book states, we do not need to rebuild this index. DEL_LF_ROWS/LF_ROWS = 0.0000, which is not greater than 2.

This index is now sparsely populated with on average less than 5 rows per leaf block, when it started out with an average of 476.19 rows per leaf block. Should this index be rebuilt? I suppose that depends on how the index will be used in the future, maybe a coalesce is a better choice, maybe doing nothing is a better choice.

25 02 2012
Charles Hooper

The number of comments seems to have slowed for this blog article, so I thought that I would post my book review notes that are related to the above quotes. My book review notes will likely miss some of the points made in the comments above, however, this page will be referenced at the bottom of the completed book review.

#1: The book casually demonstrates setting up a 16KB block size tablespace in a database that has a default 8KB block size. The book provides several advantages for including smaller or larger than default block sizes in a single database including, “Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.” This justification is incorrect for several reasons including the fact that the DB_FILE_MULTIBLOCK_READ_COUNT parameter is scaled accordingly for tablespaces that use a smaller than database default block size, and scales the parameter down for tablespaces that use a larger than database default block size. All of the justifications found on page 88 (temporarily hidden, but the justifications remind me of a very long OTN thread from May/June 2008 that was on a related topic). The book does not discuss the bugs and unexpected optimizer cost changes that might result from using multiple block sizes in a single database

#2: The book states, “We can also create a function-based descending index.” This is a strange statement – all descending indexes in Oracle Database are function-based indexes.

#3: The book states, “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.” In this case, the author is incorrectly attempting to generalize a special case into a general rule. Firstly, there is no myth to dispel – Oracle’s query optimizer has had the ability to use INDEX SKIP SCAN operations when the leading column of an index is not specified in the WHERE clause, since the release of Oracle Database 9.0.1 a decade ago – but that access path is usually only advisable when there are few distinct values in the leading column of the index. The author’s test case is a special case because all of the columns selected from the table are present in the index structure.

#4: The book states, “If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this.” The author then shows a SQL statement with the first column in the ORDER BY clause sorted in descending order and the second column in the ORDER BY clause sorted in ascending order. At this point in the book, the author has not yet stated that Oracle Database is able to read index entries in an ascending or descending order through a normal (ascending sorted) b*tree index, so this sentence in the book is confusing – almost to say that Oracle Database is not able to sort one column in ascending sequence and a second column in descending sequence – that concept is obviously false. It would have been more accurate for the book to state that, “Oracle is unable to _avoid_ a sort operation when accessing the rows through a concatenated index if both of the columns in the index are sorted in ascending sequence, the ORDER BY clause of the SQL statement specifies that one and only one column contained in the index should be ordered in descending sequence, and the second column in the concatenated index is included in the WHERE clause.”

#5: • The book states, “However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.” The book should have defined what is meant by “skewed index structure” – does the book mean, for instance, that one portion of the index could have a BLEVEL of 2 while another portion of the index could have a BLEVEL of 3 – if that is the case, the book’s statement is incorrect. If the book’s definition of “skewed index structure” is that some leaf blocks of the index will be more densely packed than other leaf blocks in the same index structure, then that should be considered normal behavior for Oracle indexes – an occasional coalesce might be used to combine index entries in logically adjacent leaf blocks, but scheduling index rebuilds is neither required nor recommended. Depending on the order of the inserted values in relation to the order of the entries in the index leaf blocks, an index leaf block split operation could evenly divide the existing index entries between two leaf blocks (a 50-50 split, resulting in both index blocks being 50% utilized, if the inserted value is not the highest value that would be inserted into the leaf block), or all of the existing entries will remain in the existing leaf block and the new entry will be placed by itself into a new leaf block (a 90-10 split). A deleted index entry will remain in the block at least until that transaction is committed, but any post-transaction insert into the block will clear out all deleted index entries in the block. Deleting all table rows with index entries at the low end of the index (the values were populated by a sequence, for example, and are deleted in the same sequential order) could leave many blocks in the index structure with nothing but deleted index entries, but that situation should only result in a performance problem if SQL statements attempt to determine the minimum value for the indexed column, or to some extent, fast full index scans and full index scans (reference http://richardfoote.wordpress.com/2011/05/22/del_lf_rows-index-rebuild-criteria-codex/ https://hoopercharles.wordpress.com/2009/12/06/determining-why-a-query-using-mincolumn-in-the-where-clause-on-an-indexed-column-takes-a-long-time/ ). See the test case above.

#6.5 The book states, “Bitmap indexes offer very fast performance when we have a low cardinality field indexed on a table containing many rows.” This statement could have several different interpretations, but I believe that the author’s intended meaning is “Bitmap indexes offer significantly faster performance than b*tree indexes when columns with few distinct values are indexed in tables containing a significant number of rows.” This fixed statement still requires additional clarification – if the bitmap index does not help to further reduce the number of table rows that are accessed through the index, the end result may be performance that is roughly the same as that of b*tree indexes. One way to accomplish the task of further reducing the number of table rows accessed is through the utilization of multiple bitmap indexes with bitmap combine operations to significantly reduce the number of rowids that are used to fetch table rows (page 139).

#7: The book states, “When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.” This statement requires a bit of clarification. I do not believe that the author is stating that updating an entry in a bitmap index will lock all of the bitmap indexes in the database (a segment could be a table, table partition, index, etc.). Instead, I think that the author is intending to state that updating an entry in a bitmap index will lock all of the index entries in that index, effectively preventing any other session from inserting, updating (the column covered by the index), or deleting rows in the table. For very small bitmap indexes, this statement could very well be true. However, for larger bitmap indexes, built for tables with many rows, the number of index rows that will be locked during an update is determined by the number of rows covered by the index block(s) that update changed, possibly 20,000 to 50,000 rows per index block. (page 139 reference: http://www.juliandyke.com/Presentations/BitmapIndexInternals.ppt slide 46, http://www.jlcomp.demon.co.uk/03_bitmap_1.doc page 2, http://laurentschneider.com/wordpress/2007/03/why-is-bitmap-index-not-designed-for-oltp.html comments section)

25 02 2012
Jonathan Lewis

Charles,

I’ve avoided commenting on the quotes because I had assumed you would get some good follow-up. Personally I thought that the quotes showed the type of woolly thinking that is so often displayed in books about Oracle – vague, ambiguous, generalizations based, to some extent, on a kernel of truth that needed a much clearer description.

One little detail, though: the comment about being able to to create descending function-based indexes may be worded badly, but it is describing a significant point. The following statements do have different effects:

create index t1_f1 on t1(sqrt(n1));
create index t1_f1 on t1(sqrt(n1) desc));
25 02 2012
Charles Hooper

Jonathan,

Thanks for the comments (and for delaying your response).

Regarding your one little point, that is a very good point. I spent little time writing up the review note on that particular item. I quickly came to the conclusion that there was better than 90% chance that the book author did not realized that all descending indexes are function based indexes and moved on to the next paragraph in the book. The review is currently 9 typewritten pages in length for the first 150 pages (lots of screen captures and whitespace between paragraphs, so that would only total about 70 pages in another book without all of the unnecessary screen captures). I am still waiting for my thank you note from the book publisher for the first 21 errata items that I submitted.

I will say that this book author took the time to set up test case scripts, and attempted to show the output of those test case scripts – that is a very big step in the right direction when compared to some of the other books on the market that are targeted at Oracle Database. The author’s written interpretations of the test case outputs, however, ranged from over-generalizations from a single test result, to vague descriptions of what the results show, to just plain misreading of the test case output. There are also a couple of test case scripts that test something other than what the author intended to test (the index rebuild test comes to mind).

25 02 2012
jimmyb

I used 11.2.0.2 to test the INDEX SKIP SCAN or INDEX FAST FULL SCAN issue.

SQL> create table t
 ( id   number ,
   code number ,
   txt  varchar2(25)
 )
 / 

Table created.
 
 SQL> insert into t
 select   rownum
        , MOD(ROWNUM,100)
        , dbms_random.string('A',25)
 from
          dual
 connect by level < 10000;
 
commit;

Commit complete.

SQL> create index t_idx on t(id,code) ;

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

SQL> alter session set statistics_level=all;

Session altered.

SQL> set serveroutput off

SQL> select * from t where code = 1;

100 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  c6mtjhmq58ng4, child number 0
-------------------------------------
select * from t where code = 1

Plan hash value: 1601196873

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |    100 |00:00:00.01 |70 |
|*  1 |  TABLE ACCESS FULL| T    |      1 |    100 |    100 |00:00:00.01 |70 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CODE"=1)

-- This is what I expected - FTS. Now let's only reference the indexed columns in the select clause
SQL> select id, code from t where code = 1;

100 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  718d4y9b3fqtz, child number 1
-------------------------------------
update CRC$_RESULT_CACHE_STATS                   
    set NAME = :1, 
        VALUE = :2 
    where CACHE_ID = :3 
    and   STAT_ID = :4

Plan hash value: 1807565214

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                    | Starts | E-Rows | A-Rows|   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                         |     10 |        |      0|00:00:00.01 |      21 |
|   1 |  UPDATE            | CRC$_RESULT_CACHE_STATS |     10 |        |      0|00:00:00.01 |      21 |
|*  2 |   INDEX UNIQUE SCAN| CRC$_RCSTATSPK          |     10 |      1 |     10|00:00:00.01 |      10 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CACHE_ID"=:3 AND "STAT_ID"=:4)

-- Oracle updates the the CLIENT RESULT CACHE
-- If we don't modify the table data the Optimizer should be able to use this view

SQL> select id, code from t where code = 1;

100 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  dhv1zshgrubgv, child number 0
-------------------------------------
select id, code from t where code = 1

Plan hash value: 3163761342

----------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |      1 |        |    100 |00:00:00.01 |    38 |
|*  1 |  INDEX FAST FULL SCAN| T_IDX |      1 |    100 |    100 |00:00:00.01 |    38 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CODE"=1)

-- This makes sense. When all of the columns in a select cause can be satisfied Oracle is able to a INDEX FAST FULL SCAN

-- ############NOW LET's DROP and RECREATE THE TABLE SO the LEADING COLUMN HAS A LOWER CARDINALITY. AS SOMEONE PROVED EARLIER THIS SHOULD GET AN INDEX SKIP SCAN

SQL> drop table t purge;

Table dropped.
 
SQL> create table t
 ( id   number ,
   code number ,
   txt  varchar2(25)
 )
 / 

Table created.

SQL> insert into t
 select   TRUNC(DBMS_RANDOM.VALUE(0, 9))
        , ROWNUM
        , dbms_random.string('A',25)
 from
          dual
 connect by level < 10000;
 
commit;

Commit complete.

SQL> select distinct id from t;

        ID
----------
         1
         6
         2
         5
         4
         8
         3
         7
         0

9 rows selected.

-- ######SO THE LEADING COLUMN IN OUR COMPOSITE INDEX ONLY HAS 9 DISTINCT VALUES

SQL> create index t_idx on t(id,code);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

SQL> select * from t where code = 90;

        ID       CODE TXT
---------- ---------- -------------------------
         2         90 YMZUHsDykWENIyGekAtrQCqBZ

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Plan hash value: 2053318169

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        |      1 |00:00:00.01 |      14 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |      1 |      1 |      1 |00:00:00.01 |      14 |
|*  2 |   INDEX SKIP SCAN           | T_IDX |      1 |      1 |      1 |00:00:00.01 |      13 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CODE"=90)
       filter("CODE"=90)
25 02 2012
Charles Hooper

Jimmy,

Nice test case example.

I displayed the calculated cost column in the execution plan also, using the following syntax:

select * from table(dbms_xplan.display_cursor(null,null,'allstats last +cost'));

I experimented a bit with the first half of your test case script. The full table scan had a calculated cost of 17. Forcing an index access path with an INDEX(T) hint resulted in an execution plan that included and INDEX FULL SCAN operation with a calculated cost of 28. Forcing an index skip scan with an INDEX_SS(T) hint resulted in a calculated cost of 10006 (10005 from the index skip scan operation).

For the second half of your test case script, I obtained a cost of 11 for the INDEX SKIP SCAN access path. When I forced a full table scan with a FULL(T) hint, the calcuated cost was 17.

At least in the first 150 pages of the book, there is no mention of the calculated cost of an access path having an influence on the access path that is selected.

26 02 2012
Mich talebzadeh

Hi Charles,

Your point

“#3: The book states, “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.” In this case, the author is incorrectly attempting to generalize a special case into a general rule. Firstly, there is no myth to dispel – Oracle’s query optimizer has had the ability to use INDEX SKIP SCAN operations when the leading column of an index is not specified in the WHERE clause, since the release of Oracle Database 9.0.1 a decade ago – but that access path is usually only advisable when there are few distinct values in the leading column of the index. The author’s test case is a special case because all of the columns selected from the table are present in the index structure. ”

Please someone correct me if I am wrong as if there seems to be a bit of confusion about “index skip scan”. Am I correct to state that “index skip scan” is only applicable to cases where the index itself covers the query and the base table need not be touched? In that case I am not sure there is other case where index skip scan can be deployed. If the index cannot cover the query alone then there will not be much gained from “index skip scan”?

Some examples of index skip scan in web are if I use the phrase are naive so to speak. For example many think that Oracle will choose s index skip scan if the leading column is binary type (say male or female). That is not the case. The case is vaild as long as the cardinality of leading column is pretty small.

Cheers,

Mich

26 02 2012
Charles Hooper

Mich,

Jimmyb’s test case above shows an INDEX SKIP SCAN operation that had as a parent operation TABLE ACCESS BY INDEX ROWID

I do not like the phrase “cardinality of leading column” – think that I recently saw a similar phase in the Oracle documentation. To some people (for better or worse, that includes me), that phrase may mean “If I select all of the rows from the table with a particular value specified for the leading column of the index, the optimizer’s predicted number of rows (cardinality) will only be a few rows” – that would imply that there are many different values in the leading column of the index. To other people, that phrase may mean that there are only a couple of unique values in the leading column of the index. I think that it is a bit more clear to state that index skip scans are potentially useful if the number of distinct values in the leading column(s) is low.

The optimizer’s costing of skip scans provides a clue how skip scans are processed. If you just execute the first half of jimmyb’s test case script, you will have a table T with 10,000 distinct values in the leading column of the T_IDX index. If you attempt to force an index skip scan operation in this case, you will probably find that the optimizer calculated a cost for the INDEX SKIP SCAN of approximately 10,005 – which suggests costing of 10,000 sub-index accesses (I have not worked out exactly why the cost is 5 more than 10,000, but the access to the index root block and branch block(s) need to be considered and also the CPU component of the cost):

select /*+ index_ss(t) */ * from t where code = 1;
 
select * from table(dbms_xplan.display_cursor(null,null,'allstats last +cost'));
 
SQL_ID  22ru4c22nssxb, child number 0
-------------------------------------
select /*+ index_ss(t) */ * from t where code = 1
 
Plan hash value: 2053318169
 
------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        | 10006 (100)|    100 |00:00:00.01 |      97 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |      1 |    100 | 10006   (1)|    100 |00:00:00.01 |      97 |
|*  2 |   INDEX SKIP SCAN           | T_IDX |      1 |    100 | 10005   (1)|    100 |00:00:00.01 |      41 |
------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CODE"=1)
       filter("CODE"=1)

From the documentation:
http://docs.oracle.com/cd/E11882_01/server.112/e16638/optimops.htm#autoId22

Index skip scans improve index scans by nonprefix columns. Often, scanning index blocks is faster than scanning table data blocks.

Skip scanning lets a composite index be split logically into smaller subindexes. In skip scanning, the initial column of the composite index is not specified in the query. In other words, it is skipped.

The database determines the number of logical subindexes by the number of distinct values in the initial column. Skip scanning is advantageous when there are few distinct values in the leading column of the composite index and many distinct values in the nonleading key of the index.

The database may choose an index skip scan when the leading column of the composite index is not specified in a query predicate.

29 02 2012
David Fitzjarrell (@ddfdba)

Coming in on the tail end of this:

SQL> 
SQL> --
SQL> -- I  creates a index indx1 on (C1,C2,C3). I issue three select
SQL> -- statements like :
SQL> --
SQL> -- 1.select * from T1 where C1=
SQL> -- 2.select * from T1 where C2=
SQL> -- 3.select * from T1 where C3=
SQL> -- 4.select * from T1 where C1=  and  C2=  and C3 =
SQL> --
SQL> --
SQL> --
SQL> --
SQL> -- Regards,
SQL> -- Sanjoy
SQL> --
SQL> 
SQL> --
SQL> -- Create the table in question
SQL> --
SQL> create table t1(
  2  	   c1 number,
  3  	   c2 varchar2(20),
  4  	   c3 date,
  5  	   c4 varchar2(10),
  6  	   c5 number,
  7  	   c6 number
  8  );

Table created.

SQL> 
SQL> 
SQL> --
SQL> -- Create the index specified
SQL> --
SQL> create index indx1
  2  on t1(c1,c2,c3);

Index created.

SQL> 
SQL> 
SQL> --
SQL> -- Load test data
SQL> --
SQL> 
SQL> --
SQL> -- Data with unique C1, C2 and C3 values
SQL> --
SQL> begin
  2  	   for i in 1..10101 loop
  3  	    insert into t1
  4  	    values (i, 'Testing record '||i, trunc(sysdate+i), 'Filler', mod(i, 734), mod(i, 963));
  5  	   end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> 
SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> 
SQL> --
SQL> -- 'Standard' statistics
SQL> --
SQL> exec dbms_stats.gather_schema_stats(ownname=>'BING');

PL/SQL procedure successfully completed.

SQL> 
SQL> 
SQL> 
SQL> set autotrace on linesize 132
SQL> 
SQL> select * From t1 where c1 =433;

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
       433 Testing record 433   07-MAY-13 Filler            433        433


Execution Plan
----------------------------------------------------------
Plan hash value: 552572096

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | INDX1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C1"=433)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        739  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * From t1 where c2 = 'Testing record 7748';

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
      7748 Testing record 7748  17-MAY-33 Filler            408         44


Execution Plan
----------------------------------------------------------
Plan hash value: 3325179317

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |    11   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |    11   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | INDX1 |     1 |       |    11   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C2"='Testing record 7748')
       filter("C2"='Testing record 7748')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         59  consistent gets
          0  physical reads
          0  redo size
        739  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * from t1 where c3 = trunc(sysdate+9);

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
         9 Testing record 9     09-MAR-12 Filler              9          9


Execution Plan
----------------------------------------------------------
Plan hash value: 3325179317

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |    11   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |    11   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | INDX1 |     1 |       |    11   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C3"=TRUNC(SYSDATE@!+9))
       filter("C3"=TRUNC(SYSDATE@!+9))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         59  consistent gets
          0  physical reads
          0  redo size
        734  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * From t1 where c1 = 9993 and c2 = 'Testing record 9993' and c3 = trunc(sysdate+9993);

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
      9993 Testing record 9993  10-JUL-39 Filler            451        363


Execution Plan
----------------------------------------------------------
Plan hash value: 552572096

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | INDX1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C1"=9993 AND "C2"='Testing record 9993' AND
              "C3"=TRUNC(SYSDATE@!+9993))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        740  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> set autotrace off
SQL> 
SQL> --
SQL> -- Statistics with auto-sized histograms on indexed columns
SQL> --
SQL> exec dbms_stats.gather_schema_stats(ownname=>'BING', method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

SQL> 
SQL> set autotrace on linesize 132
SQL> 
SQL> select * From t1 where c1 =433;

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
       433 Testing record 433   07-MAY-13 Filler            433        433


Execution Plan
----------------------------------------------------------
Plan hash value: 552572096

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | INDX1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C1"=433)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        739  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * From t1 where c2 = 'Testing record 7748';

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
      7748 Testing record 7748  17-MAY-33 Filler            408         44


Execution Plan
----------------------------------------------------------
Plan hash value: 3325179317

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |    11   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |    11   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | INDX1 |     1 |       |    11   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C2"='Testing record 7748')
       filter("C2"='Testing record 7748')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         59  consistent gets
          0  physical reads
          0  redo size
        739  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * from t1 where c3 = trunc(sysdate+9);

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
         9 Testing record 9     09-MAR-12 Filler              9          9


Execution Plan
----------------------------------------------------------
Plan hash value: 3325179317

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |    11   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |    11   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | INDX1 |     1 |       |    11   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C3"=TRUNC(SYSDATE@!+9))
       filter("C3"=TRUNC(SYSDATE@!+9))


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         59  consistent gets
          0  physical reads
          0  redo size
        734  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> select * From t1 where c1 = 9993 and c2 = 'Testing record 9993' and c3 = trunc(sysdate+9993);

        C1 C2                   C3        C4                 C5         C6
---------- -------------------- --------- ---------- ---------- ----------
      9993 Testing record 9993  10-JUL-39 Filler            451        363


Execution Plan
----------------------------------------------------------
Plan hash value: 552572096

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    47 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    47 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | INDX1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("C1"=9993 AND "C2"='Testing record 9993' AND
              "C3"=TRUNC(SYSDATE@!+9993))


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        740  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 

The behaviour changed from 10gR2 (shown here: http://oratips-ddf.blogspot.com/2008/11/magical-indexes.html) as any query not using the leading column in 11gR2 uses in index skip scan for concatenated indexes should other indexed columns be specified in the where clause (as shown above).

I haven’t the time to pursue the other comments at this point; I agree with Charles’ comments and had the same thoujghts as I read through the list of quotes.

Leave a reply to Mich talebzadeh Cancel reply