Optimizer Costing 3 – What is Wrong with this Quote?

6 12 2010

December 6, 2010

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

I recently reviewed the book “Oracle Tuning the Definitive Reference Second Edition”, and did not provide an in-depth technical review of the entire book.  This blog article series will dig into some of the pages that were not specifically included in the review.  What is wrong with the following quote from page 922 of the book?

Try a rule hint!

For testing unnecessary large table full table scans, try a rule hint (select /*+ RULE */ col1).  If the query uses the index with a rule hint, you have an issue with the CBO.”

What, if anything, is wrong with the above quote from the book?

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true.  It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past).  If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.

As I stated in a comment in the earlier blog article, I would like to clarify that my review on the Amazon site is not intended to suggest that people should not buy the book. As the publisher’s website states that the book is written for senior Oracle DBAs, I suggest that senior DBAs, Oracle Certified Masters, and other people who are able to research the book’s contents *should* buy the book and post reviews of the book that highlight specific content of the book indicating whether or not that content is correctly stated. Such feedback will be a benefit to the Oracle community, and could help to improve Oracle Database books that are published in the future. I think that we need to keep a positive outlook in order to see things improve.

Other pages found during a Google search of the phrase:

Related Oracle Database documentation:


Actions

Information

7 responses

6 12 2010
fidelinho

> If the query uses the index with a rule hint, you have an issue with the CBO.
he should try to look at any SAP ERP system
Most of the index start with the field MANDT.
Most of the queries get the field automagically added by the DBI.
Rule #8 will kick in and the index used.
This is one of the historical reasons why the field is added to the index and the queries.
Most of the time the MANDT has 1 value.
RULE will use an index even when only the MANDT is on the index, even when the complete index is read and then the complete table, while the CBO will use a FTS in this case without any issue with the CBO

6 12 2010
7 12 2010
Chris Saxon

I think the documentation you’ve linked to shows a big contradiction in the quote. This states that if an index is used when adding the rule hint, the CBO is wrong if it chooses a FTS.

However, by rule 15 of the documentation:

“This means that the RBO always chooses an access path that uses an index if one is available, even if a full table scan might execute faster.”

So Oracle must use an index (if possible) on the RBO, even if this is an inefficient method of execution…

6 12 2010
Charles Hooper

fidelinho,

Interesting example, thanks for the links to the documentation.

What other kinds of problems might we see with the advice to add the RULE hint? Any warning from the documentation for 11.2 (see the hint reference in the article)?

Might this AskTom article also provide some insight into the issue?

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::p11_question_id:3310002340673

7 12 2010
Charles Hooper

This question might be more difficult to answer than I first thought. To help you get started, consider the following paraphrase from the book “Practical Oracle 8i”:
“The RULE based optimizer will not use descending indexes, must make certain that the cost based optimizer is used.”

At first that might seem like an odd problem, and kind of makes you wonder what the rule of thumb is for function based indexes, bitmap indexes, bitmap joins of B*tree indexes, partitioning, parallel query, table join order, etc. What does AskTom say?

7 12 2010
Charles Hooper

Roughly 10 people looked at the AskTom article, but no replies. So, I thought that I would provide a short demonstration why it is a bad, and probably a silly idea to suggest that people use a RULE hint. If you check the hint reference for Oracle Database 11.2 I believe that you will find that the hint is no longer listed.

Creating a simple table for the first set of testing (Oracle Database 11.2.0.1):

CREATE TABLE T5 (
  C1 NUMBER,
  C2 NUMBER,
  C3 NUMBER,
  C4 NUMBER,
  C5 NUMBER,
  C6 NUMBER,
  C7 NUMBER,
  C8 NUMBER,
  C9 VARCHAR2(20),
  C10 VARCHAR2(20));

INSERT INTO
  T5
SELECT
  ROWNUM,
  ROWNUM,
  MOD(ROWNUM,50),
  MOD(ROWNUM,40),
  MOD(ROWNUM,100),
  ROUND(ROWNUM/1000),
  ROUND(ROWNUM/950),
  ROUND(ROWNUM/600),
  RPAD('',20,CHR(65 + MOD(ROWNUM-1,26))),
  RPAD('',20,CHR(65 + MOD(ROWNUM,26)))
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

Now the first sample SQL statement:

SET AUTOTRACE TRACEONLY EXPLAIN
 
SELECT
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2002323537

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    54 |  2106 |   396   (3)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T5   |    54 |  2106 |   396   (3)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C1">=1 AND "C1"<=1000)

Note
-----
   - dynamic sampling used for this statement (level=2)

Nothing special about the above, dynamic sampling was used because there are no statistics on the table. Now the same query with the RULE hint:

SELECT /*+ RULE */
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
---------------------------------------------------
Plan hash value: 2002323537

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| T5   |
----------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C1"<=1000 AND "C1">=1)

Note
-----
   - rule based optimizer used (consider using cbo)

No notice of dynamic sampling, but we do see a note about the rule based optimizer being used.

Now, let’s create several indexes and collect statistics for the table and indexes:

CREATE INDEX IND_T5_C1 ON T5(C1);
CREATE INDEX IND_T5_C2 ON T5(C2 DESC);
CREATE BITMAP INDEX IND_T5_C3 ON T5(C3);
CREATE BITMAP INDEX IND_T5_C4 ON T5(C4);
CREATE BITMAP INDEX IND_T5_C5 ON T5(C5);
CREATE BITMAP INDEX IND_T5_C6 ON T5(C6);
CREATE BITMAP INDEX IND_T5_C7 ON T5(C7);
CREATE BITMAP INDEX IND_T5_C8 ON T5(C8);
CREATE INDEX IND_T5_C9_F ON T5(LOWER(C9));
CREATE INDEX IND_T5_C10_F ON T5(LOWER(C10));

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T5',CASCADE=>TRUE)

Let’s try the original query again:

SELECT
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C1 BETWEEN 1 AND 1000;

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  1000 | 13000 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |  1000 | 13000 |    10   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C1 |  1000 |       |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=1 AND "C1"<=1000)

SELECT /*+ RULE */
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
---------------------------------------------------
Plan hash value: 2655107216

-------------------------------------------------
| Id  | Operation                   | Name      |
-------------------------------------------------
|   0 | SELECT STATEMENT            |           |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C1 |
-------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=1 AND "C1"<=1000)

Note
-----
   - rule based optimizer used (consider using cbo)

In both cases, Oracle will use the index on column C1. Let’s try putting column C2 in the WHERE clause:

SELECT
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C2 BETWEEN 1 AND 1000;
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3443248787
 
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  1000 | 13000 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |  1000 | 13000 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(SYS_OP_DESCEND("C2")>=HEXTORAW('3DF4FF')  AND
              SYS_OP_DESCEND("C2")<=HEXTORAW('3EFDFF') )
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("C2"))<=1000 AND
              SYS_OP_UNDESCEND(SYS_OP_DESCEND("C2"))>=1)

SELECT /*+ RULE */
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C2 BETWEEN 1 AND 1000;
 
Execution Plan
---------------------------------------------------
Plan hash value: 2002323537
 
----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| T5   |
----------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"<=1000 AND "C2">=1)

Note
-----
   - rule based optimizer used (consider using cbo)

In this case, the cost based optimizer will permit the descending index on column C2 to be used, while the RULE based optimizer will not – the book is silent on what to do if fewer indexes are used when the RULE based optimizer is used.

Let’s try another test SQL statement:

SELECT
  C3,
  C4,
  C5
FROM
  T5
WHERE
  C3=1
  AND C4=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1255170387

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   500 |  4500 |   104   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | T5        |   500 |  4500 |   104   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |       |       |            |          |
|   3 |    BITMAP AND                |           |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE| IND_T5_C3 |       |       |            |          |
|*  5 |     BITMAP INDEX SINGLE VALUE| IND_T5_C4 |       |       |            |          |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("C3"=1)
   5 - access("C4"=1)

SELECT /*+ RULE */
  C3,
  C4,
  C5
FROM
  T5
WHERE
  C3=1
  AND C4=1;

Execution Plan
---------------------------------------------------
Plan hash value: 2002323537

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| T5   |
----------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C4"=1 AND "C3"=1)

Note
-----
   - rule based optimizer used (consider using cbo)

This time the cost based optimizer selects to use two of the bitmap indexes, while the RULE based optimizer decides to perform a full table scan. In addition to not being able to utilize descending indexes, the RULE based optimizer is not able to use bitmap indexes.

Let’s try another query:

SELECT
  *
FROM
  T5
WHERE
  LOWER(C9)=LPAD('a',20,'a');

Execution Plan
----------------------------------------------------------
Plan hash value: 2456366630

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    38 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5          |     1 |    38 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C9_F |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(LOWER("C9")='aaaaaaaaaaaaaaaaaaaa')

SELECT /*+ RULE */
  *
FROM
  T5
WHERE
  LOWER(C9)=LPAD('a',20,'a');

Execution Plan
---------------------------------------------------
Plan hash value: 2002323537

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| T5   |
----------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(LOWER("C9")='aaaaaaaaaaaaaaaaaaaa')

Note
-----
   - rule based optimizer used (consider using cbo)

Once again the cost based optimizer decided to use an index, in this case a function based index (descending indexes are also function based indexes), while the RULE based optimizer decided to use a full table scan because it cannot use function based indexes.

Let’s try another query with an index hint:

SELECT /*+ INDEX(T5) */
  LOWER(C9)
FROM
  T5;

Execution Plan
----------------------------------------------------------
Plan hash value: 3179786097

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |  1000K|    11M|  2997   (1)| 00:00:24 |
|   1 |  TABLE ACCESS BY INDEX ROWID | T5        |  1000K|    11M|  2997   (1)| 00:00:24 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |       |       |            |          |
|   3 |    BITMAP INDEX FULL SCAN    | IND_T5_C6 |       |       |            |          |
------------------------------------------------------------------------------------------

SELECT /*+ INDEX(T5) RULE */
  LOWER(C9)
FROM
  T5;

Execution Plan
---------------------------------------------------
Plan hash value: 2002323537

----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| T5   |
----------------------------------

Note
-----
   - rule based optimizer used (consider using cbo)

Once again, the cost based optimizer used an index (why not IND_T5_C9_F – even hinting that index will not work) while the RULE based optimizer performed a full table scan.

Now, let’s assume that someone set a default parallel degree for the table:

ALTER TABLE T5 PARALLEL(DEGREE 8);

Let’s rerun the first SQL statement:

SELECT
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2655107216

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  1000 | 13000 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |  1000 | 13000 |    10   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C1 |  1000 |       |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=1 AND "C1"<=1000)

SELECT /*+ RULE */
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2655107216

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  1000 | 13000 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |  1000 | 13000 |    10   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C1 |  1000 |       |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=1 AND "C1"<=1000)

It is good to see that the query, even with the RULE hint, is using the index on column C1, but wait, Cost figures do not display when the RULE optimizer mode is used, and also note that the warning in the Note section did not print – the cost based optimizer was used even with the RULE hint because of the default parallel degree specified for the table.

Let’s try the next query:

SELECT
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C2 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 3443248787

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  1000 | 13000 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |  1000 | 13000 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(SYS_OP_DESCEND("C2")>=HEXTORAW('3DF4FF')  AND
              SYS_OP_DESCEND("C2")<=HEXTORAW('3EFDFF') )
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("C2"))<=1000 AND
              SYS_OP_UNDESCEND(SYS_OP_DESCEND("C2"))>=1)

SELECT /*+ RULE */
  C1,
  C2,
  C3
FROM
  T5
WHERE
  C2 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 3443248787

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  1000 | 13000 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5        |  1000 | 13000 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C2 |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(SYS_OP_DESCEND("C2")>=HEXTORAW('3DF4FF')  AND
              SYS_OP_DESCEND("C2")<=HEXTORAW('3EFDFF') )
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("C2"))<=1000 AND
              SYS_OP_UNDESCEND(SYS_OP_DESCEND("C2"))>=1)

The cost based optimizer was used in both cases.

Another of the queries:

SELECT
  C3,
  C4,
  C5
FROM
  T5
WHERE
  C3=1
  AND C4=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 404792702

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |   500 |  4500 |    55   (4)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |   500 |  4500 |    55   (4)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          |   500 |  4500 |    55   (4)| 00:00:01 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T5       |   500 |  4500 |    55   (4)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("C3"=1 AND "C4"=1)

SELECT /*+ RULE */
  C3,
  C4,
  C5
FROM
  T5
WHERE
  C3=1
  AND C4=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 404792702

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |   500 |  4500 |    55   (4)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |   500 |  4500 |    55   (4)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          |   500 |  4500 |    55   (4)| 00:00:01 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T5       |   500 |  4500 |    55   (4)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("C3"=1 AND "C4"=1)

Again, the cost based optimizer was used in both cases.

Another query:

SELECT
  *
FROM
  T5
WHERE
  LOWER(C9)=LPAD('a',20,'a');

Execution Plan
----------------------------------------------------------
Plan hash value: 2456366630

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    38 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5          |     1 |    38 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C9_F |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(LOWER("C9")='aaaaaaaaaaaaaaaaaaaa')

SELECT /*+ RULE */
  *
FROM
  T5
WHERE
  LOWER(C9)=LPAD('a',20,'a');

Execution Plan
----------------------------------------------------------
Plan hash value: 2456366630

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    38 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5          |     1 |    38 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T5_C9_F |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(LOWER("C9")='aaaaaaaaaaaaaaaaaaaa')

Cost based again with the RULE hint.

For fun, let’s create a partitioned table to allow a couple of more tests:

CREATE TABLE T6 (
  C1 NUMBER,
  C2 NUMBER,
  C3 NUMBER,
  C4 NUMBER,
  C5 NUMBER,
  C6 NUMBER,
  C7 NUMBER,
  C8 NUMBER,
  C9 VARCHAR2(20),
  C10 VARCHAR2(20))
PARTITION BY RANGE(C1)
   (PARTITION P1 VALUES LESS THAN (5),
    PARTITION P2 VALUES LESS THAN (10),
    PARTITION P3 VALUES LESS THAN (20),
    PARTITION P4 VALUES LESS THAN (40),
    PARTITION P5 VALUES LESS THAN (80),
    PARTITION P6 VALUES LESS THAN (160),
    PARTITION P7 VALUES LESS THAN (320),
    PARTITION P8 VALUES LESS THAN (640),
    PARTITION P9 VALUES LESS THAN (1280),
    PARTITION P10 VALUES LESS THAN (2560),
    PARTITION P11 VALUES LESS THAN (5120),
    PARTITION P12 VALUES LESS THAN (10240),
    PARTITION P20 VALUES LESS THAN (MAXVALUE));

INSERT INTO
  T6
SELECT
  ROWNUM,
  ROWNUM,
  MOD(ROWNUM,50),
  MOD(ROWNUM,40),
  MOD(ROWNUM,100),
  ROUND(ROWNUM/1000),
  ROUND(ROWNUM/950),
  ROUND(ROWNUM/600),
  RPAD('',20,CHR(65 + MOD(ROWNUM-1,26))),
  RPAD('',20,CHR(65 + MOD(ROWNUM,26)))
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

CREATE INDEX IND_T6_C1 ON T6(C1);
CREATE INDEX IND_T6_C2 ON T6(C2 DESC);
CREATE INDEX IND_T6_C9_F ON T6(LOWER(C9));
CREATE INDEX IND_T6_C10_F ON T6(LOWER(C10));

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T6',CASCADE=>TRUE)

Let’s try the first query again:

SELECT
  C1,
  C2,
  C3
FROM
  T6
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2728409382

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |  1000 | 13000 |     3   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |  1000 | 13000 |     3   (0)| 00:00:01 |     1 |     9 |
|*  2 |   TABLE ACCESS FULL      | T6   |  1000 | 13000 |     3   (0)| 00:00:01 |     1 |     9 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("C1"<=1000 AND "C1">=1)

SELECT /*+ RULE */
  C1,
  C2,
  C3
FROM
  T6
WHERE
  C1 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2728409382

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |  1000 | 13000 |     3   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|      |  1000 | 13000 |     3   (0)| 00:00:01 |     1 |     9 |
|*  2 |   TABLE ACCESS FULL      | T6   |  1000 | 13000 |     3   (0)| 00:00:01 |     1 |     9 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("C1"<=1000 AND "C1">=1)

The query with the RULE hint actually used the cost based optimizer because the table is partitioned.

The above is one benefit of a test case, and hopefully that explains why it is a very bad idea to suggest to someone that they try the RULE hint as one of the initial steps in tuning a query.

13 12 2010
Catmando

Okay, now this one gets me riled up. Why? Let me start with:

“For testing unnecessary large table full table scans”

What does that mean? What is an unnecessary scan? Obviously with a selectivity of 1 you can expect a full scan. With a selectivity approaching 0 you are more likely to see an index but no guarantees. It would seem that if you have labeled something as “unnecessary” then you have already passed judgement on the CBO and know you “have an issue” with it.

Version support aside, once you start to intervene, you have to be very careful and try to do the least intervention possible and it would seem that more modern solutions like:

* Histograms
* DYNAMIC_SAMPLING hint

should be entertained before suggesting a hint like that. At least those solutions adapt to changes in the data where some other hints do not. Any more I have very few problems with Oracle getting single table access correct and for most data, it’s pretty good at getting multiple tables right. My focus is to help Oracle get the cardinality estimate right (thanks Jonathan and Wolfgang).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 147 other followers

%d bloggers like this: