The INSTR Function will Never Use an Index? I will give You a Hint

1 06 2010

June 1, 2010

As regular readers probably know, I frequently read books on various computer related topics.  Last winter I pre-ordered the book “Oracle SQL Recipes”, and have been reading the book on and off since that time.  Some parts of the book are absolutely fantastic, and others leave me scratching my head – did I read that right?  Page 169 of the book describes the INSTR and LIKE functions.  The book makes the following statement:

LIKE can be more readable, and may even use an index if there are no pattern-matching characters at the beginning of the search string. INSTR will never use an existing index unless you have created a function-based index containing the exact INSTR clause that you used in the WHERE clause.

The above paragraph conveys a lot of good information, and certainly the authors have tested that the above paragraph is correct.  Repeat after me: The INSTR function will never use a non-function based index!  The INSTR function will never use a non-function based index.  The INSTR function will never use a non-function based index?

I sense that some people are not sure that the above is true.  Let’s put together a quick test table for a test:

CREATE TABLE T5 (
  CHAR_COL VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (CHAR_COL));

INSERT INTO
  T5
SELECT
  TO_CHAR(ROWNUM),
  RPAD('A',100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL <=1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T5',CASCADE=>TRUE)

Now that we have a test table with a primary key index, we need a test script:

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  *
FROM
  T5
WHERE
  CHAR_COL LIKE '99999%';

SELECT
  *
FROM
  T5
WHERE
  INSTR(CHAR_COL,'99999') = 1;

SELECT
  *
FROM
  T5
WHERE
  SUBSTR(CHAR_COL,1,5) = '99999';

SELECT /*+ INDEX(T5) */
  *
FROM
  T5
WHERE
  INSTR(CHAR_COL,'99999') = 1;

SELECT /*+ INDEX(T5) */
  *
FROM
  T5
WHERE
  SUBSTR(CHAR_COL,1,5) = '99999';

SELECT /*+ INDEX(T5) CARDINALITY(T5 11) */
  *
FROM
  T5
WHERE
  INSTR(CHAR_COL,'99999') = 1;

SELECT /*+ INDEX(T5) CARDINALITY(T5 11) */
  *
FROM
  T5
WHERE
  SUBSTR(CHAR_COL,1,5) = '99999';

If the above script is executed on Oracle Database 11.2.0.1, which of the above seven SQL statements will use the primary key index?

Before you cry fowl, think about the question and the expected execution plans.

Wild turkeys, probably not what you would expect to find on a blog that is about Oracle Databases.  Yet seemingly appropriate.  I found this group of turkeys wandering in my field a year or two ago.  It is amazing how close I was able to get to the birds – with a 20x zoom camera.

Now that you have had a chance to think about it, and you have probably seen this blog article, which, if any, of the execution plans will show that the primary key index was used?  Here is the output of the script:

SQL> SELECT
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    CHAR_COL LIKE '99999%'; 

Execution Plan
----------------------------------------------------------
Plan hash value: 260357324

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   108 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5           |     1 |   108 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | SYS_C0023201 |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CHAR_COL" LIKE '99999%')
       filter("CHAR_COL" LIKE '99999%')

---

SQL> SELECT
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    INSTR(CHAR_COL,'99999') = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2002323537

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  1054K|  4407   (1)| 00:00:53 |
|*  1 |  TABLE ACCESS FULL| T5   | 10000 |  1054K|  4407   (1)| 00:00:53 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(INSTR("CHAR_COL",'99999')=1)

---

SQL> SELECT
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    SUBSTR(CHAR_COL,1,5) = '99999';

Execution Plan
----------------------------------------------------------
Plan hash value: 2002323537

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  1054K|  4422   (1)| 00:00:54 |
|*  1 |  TABLE ACCESS FULL| T5   | 10000 |  1054K|  4422   (1)| 00:00:54 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(SUBSTR("CHAR_COL",1,5)='99999')

---

SQL> SELECT /*+ INDEX(T5) */
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    INSTR(CHAR_COL,'99999') = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1277447555

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              | 10000 |  1054K|  5935   (1)| 00:01:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5           | 10000 |  1054K|  5935   (1)| 00:01:12 |
|*  2 |   INDEX FULL SCAN           | SYS_C0023201 | 10000 |       |  3922   (1)| 00:00:48 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(INSTR("CHAR_COL",'99999')=1)

---

SQL> SELECT /*+ INDEX(T5) */
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    SUBSTR(CHAR_COL,1,5) = '99999';

Execution Plan
----------------------------------------------------------
Plan hash value: 1277447555

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              | 10000 |  1054K|  5950   (1)| 00:01:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5           | 10000 |  1054K|  5950   (1)| 00:01:12 |
|*  2 |   INDEX FULL SCAN           | SYS_C0023201 | 10000 |       |  3937   (1)| 00:00:48 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SUBSTR("CHAR_COL",1,5)='99999')

---

SQL> SELECT /*+ INDEX(T5) CARDINALITY(T5 11) */
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    INSTR(CHAR_COL,'99999') = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 1277447555

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |    11 |  1188 |  5935   (1)| 00:01:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5           |    11 |  1188 |  5935   (1)| 00:01:12 |
|*  2 |   INDEX FULL SCAN           | SYS_C0023201 | 10000 |       |  3922   (1)| 00:00:48 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(INSTR("CHAR_COL",'99999')=1)

---

SQL> SELECT /*+ INDEX(T5) CARDINALITY(T5 11) */
  2    *
  3  FROM
  4    T5
  5  WHERE
  6    SUBSTR(CHAR_COL,1,5) = '99999';

Execution Plan
----------------------------------------------------------
Plan hash value: 1277447555

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |    11 |  1188 |  5950   (1)| 00:01:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T5           |    11 |  1188 |  5950   (1)| 00:01:12 |
|*  2 |   INDEX FULL SCAN           | SYS_C0023201 | 10000 |       |  3937   (1)| 00:00:48 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SUBSTR("CHAR_COL",1,5)='99999')

So, the answer to the question, “the INSTR function will never use a non-function based index?”  False, if an index use is hinted.  When an index hint is provided an index full scan is used to read the index, which reads the index blocks one at a time.  This could be a time consuming operation compared to a full table scan if all block accesses result in physical reads and the average row length for the table is not terribly long.  Consider with an 8KB block size, Oracle should be able to read up to 128 blocks (depending on extent size, OS limits, blocks already in the buffer cache, and DB_FILE_MULTIBLOCK_READ_COUNT value) in a single read request during the full table scan in about the same amount of time as would be required to read a single index block from disk during the index full scan operation (unless index pre-fetch kicks in to read multiple index blocks in a single read request).

So, yes we can use an index with the INSTR function, but would we really want to do that?  Maybe on rare occasions, but not as a regular practice.


Actions

Information

2 responses

9 06 2010
Martin Preiss

Charles,
I guess the index would also be used, if we use only the column CHAR_COL in the SELECT list. In fact I do not quess but see the INDEX FAST FULL scan, when I use your test case with this adaption (and without a hint).

9 06 2010
Charles Hooper

Martin,

Thank you for mentioning that it is possible for Oracle to use an INDEX FAST FULL scan for this SQL statement is only the CHAR_COL column is included in the SELECT (I cannot recall if I tried that in my environment, but it certainly is nice that Oracle will use the index). I certainly have learned a lot from test cases like the one in this thread.

Leave a reply to Charles Hooper Cancel reply