What is the Difference Between the FIRST_ROWS Hint and ROWNUM in the WHERE Clause?

10 03 2011

March 10, 2011

A couple of days ago I saw an interesting question that asked what is the difference between using the FIRST_ROWS(100) hint in a SQL statement and using ROWNUM<=100 in the WHERE clause with the original SQL statement moved to an inline view. For example:

SELECT /*+ FIRST_ROWS(100) */
  C1,
  C2
FROM
  T1
WHERE
  C1 BETWEEN 1 AND 1000
ORDER BY
  C1;

SELECT
  *
FROM
  (SELECT
     C1,
     C2
   FROM
     T1
   WHERE
     C1 BETWEEN 1 AND 1000
   ORDER BY
     C1)
WHERE
  ROWNUM<=100;

My first thought was that the difference is actually quite simple to explain, and then I started to wonder, how could someone logically come to the conclusion that the two approaches would yield the same results? Do developers look at explain plan output? Could that explain how someone might look at these two approaches and question whether or not the two approaches are equivalent? Let’s see:

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT /*+ FIRST_ROWS(100) */
  C1,
  C2
FROM
  T1
WHERE
  C1 BETWEEN 1 AND 1000
ORDER BY
  C1;

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |   102 |  1020 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |   102 |  1020 |     5   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | SYS_C0020704 |  1000 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=1 AND "C1"<=1000)

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  *
FROM
  (SELECT
     C1,
     C2
   FROM
     T1
   WHERE
     C1 BETWEEN 1 AND 1000
   ORDER BY
     C1)
WHERE
  ROWNUM<=100;

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |   100 |  2600 |     5   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |              |       |       |            |          |
|   2 |   VIEW                        |              |   102 |  2652 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1           |   102 |  1020 |     5   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0020704 |  1000 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=100)
   4 - access("C1">=1 AND "C1"<=1000)

So, the FIRST_ROWS hinted plan shows that 102 rows will be returned, while the plan with ROWNUM in the WHERE clause shows that 100 of 102 rows will be returned. So, this is the first clue that the two approaches might not be equivalent. Other people have written extensively about what the FIRST_ROWS hint (and OPTIMIZER_MODE) mean and also how ROWNUM in the WHERE clause works (and a potentially significant bug that is present prior to Oracle Database 11.2.0.1). Is it possible to simplify the explanation of the difference, something like this?

The short answer is that the FIRST_ROWS hint tells the query optimizer: I really do not care to know if more than 1, 10, 100, or 1000 rows could be returned by the query, just plan the query execution as if my application will only retrieve 1, 10, 100, or 1000 rows – my application might still retrieve all of the rows, but just plan on the specified number being read.

The ROWNUM predicate in the WHERE clause tells the query optimizer: I really do not care if more than 1, 10, 100, or 1000 rows could be returned by the original query, plan the query execution as if my application will only retrieve 1, 10, 100, or 1000 rows and make certain that my application cannot retrieve more than the specified number of rows.

Simple, maybe just a little too simple?

Let’s put together a little test script to demonstrate. First, we will create a table with a primary key index, insert 500,000 rows, and then collect statistics with a 100% sample size:

CREATE TABLE T1(
  C1 NUMBER,
  C2 NUMBER,
  C3 NUMBER,
  C4 DATE,
  C5 DATE,
  C6 DATE,
  C7 VARCHAR2(20),
  C8 VARCHAR2(20),
  C9 VARCHAR2(20),
  PRIMARY KEY(C1));

INSERT INTO
  T1
SELECT
  ROWNUM,
  DECODE(MOD(ROWNUM,5),0,NULL,ROWNUM),
  DECODE(MOD(ROWNUM,5),0,NULL,ROWNUM),
  TRUNC(SYSDATE)+TRUNC(ROWNUM/100),
  DECODE(MOD(ROWNUM,5),0,NULL,TRUNC(SYSDATE)+TRUNC(ROWNUM/100)),
  DECODE(MOD(ROWNUM,5),0,NULL,TRUNC(SYSDATE)+TRUNC(ROWNUM/100)),
  RPAD(CHR(MOD(ROWNUM-1,26)+65)||
    CHR(MOD(ROWNUM,26)+65)||
    CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65),20,'A'),
  DECODE(MOD(ROWNUM,5),0,NULL,
   RPAD(CHR(MOD(ROWNUM-1,26)+65)||
    CHR(MOD(ROWNUM,26)+65)||
    CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65),20,'A')),
  DECODE(MOD(ROWNUM,5),0,NULL,
   RPAD(CHR(MOD(ROWNUM-1,26)+65)||
    CHR(MOD(ROWNUM,26)+65)||
    CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65),20,'A'))
FROM
  DUAL
CONNECT BY
  LEVEL<=500000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,ESTIMATE_PERCENT=>NULL)

Now the test script:

SPOOL TestFirstRowsAndRownum.txt
ALTER SYSTEM FLUSH BUFFER_CACHE;
SET AUTOTRACE TRACEONLY STATISTICS EXPLAIN
SET ARRAYSIZE 1000
SET TIMING ON
SET LINESIZE 140
SET PAGESIZE 1000
SET TRIMSPOOL ON

SELECT /*+ INDEX(T1) */
  C1,
  C2,
  C3,
  C4,
  C5,
  C6
FROM
  T1
WHERE
  C1 BETWEEN 10000 AND 100000;

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ INDEX(T1) FIRST_ROWS(100) */
  C1,
  C2,
  C3,
  C4,
  C5,
  C6
FROM
  T1
WHERE
  C1 BETWEEN 10000 AND 100000;

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ INDEX(T1) */
  C1,
  C2,
  C3,
  C4,
  C5,
  C6
FROM
  T1
WHERE
  C1 BETWEEN 10000 AND 100000
  AND ROWNUM<=100;

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+  */
  *
FROM
  (SELECT
    C1,
    C2,
    MAX(C3) C3
  FROM
    T1
  GROUP BY
    C1,
    C2
  ORDER BY
    C1,
    C2)
WHERE
  C1 BETWEEN 10000 AND 100000;

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+  FIRST_ROWS(100) */
  *
FROM
  (SELECT
    C1,
    C2,
    MAX(C3) C3
  FROM
    T1
  GROUP BY
    C1,
    C2
  ORDER BY
    C1,
    C2)
WHERE
  C1 BETWEEN 10000 AND 100000;

ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+  */
  *
FROM
  (SELECT
    C1,
    C2,
    MAX(C3) C3
  FROM
    T1
  GROUP BY
    C1,
    C2
  ORDER BY
    C1,
    C2)
WHERE
  C1 BETWEEN 10000 AND 100000
  AND ROWNUM<=100;

SPOOL OFF

The output of the first query, without the hint and without the ROWNUM restriction:

SQL> SELECT /*+ INDEX(T1) */
  2    C1,
  3    C2,
  4    C3,
  5    C4,
  6    C5,
  7    C6
  8  FROM
  9    T1
 10  WHERE
 11    C1 BETWEEN 10000 AND 100000;

90001 rows selected.

Elapsed: 00:00:00.43

Execution Plan
----------------------------------------------------------
Plan hash value: 2459953871

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              | 90002 |  3252K|  1359   (1)| 00:00:07 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           | 90002 |  3252K|  1359   (1)| 00:00:07 |
|*  2 |   INDEX RANGE SCAN          | SYS_C0020704 | 90002 |       |   174   (2)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=10000 AND "C1"<=100000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1534  consistent gets
       1354  physical reads
          0  redo size
    1994612  bytes sent via SQL*Net to client
       1324  bytes received via SQL*Net from client
         92  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90001  rows processed

Note in the plain (unmodified) query that 90,001 rows were returned to SQL*Plus, while the optimizer predicted that 90,002 rows would be returned. It required 0.43 seconds for the rows to be returned to the SQL*Plus session.

Now the second query of the test case, hinted with FIRST_ROWS:

SQL> SELECT /*+ INDEX(T1) FIRST_ROWS(100) */
  2    C1,
  3    C2,
  4    C3,
  5    C4,
  6    C5,
  7    C6
  8  FROM
  9    T1
 10  WHERE
 11    C1 BETWEEN 10000 AND 100000;

90001 rows selected.

Elapsed: 00:00:00.43

Execution Plan
----------------------------------------------------------
Plan hash value: 2459953871

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |   103 |  3811 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |   103 |  3811 |     5   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | SYS_C0020704 | 90002 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=10000 AND "C1"<=100000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1534  consistent gets
       1354  physical reads
          0  redo size
    1994612  bytes sent via SQL*Net to client
       1324  bytes received via SQL*Net from client
         92  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      90001  rows processed

Note in the FIRST_ROWS(100) hinted query that 90,001 rows were returned to SQL*Plus, while the optimizer predicted that 103 rows would be returned due to the hint. It required 0.43 seconds for the rows to be returned to the SQL*Plus session.

Now the third query of the test case:

SQL> SELECT /*+ INDEX(T1) */
  2    C1,
  3    C2,
  4    C3,
  5    C4,
  6    C5,
  7    C6
  8  FROM
  9    T1
 10  WHERE
 11    C1 BETWEEN 10000 AND 100000
 12    AND ROWNUM<=100;

100 rows selected.

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3250543521

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |   100 |  3700 |     5   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY               |              |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1           |   103 |  3811 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0020704 | 90002 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=100)
   3 - access("C1">=10000 AND "C1"<=100000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          6  physical reads
          0  redo size
       2787  bytes sent via SQL*Net to client
        334  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

Note in the query with the ROWNUM predicate in the WHERE clause that 100 rows were returned to SQL*Plus, while the optimizer predicted that 100 rows (103 rows before the COUNT STOPKEY operation) would be returned. It required 0.01 seconds for the rows to be returned to the SQL*Plus session.

Now for the second half of the output (created on Oracle Database 10.2.0.2 – your results may be a bit different). The first query:

SQL> SELECT /*+  */
  2    *
  3  FROM
  4    (SELECT
  5      C1,
  6      C2,
  7      MAX(C3) C3
  8    FROM
  9      T1
 10    GROUP BY
 11      C1,
 12      C2
 13    ORDER BY
 14      C1,
 15      C2)
 16  WHERE
 17    C1 BETWEEN 10000 AND 100000;

90001 rows selected.

Elapsed: 00:00:00.56

Execution Plan
----------------------------------------------------------
Plan hash value: 915463192

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 90002 |  3427K|       |   904   (5)| 00:00:05 |
|   1 |  VIEW               |      | 90002 |  3427K|       |   904   (5)| 00:00:05 |
|   2 |   SORT GROUP BY     |      | 90002 |  2197K|  7080K|   904   (5)| 00:00:05 |
|*  3 |    TABLE ACCESS FULL| T1   | 90002 |  2197K|       |   352   (8)| 00:00:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("C1"<=100000 AND "C1">=10000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       6563  consistent gets
       6555  physical reads
          0  redo size
    1307436  bytes sent via SQL*Net to client
       1324  bytes received via SQL*Net from client
         92  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      90001  rows processed

Nothing too out of the ordinary in the above, the WHERE clause predicate was pushed into the inline view, although I thought that the inline view (indicated by the VIEW line in the explain plan output) might disappear.

The second query with the FIRST_ROWS hint:

SQL> SELECT /*+  FIRST_ROWS(100) */
  2    *
  3  FROM
  4    (SELECT
  5      C1,
  6      C2,
  7      MAX(C3) C3
  8    FROM
  9      T1
 10    GROUP BY
 11      C1,
 12      C2
 13    ORDER BY
 14      C1,
 15      C2)
 16  WHERE
 17    C1 BETWEEN 10000 AND 100000;

90001 rows selected.

Elapsed: 00:00:00.50

Execution Plan
----------------------------------------------------------
Plan hash value: 915463192

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 90002 |  3427K|       |   904   (5)| 00:00:05 |
|   1 |  VIEW               |      | 90002 |  3427K|       |   904   (5)| 00:00:05 |
|   2 |   SORT GROUP BY     |      | 90002 |  2197K|  7080K|   904   (5)| 00:00:05 |
|*  3 |    TABLE ACCESS FULL| T1   | 90002 |  2197K|       |   352   (8)| 00:00:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("C1"<=100000 AND "C1">=10000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       6563  consistent gets
       6555  physical reads
          0  redo size
    1307436  bytes sent via SQL*Net to client
       1324  bytes received via SQL*Net from client
         92  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      90001  rows processed

The WHERE clause predicate was pushed into the inline view again, but notice that the optimizer did not predict that 100 or 103 rows would be returned by the query. Do you see the same results on other Oracle Database release versions? It is slightly interesting to see that the available index access path was not selected… almost as if the FIRST_ROWS hint had no effect… of course you know the reason why. 🙂 (Edit: March 10, 2011: The same result was achieved when moving the hint to the inline view, and same result was achieved when specifying the hint as FIRST_ROWS(1). I guess that it is time to review the 10053 trace to see what happened.)

Now the query with the ROWNUM predicate:

SQL> SELECT /*+  */
  2    *
  3  FROM
  4    (SELECT
  5      C1,
  6      C2,
  7      MAX(C3) C3
  8    FROM
  9      T1
 10    GROUP BY
 11      C1,
 12      C2
 13    ORDER BY
 14      C1,
 15      C2)
 16  WHERE
 17    C1 BETWEEN 10000 AND 100000
 18    AND ROWNUM<=100;

100 rows selected.

Elapsed: 00:00:00.31

Execution Plan
----------------------------------------------------------
Plan hash value: 2221909342

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |   100 |  3900 |       |   748   (5)| 00:00:04 |
|*  1 |  COUNT STOPKEY          |      |       |       |       |            |          |
|   2 |   VIEW                  |      | 90002 |  3427K|       |   748   (5)| 00:00:04 |
|*  3 |    SORT GROUP BY STOPKEY|      | 90002 |  1318K|  4968K|   748   (5)| 00:00:04 |
|*  4 |     TABLE ACCESS FULL   | T1   | 90002 |  1318K|       |   352   (8)| 00:00:02 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=100)
   3 - filter(ROWNUM<=100)
   4 - filter("C1"<=100000 AND "C1">=10000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       6563  consistent gets
       6555  physical reads
          0  redo size
       2275  bytes sent via SQL*Net to client
        334  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        100  rows processed

The standard WHERE clause predicate used in all of the queries was pushed into the inline view again, and the ROWNUM predicate was also pushed into the inline view. Does this demonstrate the potential bug – what if instead of a full table scan, the primary key index access path were used to read the data from table T1?

Actions

Information

Date : March 10, 2011
Categories : SQL

11 responses

10 03 2011

Charles Hooper (15:33:49) :

If you read my article that described a potentially serious bug when ROWNUM is used in the WHERE clause, you probably recognize another item that the two approaches discussed in this article have in common. The use of ROWNUM in the WHERE clause effectively switches the optimizer to optimize for the specified number of rows to be retrieved as quickly as possible – just as what happens when a FIRST_ROWS(n) hint is specified. I had originally intended to mention this fact in the article.

14 03 2011

ghassem koolivand (03:52:38) :

Hi Charles
This article is great and I have tried to test it in my lab. My oracle version is 11.1.0.6
When I run your example I saw a wonderful result:

********************************************************************************

SELECT /*+  */
  *
FROM
  (SELECT
    C1,
    C2,
    MAX(C3) C3
  FROM
    T1
  GROUP BY
    C1,
    C2
  ORDER BY
    C1,
    C2)
WHERE
  C1 BETWEEN 10000 AND 100000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       91      0.31       0.78       1440       1354          0       90001
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       93      0.31       0.78       1440       1354          0       90001

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 82  

Rows     Row Source Operation
-------  ---------------------------------------------------
  90001  VIEW  (cr=1354 pr=1440 pw=1440 time=4321 us cost=2004 size=3510078 card=90002)
  90001   SORT GROUP BY (cr=1354 pr=1440 pw=1440 time=1546 us cost=2004 size=2250050 card=90002)
  90001    TABLE ACCESS BY INDEX ROWID T1 (cr=1354 pr=1440 pw=1440 time=14292 us cost=1353 size=2250050 card=90002)
  90001     INDEX RANGE SCAN SYS_C009411 (cr=172 pr=224 pw=224 time=1731 us cost=172 size=0 card=90002)(object id 69857)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      92        0.00          0.00
  SQL*Net message from client                    92        8.11        474.46
  db file scattered read                        180        0.03          0.58
  SQL*Net more data to client                     1        0.00          0.00

The first one is why we are seeing “TABLE ACCESS BY INDEX ROWID” instead “TABLE FULE SCAN” ???
And second one is why we are seeing “db file scattered read” instead “db file sequential read”?
I’m appreciating to help me.
Ghassem

14 03 2011

Charles Hooper (07:47:20) :

If the optimizer parameters and system (CPU) statistics tell the optimizer that index access paths are very inexpensive, you very well could see the execution plan change from a full table scan to an index range scan. Note that you posted TKPROF for the execution which shows the *actual* execution plan, while I posted the autotrace generated explain plan output which may not show the actual execution plan. Note that there is a risk that TKPROF in 11.1.0.6 *could* potentially display the wrong execution plan, if there are multiple execution plans in the trace file for the same SQL statement – see the following article for a demonstration:
https://hoopercharles.wordpress.com/2010/01/11/explain-plan-lies-autotrace-lies-tkprof-lies-what-is-the-plan/

I suggest reviewing the raw trace file to see why db file scattered reads were used during the execution. I suspect that you will find that many of those reads were used to fetch the table blocks (as indicated by the statistics) but could also be used to quickly read adjacent index blocks. See the comments and test case by Timur Akhmadeev in this thread for a demonstration of db file scattered read waits appearing during an index range scan:
http://forums.oracle.com/forums/thread.jspa?threadID=941864&tstart=0

14 03 2011

Narendra (12:35:22) :

Charles,
I believe Timur’s test case was specific to a particular version (10.2.0.3) and it could not be reproduced in different versions in a deterministic manner.
In fact, Timur’s test case was also referenced in another discussion (see http://forums.oracle.com/forums/thread.jspa?messageID=4581053#4581053 )
where he has acknowledged the issue. It makes more sense to say that the read of adjacent index blocks would result in “DB File Parallel Read” and
not “DB File Scattered Read”.

14 03 2011

Charles Hooper (13:37:53) :

Narendra,

The “db file parallel read” wait event will appear when multiple non-adjacent blocks are read during a prefetch operation, see the documentation:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#sthref3022

I think in that article Timur simply stated that he could not reproduce the results on a particular version – I believe that I had the same problem with the test case when I tried to produce the results on Oracle Database 10.2.0.2 Standard Edition. However, that does not mean that a db file scattered read operation cannot be used to read index blocks. You can see an example of that happening roughly half way through this article, as partially shown below:
https://hoopercharles.wordpress.com/2010/11/21/different-performance-from-standard-edition-and-enterprise-edition-2/
Standard Edition (obj# 20275 is the table T1, obj# 20276 is the index):

PARSE #2:c=15600,e=17980,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=2892477391,tim=176265018485
EXEC #2:c=0,e=24,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2892477391,tim=176265018563
WAIT #2: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=0 tim=176265018598
WAIT #2: nam='db file scattered read' ela= 76 file#=8 block#=8320 blocks=8 obj#=20276 tim=176265018735
WAIT #2: nam='db file scattered read' ela= 19381 file#=8 block#=8648 blocks=8 obj#=20276 tim=176265038168
WAIT #2: nam='db file scattered read' ela= 102 file#=8 block#=8344 blocks=8 obj#=20276 tim=176265038360
WAIT #2: nam='db file scattered read' ela= 98 file#=7 block#=3200 blocks=8 obj#=20275 tim=176265038517

15 03 2011

Timur Akhmadeev (03:26:02) :

The “db file parallel read” wait event will appear when multiple non-adjacent blocks are read during a prefetch operation

Although I’m not Narendra, I think by “adjacent index blocks” he means “the blocks which are logically adjacent in the index structure”. So, they are read in parallel by process – it simply picks up ROWIDs of index blocks from a branch block and issues one request to access them all (maybe not all, I don’t know in-depth details).

15 03 2011

Narendra (05:01:11) :

Charles,

I did not intend to say it never happens. My point was about whether it is deterministic enough to make a generic statement.
I know very little about how CBO works and things change with each oracle version. During the OTN forum discussion that I had mentioned above,
I was not able to reproduce the behaviour with a couple of oracle versions. I will certainly see if I can reproduce the results with your test case.
BTW, I might have missed it but the documentation link that you have provided does not appear to make any reference to blocks being adjacent or not.

15 03 2011

Narendra (05:10:41) :

Timur,

You remind me once again about the importance of making a complete (and correct) statement. Thanks for that.
I think I wanted to say that when a table is accessed using an index range scan, the conventional wisdom said table blocks will
be read one-block-at-a-time (and hence might result in “DB File Sequential Read” waits). I think from 9i onwards, CBO was enhanced
to perform multiple single-block reads for a table (resulting from indexed range scan) in parallel, instead of accessing single (table) blocks (adjacent or not)
in a serial manner. I think Tom (as always) explains it nicely here

15 03 2011

Charles Hooper (06:09:33) :

Timur,
Thank you for pointing out “logical” and “physical” regarding the term “adjacent”.

Narendra,
Shortly after I posted my comment I thought about adding a couple of clarifications – my comment above might appear to be a little too harsh, essentially stating: “It happens this way, end of discussion.” There is a lot that I still do not understand about Oracle Database, so it is not my intention to say “End of discussion.” With my test case that I linked to above, you may see different results when testing different Oracle Database release versions or Standard/Enterprise edition (I saw db file sequential read waits on 10.2.0.2 Standard Edition).

I agree with you that the documentation link that I provided does not distinguish between adjacent and non-adjacent blocks when describing the “db file parallel read” wait event. Just because I said that something is true (in part 1 of the article series linked to in my previous comment) does not necessarily mean that it is true. However, a search of the Internet finds some agreement with my suggestion (disclaimer: I might have learned of this fact from this article), take a look at the article and the comments attached: http://jonathanlewis.wordpress.com/2006/12/15/index-operations/

15 03 2011

Narendra (07:27:27) :

Charles,

Two things about the Jonathan’s blog post that you have mentioned
1) It is dated 15th December 2006. Lots of things might have changed since (but it may still be valid). Just trying to read things in context of the time (as Jonathan himself says)
2) In the 2nd comment, Jonathan himself claims that it is his assumption, which was not proved then
As I said earlier, it was more of a “conventional wisdom” to say that Table Access By index range scan would (typically) cause “DB File Sequential Reads” waits. There would
always be cases where “DB File Scattered Read” waits would be observed but my understanding is it would be safe to treat such cases as exceptions (handled internally by Oracle) rather than rules.
I think I have managed to significantly divert the discussion from the topic of your original blog post so apologies for that.

16 03 2011

Charles Hooper (04:35:04) :

Narendra,

It is true that the comments in this article have drifted a bit from the original content of the article, but it has been an interesting conversation. In case it is not obvious to the casual reader of this blog, I think that I should point out that you have made excellent points in your comments. It would be interesting to see any insight that Jonathan or someone else could provide regarding the frequency of “db file scattered read” waits during index range scan operations – in my experience those operations appear to be infrequently used in the older Oracle Database release versions, but seem to be more common in more recent releases.

	Charles Hooper on Favorite Free Applications to…
	Zach on Favorite Free Applications to…
	Charles Hooper on Favorite Free Applications to…
	Zach on Favorite Free Applications to…
	Charles Hooper on Huge Presentation – Work…
	Dan on Huge Presentation – Work…
	Charles Hooper on Review of Synology DS415+, How…
	Matthias Clausen on Review of Synology DS415+, How…
	Charles Hooper on _SMALL_TABLE_THRESHOLD Paramet…
	mohd atif on _SMALL_TABLE_THRESHOLD Paramet…
	Charles Hooper on Free ANSI SQL to Oracle Specif…
	Kalin Agrawal on Free ANSI SQL to Oracle Specif…
	Charles Hooper on Large Page Support or Not…
	vaurob on Large Page Support or Not…
	Locating Oracle… on Finding a New Home for a Clien…

Charles Hooper's Oracle Notes