Using ROWNUM in the Where Clause Causes Problems

4 04 2011

April 4, 2011

A couple of years ago a very well written article appeared in Oracle Magazine that described how to use ROWNUM in the WHERE clause, and also why using ROWNUM might not work quite as expected.

Let’s assume that we want to do something a bit unusual, as described in this forum thread.  We set up the tables for the test case to see what is happening:

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  TRUNC(SYSDATE+ROWNUM) C2,
  LPAD('A',15,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=10000
ORDER BY
  DBMS_RANDOM.VALUE;

CREATE TABLE T2 AS
SELECT
  ROWNUM C1,
  TRUNC(SYSDATE+ROWNUM) C2,
  LPAD('A',15,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=10000
ORDER BY
  DBMS_RANDOM.VALUE;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2') 

In the above, 10,000 rows were inserted into the T1 and T2 tables in random order, as the tables were created.  Trying a basic query, if we execute the following query, we should see six randomly selected rows:

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (6,5,4,3,2,1);

  C1 C2        PADDING
---- --------- ---------------
9546 23-MAY-37 AAAAAAAAAAAAAAA
4894 27-AUG-24 AAAAAAAAAAAAAAA
1106 14-APR-14 AAAAAAAAAAAAAAA
1144 22-MAY-14 AAAAAAAAAAAAAAA
2281 02-JUL-17 AAAAAAAAAAAAAAA
4832 26-JUN-24 AAAAAAAAAAAAAAA

6 rows selected. 

And, if we had not read the article that is linked to at the start of this blog article, we might assume that the following would also return six rows:

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (7,6,5,4,3,2); 

no rows selected

How about something like the following?  If there were only 10 rows in table T2, we might expect the following to return 10 rows from table T1 (a random selection of rows), or 10,000 rows if both tables T1 and T2 contain 10,000 rows:

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (
    SELECT
      ROWNUM RN
    FROM
      T2);

no rows selected 

Interesting (and expected if you read the article that is linked to at the start of this blog article).  Let’s take a look at the execution plan:

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (
    SELECT
      ROWNUM RN
    FROM
      T2);

Execution Plan
----------------------------------------------------------
Plan hash value: 881956856

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   100M|  2574M|     8   (0)| 00:00:01 |
|   1 |  COUNT                |      |       |       |            |          |
|*  2 |   FILTER              |      |       |       |            |          |
|   3 |    TABLE ACCESS FULL  | T1   | 10000 |   263K|     6   (0)| 00:00:01 |
|*  4 |    FILTER             |      |       |       |            |          |
|   5 |     COUNT             |      |       |       |            |          |
|   6 |      TABLE ACCESS FULL| T2   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( EXISTS (???)
   4 - filter(ROWNUM=ROWNUM) 

Interesting, after applying the EXISTS (???) filter (plan ID line 2), the optimizer expects 10,000 rows to become 100,000,000 rows.  Secondly, ff the value of ROWNUM is assigned only after the WHERE clause is evaluated, could the ROWNUM=ROWNUM filter (plan ID line 4) ever be true (and result in a row passing through)?  I suspect that is why this query returns no rows – is there a better answer?

What about this query, which should yield the originally expected results (note that AUTOTRACE was still enabled):

SELECT
  T1.*
FROM
  (SELECT
     T1.*,
     ROWNUM RN
   FROM
     T1) T1,
  (SELECT
     T2.*,
     ROWNUM RN
   FROM
     T2) T2
WHERE
  T1.RN=T2.RN;

Execution Plan
----------------------------------------------------------
Plan hash value: 2458726244

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|    54M|    17  (30)| 00:00:01 |
|*  1 |  HASH JOIN           |      |  1000K|    54M|    17  (30)| 00:00:01 |
|   2 |   VIEW               |      | 10000 |   126K|     6   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| T2   | 10000 |       |     6   (0)| 00:00:01 |
|   5 |   VIEW               |      | 10000 |   429K|     6   (0)| 00:00:01 |
|   6 |    COUNT             |      |       |       |            |          |
|   7 |     TABLE ACCESS FULL| T1   | 10000 |   263K|     6   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."RN"="T2"."RN") 

The optimizer is predicting that 1,000,000 rows will be returned, and the aliased ROWNUM values (“T1″.”RN”=”T2″.”RN”) are being compared.  So, will this query return 0 rows, 10,000 rows, or 1,000,000 rows?  Let’s find out:

SET AUTOTRACE OFF

/

        C1 C2        PADDING                 RN
---------- --------- --------------- ----------
...
      8353 15-FEB-34 AAAAAAAAAAAAAAA       9989
      8566 16-SEP-34 AAAAAAAAAAAAAAA       9990
      8202 17-SEP-33 AAAAAAAAAAAAAAA       9991
      7980 07-FEB-33 AAAAAAAAAAAAAAA       9992
      6708 15-AUG-29 AAAAAAAAAAAAAAA       9993
      5848 08-APR-27 AAAAAAAAAAAAAAA       9994
      9006 30-NOV-35 AAAAAAAAAAAAAAA       9995
      6423 03-NOV-28 AAAAAAAAAAAAAAA       9996
      3272 19-MAR-20 AAAAAAAAAAAAAAA       9997
       921 11-OCT-13 AAAAAAAAAAAAAAA       9998
      7519 04-NOV-31 AAAAAAAAAAAAAAA       9999
      9311 30-SEP-36 AAAAAAAAAAAAAAA      10000

10000 rows selected.

So, from the above, we find that by placing both halves of the query into inline views, and joining the aliased ROWNUM values, the query worked as expected.

Let’s try just placing what had been an IN list sub-query in the original query into an inline view to see what happens:

SELECT
  *
FROM
  T1,
  (SELECT
     T2.*,
     ROWNUM RN
   FROM
     T2) T2
WHERE
  T1.ROWNUM=T2.RN;

SQL> SELECT
  2    *
  3  FROM
  4    T1,
  5    (SELECT
  6       T2.*,
  7       ROWNUM RN
  8     FROM
  9       T2) T2
 10  WHERE
 11    T1.ROWNUM=T2.RN;
  T1.ROWNUM=T2.RN
     *
ERROR at line 11:
ORA-01747: invalid user.table.column, table.column, or column specification 

Needless to say, that did not work.  One final thought: ROWNUM is not a fixed attribute of a specific row in a table – if you attempt to use it as such, you will not achieve the results that you are expecting.








Follow

Get every new post delivered to your Inbox.

Join 139 other followers