Using ROWNUM in the Where Clause Causes Problems

4 04 2011

April 4, 2011

A couple of years ago a very well written article appeared in Oracle Magazine that described how to use ROWNUM in the WHERE clause, and also why using ROWNUM might not work quite as expected.

Let’s assume that we want to do something a bit unusual, as described in this forum thread. We set up the tables for the test case to see what is happening:

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  TRUNC(SYSDATE+ROWNUM) C2,
  LPAD('A',15,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=10000
ORDER BY
  DBMS_RANDOM.VALUE;

CREATE TABLE T2 AS
SELECT
  ROWNUM C1,
  TRUNC(SYSDATE+ROWNUM) C2,
  LPAD('A',15,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=10000
ORDER BY
  DBMS_RANDOM.VALUE;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2')

In the above, 10,000 rows were inserted into the T1 and T2 tables in random order, as the tables were created. Trying a basic query, if we execute the following query, we should see six randomly selected rows:

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (6,5,4,3,2,1);

  C1 C2        PADDING
---- --------- ---------------
9546 23-MAY-37 AAAAAAAAAAAAAAA
4894 27-AUG-24 AAAAAAAAAAAAAAA
1106 14-APR-14 AAAAAAAAAAAAAAA
1144 22-MAY-14 AAAAAAAAAAAAAAA
2281 02-JUL-17 AAAAAAAAAAAAAAA
4832 26-JUN-24 AAAAAAAAAAAAAAA

6 rows selected.

And, if we had not read the article that is linked to at the start of this blog article, we might assume that the following would also return six rows:

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (7,6,5,4,3,2); 

no rows selected

How about something like the following? If there were only 10 rows in table T2, we might expect the following to return 10 rows from table T1 (a random selection of rows), or 10,000 rows if both tables T1 and T2 contain 10,000 rows:

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (
    SELECT
      ROWNUM RN
    FROM
      T2);

no rows selected

Interesting (and expected if you read the article that is linked to at the start of this blog article). Let’s take a look at the execution plan:

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  *
FROM
  T1
WHERE
  ROWNUM IN (
    SELECT
      ROWNUM RN
    FROM
      T2);

Execution Plan
----------------------------------------------------------
Plan hash value: 881956856

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   100M|  2574M|     8   (0)| 00:00:01 |
|   1 |  COUNT                |      |       |       |            |          |
|*  2 |   FILTER              |      |       |       |            |          |
|   3 |    TABLE ACCESS FULL  | T1   | 10000 |   263K|     6   (0)| 00:00:01 |
|*  4 |    FILTER             |      |       |       |            |          |
|   5 |     COUNT             |      |       |       |            |          |
|   6 |      TABLE ACCESS FULL| T2   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( EXISTS (???)
   4 - filter(ROWNUM=ROWNUM)

Interesting, after applying the EXISTS (???) filter (plan ID line 2), the optimizer expects 10,000 rows to become 100,000,000 rows. Secondly, ff the value of ROWNUM is assigned only after the WHERE clause is evaluated, could the ROWNUM=ROWNUM filter (plan ID line 4) ever be true (and result in a row passing through)? I suspect that is why this query returns no rows – is there a better answer?

What about this query, which should yield the originally expected results (note that AUTOTRACE was still enabled):

SELECT
  T1.*
FROM
  (SELECT
     T1.*,
     ROWNUM RN
   FROM
     T1) T1,
  (SELECT
     T2.*,
     ROWNUM RN
   FROM
     T2) T2
WHERE
  T1.RN=T2.RN;

Execution Plan
----------------------------------------------------------
Plan hash value: 2458726244

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|    54M|    17  (30)| 00:00:01 |
|*  1 |  HASH JOIN           |      |  1000K|    54M|    17  (30)| 00:00:01 |
|   2 |   VIEW               |      | 10000 |   126K|     6   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| T2   | 10000 |       |     6   (0)| 00:00:01 |
|   5 |   VIEW               |      | 10000 |   429K|     6   (0)| 00:00:01 |
|   6 |    COUNT             |      |       |       |            |          |
|   7 |     TABLE ACCESS FULL| T1   | 10000 |   263K|     6   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."RN"="T2"."RN")

The optimizer is predicting that 1,000,000 rows will be returned, and the aliased ROWNUM values (“T1″.”RN”=”T2″.”RN”) are being compared. So, will this query return 0 rows, 10,000 rows, or 1,000,000 rows? Let’s find out:

SET AUTOTRACE OFF

/

        C1 C2        PADDING                 RN
---------- --------- --------------- ----------
...
      8353 15-FEB-34 AAAAAAAAAAAAAAA       9989
      8566 16-SEP-34 AAAAAAAAAAAAAAA       9990
      8202 17-SEP-33 AAAAAAAAAAAAAAA       9991
      7980 07-FEB-33 AAAAAAAAAAAAAAA       9992
      6708 15-AUG-29 AAAAAAAAAAAAAAA       9993
      5848 08-APR-27 AAAAAAAAAAAAAAA       9994
      9006 30-NOV-35 AAAAAAAAAAAAAAA       9995
      6423 03-NOV-28 AAAAAAAAAAAAAAA       9996
      3272 19-MAR-20 AAAAAAAAAAAAAAA       9997
       921 11-OCT-13 AAAAAAAAAAAAAAA       9998
      7519 04-NOV-31 AAAAAAAAAAAAAAA       9999
      9311 30-SEP-36 AAAAAAAAAAAAAAA      10000

10000 rows selected.

So, from the above, we find that by placing both halves of the query into inline views, and joining the aliased ROWNUM values, the query worked as expected.

Let’s try just placing what had been an IN list sub-query in the original query into an inline view to see what happens:

SELECT
  *
FROM
  T1,
  (SELECT
     T2.*,
     ROWNUM RN
   FROM
     T2) T2
WHERE
  T1.ROWNUM=T2.RN;

SQL> SELECT
  2    *
  3  FROM
  4    T1,
  5    (SELECT
  6       T2.*,
  7       ROWNUM RN
  8     FROM
  9       T2) T2
 10  WHERE
 11    T1.ROWNUM=T2.RN;
  T1.ROWNUM=T2.RN
     *
ERROR at line 11:
ORA-01747: invalid user.table.column, table.column, or column specification

Needless to say, that did not work. One final thought: ROWNUM is not a fixed attribute of a specific row in a table – if you attempt to use it as such, you will not achieve the results that you are expecting.

Actions

Information

Date : April 4, 2011
Categories : SQL

5 responses

4 04 2011

Tanel Poder (08:33:35) :

Yeah, ROWNUM is just a run-time variable, incremented as rows pass through the COUNT STOPKEY row source. Interestingly it’s just an UB4 datatype, so the max rownum is around 4 billion (2^32).

So, queries having WHERE ROWNUM < 5 000 000 000 would actually return wrong results (only 4 Billion rows are returned, then the COUNT STOPKEY rowsource silently bails out):

http://blog.tanelpoder.com/2010/10/25/count-stopkey-operation-the-where-rownum/

Not that this is relevant to everyday operations, but I discovered this when moving tens of billions of rows onto an Exadata cluster…

4 04 2011

Charles Hooper (12:37:16) :

Interesting – thanks for providing the link. Do you think that the number wrapped around 0 once it hit 4294967296-1 (multiple rows with ROWNUM values of 1, 2, 3, etc.)?

Partially related old news to this article, ROWNUM can lead to poor execution plans in Oracle Database release versions prior to 11.2.0.1:
https://hoopercharles.wordpress.com/2009/12/09/sql-%e2%80%93-bad-execution-plan-caused-by-rownum-row_number-is-possible-fix/

5 04 2011

Pavan Kumar N (01:00:32) :

Hi charles,

It was very nice interesting concept/demo you have put forward to discuss. I have gone through the demo which you stated across, perhaps in the example you have utilized across the “EXISTS” operator across T1 and T2 and results in zero rows. If check basic thing, rownum does assigns the rowvalue numeric value for the rows which get satisfies the “where” criteria of segment and it that satisfies then results into number assigned to Rownum. Looking into that concept the rownum=rownum does results into zero rows (Optimizer does have statistics and expected to scan 10000 rows and it does – good thing we need to understand the where clause or filter predicate on which rownum purely depends on). If you consider the other test cases of In and inline view, the columns/rows apart from rownum does return and satisfies the criteria.
let me know your comments.. too !!

5 04 2011

Charles Hooper (08:22:10) :

A couple of possibly interesting items. The test case for this blog article was created using Oracle Database 10.2.0.2. A 10053 trace file showed that the final query after transformations looked like the following (additional spaces and CRLF added to help readability):

SELECT /*+ */
  "T1"."C1" "C1",
  "T1"."C2" "C2",
  "T1"."PADDING" "PADDING"
FROM
  (SELECT /*+ */ DISTINCT
     ROWNUM "RN"
   FROM
     "TESTUSER"."T2" "T2") "VW_NSO_2",
  "TESTUSER"."T1" "T1"
WHERE
  "VW_NSO_2"."RN"=ROWNUM

I suspect that the above is failing because ROWNUM at that point should have been undefined (similar to NULL). Also interesting is to see that the execution plan generated by a DBMS_XPLAN, pulling the execution plan from memory, is a bit different from the output of AUTOTRACE:

SQL_ID  4kk2gag8fs4jt, child number 0
-------------------------------------
SELECT T1.* FROM T1 WHERE     ROWNUM IN (      SELECT        ROWNUM RN      
FROM        T2)
 
Plan hash value: 881956856
 
----------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   1 |  COUNT                |      |      1 |        |      0 |00:00:00.03 |      49 |
|*  2 |   FILTER              |      |      1 |        |      0 |00:00:00.03 |      49 |
|   3 |    TABLE ACCESS FULL  | T1   |      0 |  10000 |      0 |00:00:00.01 |       0 |
|*  4 |    FILTER             |      |      1 |        |      0 |00:00:00.03 |      49 |
|   5 |     COUNT             |      |      1 |        |  10000 |00:00:00.02 |      49 |
|   6 |      TABLE ACCESS FULL| T2   |      1 |      1 |  10000 |00:00:00.01 |      49 |
----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( IS NOT NULL)
   4 - filter(ROWNUM=ROWNUM)

Notice that the filter predicate for plan ID 2 no longer shows “2 – filter( EXISTS (???)”, but instead shows “2 – filter( IS NOT NULL)”, which I believe means that no rows can be returned. Adding a NO_QUERY_TRANSFORMATION hint does not alter the generated execution plan.

Now, repeat the test on Oracle Database 11.1.0.7. AUTOTRACE outputs an execution plan that looks like this:

Execution Plan
----------------------------------------------------------
Plan hash value: 881956856

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   100M|  2574M|    10   (0)| 00:00:01 |
|   1 |  COUNT                |      |       |       |            |          |
|*  2 |   FILTER              |      |       |       |            |          |
|   3 |    TABLE ACCESS FULL  | T1   | 10000 |   263K|     8   (0)| 00:00:01 |
|*  4 |    FILTER             |      |       |       |            |          |
|   5 |     COUNT             |      |       |       |            |          |
|   6 |      TABLE ACCESS FULL| T2   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( EXISTS (<not feasible>)
   4 - filter(ROWNUM=ROWNUM)

The filter predicate “2 – filter( EXISTS (???)” that had appeared in the output from 10.2.0.2 changed to “2 – filter( EXISTS ()” . The DBMS_XPLAN output looks like this on 11.1.0.7 (same as 10.2.0.2):

SQL_ID  6ywscwr5ambz0, child number 0
-------------------------------------
SELECT /*+ */ T1.* FROM T1 WHERE     ROWNUM IN (      SELECT        
ROWNUM RN      FROM        T2)
 
Plan hash value: 881956856
 
----------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |      0 |00:00:00.01 |      49 |
|   1 |  COUNT                |      |      1 |        |      0 |00:00:00.01 |      49 |
|*  2 |   FILTER              |      |      1 |        |      0 |00:00:00.01 |      49 |
|   3 |    TABLE ACCESS FULL  | T1   |      0 |  10000 |      0 |00:00:00.01 |       0 |
|*  4 |    FILTER             |      |      1 |        |      0 |00:00:00.01 |      49 |
|   5 |     COUNT             |      |      1 |        |  10000 |00:00:00.01 |      49 |
|   6 |      TABLE ACCESS FULL| T2   |      1 |      1 |  10000 |00:00:00.01 |      49 |
----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( IS NOT NULL)
   4 - filter(ROWNUM=ROWNUM)

I think that we have to keep in mind that ROWNUM does not have a value (is NULL?) until the result row source is generated, and any equality comparison with a NULL is false.

If someone else has a better explanation – please feel free to post a comment.

Edit: I just noticed that the <not feasible> did not display in the filter predicates.

5 04 2011

Pavan kumar N (13:48:23) :

Hi Charles…

when I executed the query in 11.2.0.1.0 the query returning rows..

SQL> SELECT /*+ */
  2    "T1"."C1" "C1",
  3    "T1"."C2" "C2",
  4    "T1"."PADDING" "PADDING"
  5  FROM
  6    (SELECT /*+ */ DISTINCT
  7       ROWNUM "RN"
  8     FROM
  9        "T2") "VW_NSO_2",
 10     "T1"
 11  WHERE
 12    "VW_NSO_2"."RN"=ROWNUM;

Execution Plan
----------------------------------------------------------
Plan hash value: 1330072140

--------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |   100M|  4196M|   125K  (1)| 00:25:09 |
|   1 |  COUNT                  |      |       |       |            |          |
|*  2 |   FILTER                |      |       |       |            |          |
|   3 |    MERGE JOIN CARTESIAN |      |   100M|  4196M|   125K  (1)| 00:25:09 |
|   4 |     VIEW                |      | 10000 |   126K|    15   (7)| 00:00:01 |
|   5 |      HASH UNIQUE        |      | 10000 |       |    15   (7)| 00:00:01 |
|   6 |       COUNT             |      |       |       |            |          |
|   7 |        TABLE ACCESS FULL| T2   | 10000 |       |    14   (0)| 00:00:01 |
|   8 |     BUFFER SORT         |      | 10000 |   302K|   125K  (1)| 00:25:09 |
|   9 |      TABLE ACCESS FULL  | T1   | 10000 |   302K|    13   (8)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("VW_NSO_2"."RN"=ROWNUM)

Note
-----
   - dynamic sampling used for this statement (level=2)

I am rechecking the test case… i will update you soon

	Charles Hooper on Favorite Free Applications to…
	Zach on Favorite Free Applications to…
	Charles Hooper on Favorite Free Applications to…
	Zach on Favorite Free Applications to…
	Charles Hooper on Huge Presentation – Work…
	Dan on Huge Presentation – Work…
	Charles Hooper on Review of Synology DS415+, How…
	Matthias Clausen on Review of Synology DS415+, How…
	Charles Hooper on _SMALL_TABLE_THRESHOLD Paramet…
	mohd atif on _SMALL_TABLE_THRESHOLD Paramet…
	Charles Hooper on Free ANSI SQL to Oracle Specif…
	Kalin Agrawal on Free ANSI SQL to Oracle Specif…
	Charles Hooper on Large Page Support or Not…
	vaurob on Large Page Support or Not…
	Locating Oracle… on Finding a New Home for a Clien…

Charles Hooper's Oracle Notes