April 4, 2011
A couple of years ago a very well written article appeared in Oracle Magazine that described how to use ROWNUM in the WHERE clause, and also why using ROWNUM might not work quite as expected.
Let’s assume that we want to do something a bit unusual, as described in this forum thread. We set up the tables for the test case to see what is happening:
CREATE TABLE T1 AS SELECT ROWNUM C1, TRUNC(SYSDATE+ROWNUM) C2, LPAD('A',15,'A') PADDING FROM DUAL CONNECT BY LEVEL<=10000 ORDER BY DBMS_RANDOM.VALUE; CREATE TABLE T2 AS SELECT ROWNUM C1, TRUNC(SYSDATE+ROWNUM) C2, LPAD('A',15,'A') PADDING FROM DUAL CONNECT BY LEVEL<=10000 ORDER BY DBMS_RANDOM.VALUE; EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1') EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2')
In the above, 10,000 rows were inserted into the T1 and T2 tables in random order, as the tables were created. Trying a basic query, if we execute the following query, we should see six randomly selected rows:
SELECT * FROM T1 WHERE ROWNUM IN (6,5,4,3,2,1); C1 C2 PADDING ---- --------- --------------- 9546 23-MAY-37 AAAAAAAAAAAAAAA 4894 27-AUG-24 AAAAAAAAAAAAAAA 1106 14-APR-14 AAAAAAAAAAAAAAA 1144 22-MAY-14 AAAAAAAAAAAAAAA 2281 02-JUL-17 AAAAAAAAAAAAAAA 4832 26-JUN-24 AAAAAAAAAAAAAAA 6 rows selected.
And, if we had not read the article that is linked to at the start of this blog article, we might assume that the following would also return six rows:
SELECT * FROM T1 WHERE ROWNUM IN (7,6,5,4,3,2); no rows selected
How about something like the following? If there were only 10 rows in table T2, we might expect the following to return 10 rows from table T1 (a random selection of rows), or 10,000 rows if both tables T1 and T2 contain 10,000 rows:
SELECT * FROM T1 WHERE ROWNUM IN ( SELECT ROWNUM RN FROM T2); no rows selected
Interesting (and expected if you read the article that is linked to at the start of this blog article). Let’s take a look at the execution plan:
SET AUTOTRACE TRACEONLY EXPLAIN SELECT * FROM T1 WHERE ROWNUM IN ( SELECT ROWNUM RN FROM T2); Execution Plan ---------------------------------------------------------- Plan hash value: 881956856 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 100M| 2574M| 8 (0)| 00:00:01 | | 1 | COUNT | | | | | | |* 2 | FILTER | | | | | | | 3 | TABLE ACCESS FULL | T1 | 10000 | 263K| 6 (0)| 00:00:01 | |* 4 | FILTER | | | | | | | 5 | COUNT | | | | | | | 6 | TABLE ACCESS FULL| T2 | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter( EXISTS (???) 4 - filter(ROWNUM=ROWNUM)
Interesting, after applying the EXISTS (???) filter (plan ID line 2), the optimizer expects 10,000 rows to become 100,000,000 rows. Secondly, ff the value of ROWNUM is assigned only after the WHERE clause is evaluated, could the ROWNUM=ROWNUM filter (plan ID line 4) ever be true (and result in a row passing through)? I suspect that is why this query returns no rows – is there a better answer?
What about this query, which should yield the originally expected results (note that AUTOTRACE was still enabled):
SELECT T1.* FROM (SELECT T1.*, ROWNUM RN FROM T1) T1, (SELECT T2.*, ROWNUM RN FROM T2) T2 WHERE T1.RN=T2.RN; Execution Plan ---------------------------------------------------------- Plan hash value: 2458726244 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 54M| 17 (30)| 00:00:01 | |* 1 | HASH JOIN | | 1000K| 54M| 17 (30)| 00:00:01 | | 2 | VIEW | | 10000 | 126K| 6 (0)| 00:00:01 | | 3 | COUNT | | | | | | | 4 | TABLE ACCESS FULL| T2 | 10000 | | 6 (0)| 00:00:01 | | 5 | VIEW | | 10000 | 429K| 6 (0)| 00:00:01 | | 6 | COUNT | | | | | | | 7 | TABLE ACCESS FULL| T1 | 10000 | 263K| 6 (0)| 00:00:01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T1"."RN"="T2"."RN")
The optimizer is predicting that 1,000,000 rows will be returned, and the aliased ROWNUM values (“T1″.”RN”=”T2″.”RN”) are being compared. So, will this query return 0 rows, 10,000 rows, or 1,000,000 rows? Let’s find out:
SET AUTOTRACE OFF / C1 C2 PADDING RN ---------- --------- --------------- ---------- ... 8353 15-FEB-34 AAAAAAAAAAAAAAA 9989 8566 16-SEP-34 AAAAAAAAAAAAAAA 9990 8202 17-SEP-33 AAAAAAAAAAAAAAA 9991 7980 07-FEB-33 AAAAAAAAAAAAAAA 9992 6708 15-AUG-29 AAAAAAAAAAAAAAA 9993 5848 08-APR-27 AAAAAAAAAAAAAAA 9994 9006 30-NOV-35 AAAAAAAAAAAAAAA 9995 6423 03-NOV-28 AAAAAAAAAAAAAAA 9996 3272 19-MAR-20 AAAAAAAAAAAAAAA 9997 921 11-OCT-13 AAAAAAAAAAAAAAA 9998 7519 04-NOV-31 AAAAAAAAAAAAAAA 9999 9311 30-SEP-36 AAAAAAAAAAAAAAA 10000 10000 rows selected.
So, from the above, we find that by placing both halves of the query into inline views, and joining the aliased ROWNUM values, the query worked as expected.
Let’s try just placing what had been an IN list sub-query in the original query into an inline view to see what happens:
SELECT * FROM T1, (SELECT T2.*, ROWNUM RN FROM T2) T2 WHERE T1.ROWNUM=T2.RN; SQL> SELECT 2 * 3 FROM 4 T1, 5 (SELECT 6 T2.*, 7 ROWNUM RN 8 FROM 9 T2) T2 10 WHERE 11 T1.ROWNUM=T2.RN; T1.ROWNUM=T2.RN * ERROR at line 11: ORA-01747: invalid user.table.column, table.column, or column specification
Needless to say, that did not work. One final thought: ROWNUM is not a fixed attribute of a specific row in a table – if you attempt to use it as such, you will not achieve the results that you are expecting.
Yeah, ROWNUM is just a run-time variable, incremented as rows pass through the COUNT STOPKEY row source. Interestingly it’s just an UB4 datatype, so the max rownum is around 4 billion (2^32).
So, queries having WHERE ROWNUM < 5 000 000 000 would actually return wrong results (only 4 Billion rows are returned, then the COUNT STOPKEY rowsource silently bails out):
http://blog.tanelpoder.com/2010/10/25/count-stopkey-operation-the-where-rownum/
Not that this is relevant to everyday operations, but I discovered this when moving tens of billions of rows onto an Exadata cluster…
Interesting – thanks for providing the link. Do you think that the number wrapped around 0 once it hit 4294967296-1 (multiple rows with ROWNUM values of 1, 2, 3, etc.)?
Partially related old news to this article, ROWNUM can lead to poor execution plans in Oracle Database release versions prior to 11.2.0.1:
https://hoopercharles.wordpress.com/2009/12/09/sql-%e2%80%93-bad-execution-plan-caused-by-rownum-row_number-is-possible-fix/
Hi charles,
It was very nice interesting concept/demo you have put forward to discuss. I have gone through the demo which you stated across, perhaps in the example you have utilized across the “EXISTS” operator across T1 and T2 and results in zero rows. If check basic thing, rownum does assigns the rowvalue numeric value for the rows which get satisfies the “where” criteria of segment and it that satisfies then results into number assigned to Rownum. Looking into that concept the rownum=rownum does results into zero rows (Optimizer does have statistics and expected to scan 10000 rows and it does – good thing we need to understand the where clause or filter predicate on which rownum purely depends on). If you consider the other test cases of In and inline view, the columns/rows apart from rownum does return and satisfies the criteria.
let me know your comments.. too !!
A couple of possibly interesting items. The test case for this blog article was created using Oracle Database 10.2.0.2. A 10053 trace file showed that the final query after transformations looked like the following (additional spaces and CRLF added to help readability):
I suspect that the above is failing because ROWNUM at that point should have been undefined (similar to NULL). Also interesting is to see that the execution plan generated by a DBMS_XPLAN, pulling the execution plan from memory, is a bit different from the output of AUTOTRACE:
Notice that the filter predicate for plan ID 2 no longer shows “2 – filter( EXISTS (???)”, but instead shows “2 – filter( IS NOT NULL)”, which I believe means that no rows can be returned. Adding a NO_QUERY_TRANSFORMATION hint does not alter the generated execution plan.
Now, repeat the test on Oracle Database 11.1.0.7. AUTOTRACE outputs an execution plan that looks like this:
The filter predicate “2 – filter( EXISTS (???)” that had appeared in the output from 10.2.0.2 changed to “2 – filter( EXISTS ()” . The DBMS_XPLAN output looks like this on 11.1.0.7 (same as 10.2.0.2):
I think that we have to keep in mind that ROWNUM does not have a value (is NULL?) until the result row source is generated, and any equality comparison with a NULL is false.
If someone else has a better explanation – please feel free to post a comment.
Edit: I just noticed that the <not feasible> did not display in the filter predicates.
Hi Charles…
when I executed the query in 11.2.0.1.0 the query returning rows..
I am rechecking the test case… i will update you soon