Improving Performance by Using a Cartesian Join 2

18 05 2010

May 18, 2010

(Back to the Previous Post in the Series)

For a while I have wondered why Cartesian joins might happen when all necessary join conditions are provided in a SQL statement.  Sure, an explain plan will show that a Cartesian join is used when Oracle’s optimizer believes that one of the row sources will return a single row, as seen in this example from a discussion forum:

SELECT
  TBLPERSON.SNAME,
  TBLPERSON.FNAME,
  TO_CHAR(TBLWARDSHIFTS.SHIFTSTART,'HH24:MI') AS REQSTART,
  TO_CHAR(TBLWARDSHIFTS.SHIFTEND,'HH24:MI') AS REQEND,
  TBLREQUIREMENTS.RDATE
FROM
  TBLAVAILABILITY,
  TBLREQUIREMENTS,
  TBLWARDSHIFTS,
  TBLPERSON
WHERE
  TBLAVAILABILITY.ADATE = TBLREQUIREMENTS.RDATE
  AND TBLAVAILABILITY.PERSONID = TBLPERSON.PERSONID
  AND TBLREQUIREMENTS.WARDSHIFTID = TBLWARDSHIFTS.WARDSHIFTID
  AND TO_CHAR(TBLREQUIREMENTS.RDATE,'MM')=TO_CHAR(SYSDATE,'MM')
  AND TO_NUMBER(TO_CHAR(TBLWARDSHIFTS.SHIFTSTART,'HH24'))>=15
  AND TO_NUMBER(TO_CHAR(TBLWARDSHIFTS.SHIFTSTART,'HH24'))<20
  AND TBLAVAILABILITY.ANYLATE=1;

---------------------------------------------------------------------------------------------------------
| Id | Operation                      | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)|Time      |
---------------------------------------------------------------------------------------------------------
|  1 |    NESTED LOOPS                |                 |       |       |       |            |          |
|  2 |     NESTED LOOPS               |                 |   233 | 15611 |       |  1469   (2)| 00:00:18 |
|* 3 |      HASH JOIN                 |                 |   233 | 10951 |       |  1003   (3)| 00:00:13 |
|* 4 |       TABLE ACCESS FULL        | TBLREQUIREMENTS |  5031 | 60372 |       |   980   (3)| 00:00:12 |
|  5 |       MERGE JOIN CARTESIAN     |                 |  3061 |   104K|       |    22   (0)| 00:00:01 |
|* 6 |        TABLE ACCESS FULL       | TBLWARDSHIFTS   |     1 |    20 |       |     3   (0)| 00:00:01 |
|  7 |        BUFFER SORT             |                 |  4639 | 69585 |       |    19   (0)| 00:00:01 |
|* 8 |         TABLE ACCESS FULL      | TBLAVAILABILITY |  4639 | 69585 |       |    19   (0)| 00:00:01 |
|* 9 |      INDEX UNIQUE SCAN         | PK5             |     1 |       |       |     1   (0)| 00:00:01 |
| 10 |     TABLE ACCESS BY INDEX ROWID| TBLPERSON       |     1 |    20 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - access("TBLAVAILABILITY"."ADATE"="TBLREQUIREMENTS"."RDATE" AND
      "TBLREQUIREMENTS"."WARDSHIFTID"="TBLWARDSHIFTS"."WARDSHIFTID")
  4 - filter(TO_CHAR(INTERNAL_FUNCTION("TBLREQUIREMENTS"."RDATE"),'MM')=TO_CHAR(SYSDATE@!,'MM'))
  6 - filter(TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("TBLWARDSHIFTS"."SHIFTSTART"),'HH24'))>=15 AND
      TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("TBLWARDSHIFTS"."SHIFTSTART"),'HH24'))<20)
  8 - filter("TBLAVAILABILITY"."ANYLATE"=1)
  9 - access("TBLAVAILABILITY"."PERSONID"="TBLPERSON"."PERSONID")

We can see that all of the tables are linked:

TBLAVAILABILITY ~ TBLREQUIREMENTS
TBLAVAILABILITY ~ TBLPERSON
TBLREQUIREMENTS ~ TBLWARDSHIFTS

So, what could be leading Oracle to calculate that the predicates applied to TBLWARDSHIFTS will cause only a single row to be returned?  Might it be the functions that are applied to the TBLWARDSHIFTS.SHIFTSTART column and/or the two inequalities applied to that column in the WHERE clause?  What about out of date statistics?  Certainly, a 10053 trace might help solve part of the mystery.  An interesting comment by Tanel Poder in this blog article suggests that a MERGE JOIN CARTESIAN is similar to a NESTED LOOP operation, just without filtering, so maybe Cartesian joins are not all that bad.

The above example shows a case where a Cartesian join may hurt performance, but can it also help performance?  In this blog article comment I provided a test case that included unhinted and hinted (forced) execution plans with and without Cartesian joins.  For example, in one 11.1.0.7 database, the following SQL statement using the tables T1, T2, T3, and T4 from my previous blog article comment:

SELECT
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3) 
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

Produced this execution plan:

Plan hash value: 348785823

---------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  2000K|   427M|       | 24447   (2)| 00:01:39 |
|*  1 |  HASH JOIN             |      |  2000K|   427M|  2488K| 24447   (2)| 00:01:39 |
|*  2 |   HASH JOIN            |      | 20000 |  2246K|       | 10984   (2)| 00:00:45 |
|   3 |    MERGE JOIN CARTESIAN|      |     2 |    12 |       |     5  (20)| 00:00:01 |
|   4 |     SORT UNIQUE        |      |     2 |     6 |       |     2   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL | T4   |     2 |     6 |       |     2   (0)| 00:00:01 |
|   6 |     BUFFER SORT        |      |     2 |     6 |       |            |          |
|   7 |      SORT UNIQUE       |      |     2 |     6 |       |     2   (0)| 00:00:01 |
|   8 |       TABLE ACCESS FULL| T3   |     2 |     6 |       |     2   (0)| 00:00:01 |
|   9 |    TABLE ACCESS FULL   | T2   |  1000K|   103M|       | 10967   (1)| 00:00:45 |
|  10 |   TABLE ACCESS FULL    | T1   |  1000K|   103M|       | 10967   (1)| 00:00:45 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

Interesting, a merge join Cartesian operation between two unrelated tables – the tables in the subqueries that applied to tables T1 and T2.  Another Oracle 11.1.0.7 database produced this execution plan for the same unhinted SQL statement:

Plan hash value: 1754840566

----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes |TempSpc|Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |  4000K|   854M|       |42042   (4)| 00:08:25 |
|*  1 |  HASH JOIN RIGHT SEMI          |           |  4000K|   854M|       |42042   (4)| 00:08:25 |
|   2 |   TABLE ACCESS FULL            | T3        |     2 |     6 |       |    3   (0)| 00:00:01 |
|*  3 |   HASH JOIN                    |           |   200M|    41G|  2424K|41385   (2)| 00:08:17 |
|   4 |    NESTED LOOPS                |           |       |       |       |           |          |
|   5 |     NESTED LOOPS               |           | 20000 |  2187K|       |10026   (1)| 00:02:01 |
|   6 |      SORT UNIQUE               |           |     2 |     6 |       |    3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL        | T4        |     2 |     6 |       |    3   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN          | IND_T2_C2 | 10000 |       |       |   20   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID| T2        | 10000 |  1064K|       |10022   (1)| 00:02:01 |
|  10 |    TABLE ACCESS FULL           | T1        |  1000K|   103M|       |24864   (1)| 00:04:59 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="C1")
   3 - access("T1"."C2"="T2"."C2")
   8 - access("T2"."C2"="C1")

Notice that the MERGE JOIN CARTESIAN operation did not appear this time.  Despite the output of the Time column, the second server is considerably faster than the first.  To obtain the same execution plan on the second server I had to supply the following hint:

/*+ LEADING(T4, T3, T2, T1) USE_HASH(T1) USE_HASH(T2) */

With the above hint added to the SQL statement, the execution plan on the second server looked like this:

Plan hash value: 348785823

--------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  4000K|   854M|       | 55716   (1)|00:11:09 |
|*  1 |  HASH JOIN             |      |  4000K|   854M|  4968K| 55716   (1)|00:11:09 |
|*  2 |   HASH JOIN            |      | 40000 |  4492K|       | 24874   (1)|00:04:59 |
|   3 |    MERGE JOIN CARTESIAN|      |     4 |    24 |       |     7  (15)|00:00:01 |
|   4 |     SORT UNIQUE        |      |     2 |     6 |       |     3   (0)|00:00:01 |
|   5 |      TABLE ACCESS FULL | T4   |     2 |     6 |       |     3   (0)|00:00:01 |
|   6 |     BUFFER SORT        |      |     2 |     6 |       |     4  (25)|00:00:01 |
|   7 |      SORT UNIQUE       |      |     2 |     6 |       |     3   (0)|00:00:01 |
|   8 |       TABLE ACCESS FULL| T3   |     2 |     6 |       |     3   (0)|00:00:01 |
|   9 |    TABLE ACCESS FULL   | T2   |  1000K|   103M|       | 24864   (1)|00:04:59 |
|  10 |   TABLE ACCESS FULL    | T1   |  1000K|   103M|       | 24864   (1)|00:04:59 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

As you can see, the predicted execution time is longer.  But, how accurate is the time prediction?  To make this a little more interesting, let’s try a couple of more hints on the second server:

/*+ LEADING(T1, T3, T4) */

Execution Plan
----------------------------------------------------------
Plan hash value: 3921125646

--------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |  2020K|   431M|       | 72263   (1)|00:14:28 |
|*  1 |  HASH JOIN            |      |  2020K|   431M|  2512K| 72263   (1)|00:14:28 |
|   2 |   MERGE JOIN CARTESIAN|      | 20202 |  2268K|       | 41547   (1)|00:08:19 |
|*  3 |    HASH JOIN SEMI     |      | 10101 |  1104K|   115M| 30593   (1)|00:06:08 |
|   4 |     TABLE ACCESS FULL | T1   |  1000K|   103M|       | 24864   (1)|00:04:59 |
|   5 |     TABLE ACCESS FULL | T3   |     2 |     6 |       |     3   (0)|00:00:01 |
|   6 |    BUFFER SORT        |      |     2 |     6 |       | 41544   (1)|00:08:19 |
|   7 |     SORT UNIQUE       |      |     2 |     6 |       |     1   (0)|00:00:01 |
|   8 |      TABLE ACCESS FULL| T4   |     2 |     6 |       |     1   (0)|00:00:01 |
|   9 |   TABLE ACCESS FULL   | T2   |  1000K|   103M|       | 24864   (1)|00:04:59 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T2"."C2"="C1")
   3 - access("T1"."C2"="C1")

/*+ LEADING(T1, T3, T2) */

Plan hash value: 1368374064

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  2020K|   431M|       | 56120   (2)|00:11:14 |
|*  1 |  HASH JOIN RIGHT SEMI|      |  2020K|   431M|       | 56120   (2)|00:11:14 |
|   2 |   TABLE ACCESS FULL  | T4   |     2 |     6 |       |     3   (0)|00:00:01 |
|*  3 |   HASH JOIN          |      |   101M|    20G|       | 55787   (1)|00:11:10 |
|*  4 |    HASH JOIN SEMI    |      | 10101 |  1104K|   115M| 30593   (1)|00:06:08 |
|   5 |     TABLE ACCESS FULL| T1   |  1000K|   103M|       | 24864   (1)|00:04:59 |
|   6 |     TABLE ACCESS FULL| T3   |     2 |     6 |       |     3   (0)|00:00:01 |
|   7 |    TABLE ACCESS FULL | T2   |  1000K|   103M|       | 24864   (1)|00:04:59 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."C2"="C1")
   3 - access("T1"."C2"="T2"."C2")
   4 - access("T1"."C2"="C1")

Note that there was no Cartesian join in the above execution plan – that was the first hinted execution plan on the first server that did not yield a Cartesian join.

/*+ LEADING(T3, T4) */

Execution Plan
----------------------------------------------------------
Plan hash value: 2581882832

--------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes |TempSpc|Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |  2000K|   427M|       |50767   (1)| 00:10:10 |
|*  1 |  HASH JOIN                    |           |  2000K|   427M|  2488K|50767   (1)| 00:10:10 |
|   2 |   NESTED LOOPS                |           |       |       |       |           |          |
|   3 |    NESTED LOOPS               |           | 20000 |  2246K|       |20052   (1)| 00:04:01 |
|   4 |     MERGE JOIN CARTESIAN      |           |     2 |    12 |       |    7  (15)| 00:00:01 |
|   5 |      SORT UNIQUE              |           |     2 |     6 |       |    3   (0)| 00:00:01 |
|   6 |       TABLE ACCESS FULL       | T3        |     2 |     6 |       |    3   (0)| 00:00:01 |
|   7 |      BUFFER SORT              |           |     2 |     6 |       |           |          |
|   8 |       SORT UNIQUE             |           |     2 |     6 |       |    3   (0)| 00:00:01 |
|   9 |        TABLE ACCESS FULL      | T4        |     2 |     6 |       |    3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | IND_T2_C2 | 10000 |       |       |   20   (0)| 00:00:01 |
|  11 |    TABLE ACCESS BY INDEX ROWID| T2        | 10000 |  1064K|       |10022   (1)| 00:02:01 |
|  12 |   TABLE ACCESS FULL           | T1        |  1000K|   103M|       |24864   (1)| 00:04:59 |
--------------------------------------------------------------------------------------------------

Unfortunately, for the last execution plan I did not capture the predicate information section.

We need a script to determine which plan is fastest – this script will be run directly on the second server so that we eliminate the effects of network traffic.

SET AUTOTRACE TRACEONLY STATISTICS
SET ARRAYSIZE 1000
ALTER SESSION SET STATISTICS_LEVEL='ALL';
SET TIMING ON

SPOOL SMALLEST_TEST.TXT

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T1, T3, T4) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T1, T3, T2) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T3, T4) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T4, T3, T2, T1) USE_HASH(T1) USE_HASH(T2) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

SPOOL OFF

So, what was the output of the script?

SQL> SELECT
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:09:15.90

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     129202  consistent gets
     101906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

---

SQL> SELECT /*+ LEADING(T1, T3, T4) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:09:59.15

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     200126  consistent gets
     181906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

---

SQL> SELECT /*+ LEADING(T1, T3, T2) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:10:18.07

---

SQL> SELECT /*+ LEADING(T3, T4) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:09:51.72

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     149245  consistent gets
     101906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

---

SQL> SELECT /*+ LEADING(T4, T3, T2, T1) USE_HASH(T1) USE_HASH(T2) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:08:47.07

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     200126  consistent gets
     181906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

The result?  The heavily hinted execution plan that reproduced the execution plan containing the Cartesian join found on the first server completed roughly 29 seconds faster than the unhinted execution plan that did not use a Cartesian join.  Not all Cartesian joins are bad.

You are probably wondering what the output from the first server looked like.  I think that the best answer is that the results are “in-doubt”.  The first server was queried over the network with the default array fetch size of 15.  The results follow (note that the execution plans may be different than those found on the second server):

SQL> SELECT /*+ LEADING(T1, T3, T4) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 01:02:34.25

Statistics
----------------------------------------------------------
        114  recursive calls
          0  db block gets
     200110  consistent gets
     189480  physical reads
          0  redo size
SP2-0642: SQL*Plus internal error state 1075, context 1:5:4294967295
Unsafe to proceed
  146667044  bytes received via SQL*Net from client
   13333335  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

That output does not look good.  Should we continue?

SQL> SELECT /*+ LEADING(T1, T3, T2) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 01:04:18.54

Statistics
----------------------------------------------------------
        115  recursive calls
          0  db block gets
     200110  consistent gets
     189480  physical reads
          0  redo size
SP2-0642: SQL*Plus internal error state 1075, context 1:5:4294967295
Unsafe to proceed
  146667044  bytes received via SQL*Net from client
   13333335  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

Unsafe to proceed?  OK, enough pain.  Let’s check the execution plans from the first server:

SET AUTOTRACE OFF
SET PAGESIZE 1000

SET LINESIZE 160

SELECT /*+ LEADING(S) */
  T.PLAN_TABLE_OUTPUT
FROM
  (SELECT
    SQL_ID,
    CHILD_NUMBER
  FROM
    V$SQL
  WHERE
    SQL_TEXT LIKE 'SELECT%'
    AND SQL_TEXT LIKE '%T1.C2=T2.C2%'
    AND SQL_TEXT LIKE '%T4)') S,
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(S.SQL_ID,S.CHILD_NUMBER,'ALLSTATS LAST +COST')) T;

The output (note that the STATISTICS_LEVEL was changed from TYPICAL to ALL when the queries were actually executed, hence the multiple child cursors for some of the SQL statements, and the warning that appears in the Note section for some of the execution plans):

SQL_ID  b0pk35tnkh80n, child number 0
-------------------------------------
SELECT /*+ LEADING(T1, T3, T2) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
  T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 1368374064

--------------------------------------------------------------------------------------
| Id  | Operation            | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |        | 29186 (100)|       |       |          |
|*  1 |  HASH JOIN RIGHT SEMI|      |   2020K| 29186   (9)|   935K|   935K|          |
|   2 |   TABLE ACCESS FULL  | T4   |      2 |     2   (0)|       |       |          |
|*  3 |   HASH JOIN          |      |    101M| 27997   (6)|  1949K|   945K|          |
|*  4 |    HASH JOIN SEMI    |      |  10101 | 13397   (1)|   133M|  7573K|          |
|   5 |     TABLE ACCESS FULL| T1   |   1000K| 10967   (1)|       |       |          |
|   6 |     TABLE ACCESS FULL| T3   |      2 |     2   (0)|       |       |          |
|   7 |    TABLE ACCESS FULL | T2   |   1000K| 10967   (1)|       |       |          |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."C2"="C1")
   3 - access("T1"."C2"="T2"."C2")
   4 - access("T1"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID  bkdkvvppuhta3, child number 0
-------------------------------------
SELECT   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,   T2.C3 T2_C3 FROM   T1,
T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (     SELECT       C1     FROM
   T3)   AND T2.C2 IN (     SELECT       C1     FROM       T4)

Plan hash value: 348785823

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |        | 24447 (100)|       |       |          |
|*  1 |  HASH JOIN             |      |   2000K| 24447   (2)|  3361K|   944K|          |
|*  2 |   HASH JOIN            |      |  20000 | 10984   (2)|   878K|   878K|          |
|   3 |    MERGE JOIN CARTESIAN|      |      2 |     5  (20)|       |       |          |
|   4 |     SORT UNIQUE        |      |      2 |     2   (0)| 73728 | 73728 |          |
|   5 |      TABLE ACCESS FULL | T4   |      2 |     2   (0)|       |       |          |
|   6 |     BUFFER SORT        |      |      2 |            | 73728 | 73728 |          |
|   7 |      SORT UNIQUE       |      |      2 |     2   (0)| 73728 | 73728 |          |
|   8 |       TABLE ACCESS FULL| T3   |      2 |     2   (0)|       |       |          |
|   9 |    TABLE ACCESS FULL   | T2   |   1000K| 10967   (1)|       |       |          |
|  10 |   TABLE ACCESS FULL    | T1   |   1000K| 10967   (1)|       |       |          |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID  bkdkvvppuhta3, child number 1
-------------------------------------
SELECT   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,   T2.C3 T2_C3 FROM   T1,
T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (     SELECT       C1     FROM
   T3)   AND T2.C2 IN (     SELECT       C1     FROM       T4)

Plan hash value: 348785823

------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        | 24447 (100)|    200M|00:00:17.51 |     200K|    181K|       |       |          |
|*  1 |  HASH JOIN             |      |      1 |   2000K| 24447   (2)|    200M|00:00:17.51 |     200K|    181K|  6223K|  1897K| 6684K (0)|
|*  2 |   HASH JOIN            |      |      1 |  20000 | 10984   (2)|  40000 |00:00:15.50 |   90967 |  90960 |   968K|   968K|  464K (0)|
|   3 |    MERGE JOIN CARTESIAN|      |      1 |      2 |     5  (20)|      4 |00:00:00.02 |       6 |      4 |       |       |          |
|   4 |     SORT UNIQUE        |      |      1 |      2 |     2   (0)|      2 |00:00:00.02 |       3 |      2 |  2048 |  2048 | 2048  (0)|
|   5 |      TABLE ACCESS FULL | T4   |      1 |      2 |     2   (0)|      2 |00:00:00.02 |       3 |      2 |       |       |          |
|   6 |     BUFFER SORT        |      |      2 |      2 |            |      4 |00:00:00.01 |       3 |      2 |  2048 |  2048 | 2048  (0)|
|   7 |      SORT UNIQUE       |      |      1 |      2 |     2   (0)|      2 |00:00:00.01 |       3 |      2 |  2048 |  2048 | 2048  (0)|
|   8 |       TABLE ACCESS FULL| T3   |      1 |      2 |     2   (0)|      2 |00:00:00.01 |       3 |      2 |       |       |          |
|   9 |    TABLE ACCESS FULL   | T2   |      1 |   1000K| 10967   (1)|   1000K|00:00:15.02 |   90961 |  90956 |       |       |          |
|  10 |   TABLE ACCESS FULL    | T1   |      1 |   1000K| 10967   (1)|   1000K|00:00:01.02 |     109K|  90956 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

SQL_ID  2drtx01hd21jb, child number 0
-------------------------------------
SELECT /*+ LEADING(T1, T3, T4) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
  T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 3921125646

---------------------------------------------------------------------------------------
| Id  | Operation             | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |        | 31686 (100)|       |       |          |
|*  1 |  HASH JOIN            |      |   2020K| 31686   (1)|  3420K|   944K|          |
|   2 |   MERGE JOIN CARTESIAN|      |  20202 | 18221   (1)|       |       |          |
|*  3 |    HASH JOIN SEMI     |      |  10101 | 13397   (1)|   133M|  7573K|          |
|   4 |     TABLE ACCESS FULL | T1   |   1000K| 10967   (1)|       |       |          |
|   5 |     TABLE ACCESS FULL | T3   |      2 |     2   (0)|       |       |          |
|   6 |    BUFFER SORT        |      |      2 | 18219   (1)| 73728 | 73728 |          |
|   7 |     SORT UNIQUE       |      |      2 |     0   (0)| 73728 | 73728 |          |
|   8 |      TABLE ACCESS FULL| T4   |      2 |     0   (0)|       |       |          |
|   9 |   TABLE ACCESS FULL   | T2   |   1000K| 10967   (1)|       |       |          |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T2"."C2"="C1")
   3 - access("T1"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID  2drtx01hd21jb, child number 1
-------------------------------------
SELECT /*+ LEADING(T1, T3, T4) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
  T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 3921125646

------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        | 31686 (100)|    200M|00:00:29.65 |     200K|    189K|  13702 |       |       |          |         |
|*  1 |  HASH JOIN            |      |      1 |   2020K| 31686   (1)|    200M|00:00:29.65 |     200K|    189K|  13702 |  6223K|  1897K| 6659K (0)|         |
|   2 |   MERGE JOIN CARTESIAN|      |      1 |  20202 | 18221   (1)|  40000 |00:00:27.35 |   90967 |  98524 |  13702 |       |       |          |         |
|*  3 |    HASH JOIN SEMI     |      |      1 |  10101 | 13397   (1)|  20000 |00:00:27.31 |   90964 |  98522 |  13702 |   126M|  7603K|   83  (1)|     114K|
|   4 |     TABLE ACCESS FULL | T1   |      1 |   1000K| 10967   (1)|   1000K|00:00:17.00 |   90961 |  90956 |      0 |       |       |          |         |
|   5 |     TABLE ACCESS FULL | T3   |      1 |      2 |     2   (0)|      2 |00:00:00.16 |       3 |      2 |      0 |       |       |          |         |
|   6 |    BUFFER SORT        |      |  20000 |      2 | 18219   (1)|  40000 |00:00:00.04 |       3 |      2 |      0 |  2048 |  2048 | 2048  (0)|         |
|   7 |     SORT UNIQUE       |      |      1 |      2 |     0   (0)|      2 |00:00:00.04 |       3 |      2 |      0 |  2048 |  2048 | 2048  (0)|         |
|   8 |      TABLE ACCESS FULL| T4   |      1 |      2 |     0   (0)|      2 |00:00:00.04 |       3 |      2 |      0 |       |       |          |         |
|   9 |   TABLE ACCESS FULL   | T2   |      1 |   1000K| 10967   (1)|   1000K|00:00:01.02 |     109K|  90956 |      0 |       |       |          |         |
------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T2"."C2"="C1")
   3 - access("T1"."C2"="C1")

SQL_ID  91wazrzfau2wq, child number 0
-------------------------------------
SELECT /*+ LEADING(T3, T4) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 331368001

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |        | 24447 (100)|       |       |          |
|*  1 |  HASH JOIN             |      |   2000K| 24447   (2)|  3361K|   944K|          |
|*  2 |   HASH JOIN            |      |  20000 | 10984   (2)|   878K|   878K|          |
|   3 |    MERGE JOIN CARTESIAN|      |      2 |     5  (20)|       |       |          |
|   4 |     SORT UNIQUE        |      |      2 |     2   (0)| 73728 | 73728 |          |
|   5 |      TABLE ACCESS FULL | T3   |      2 |     2   (0)|       |       |          |
|   6 |     BUFFER SORT        |      |      2 |            | 73728 | 73728 |          |
|   7 |      SORT UNIQUE       |      |      2 |     2   (0)| 73728 | 73728 |          |
|   8 |       TABLE ACCESS FULL| T4   |      2 |     2   (0)|       |       |          |
|   9 |    TABLE ACCESS FULL   | T2   |   1000K| 10967   (1)|       |       |          |
|  10 |   TABLE ACCESS FULL    | T1   |   1000K| 10967   (1)|       |       |          |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

I will repeat – not all Cartesian joins are bad for performance.  But, of course, it is not always obvious which Cartesian joins are good, which are bad, and which are just OK without testing.








Follow

Get every new post delivered to your Inbox.

Join 143 other followers