Ramming Two SQL Statements Together – To Scalar or Too Scalar

30 05 2010

May 30, 2010

A recent comp.databases.oracle.server message thread about combining SQL statements triggered a bit of testing.  In light of some of my later comments in that thread, I thought I might start out by first showing the summary of the performance results.

In the above, the left side shows the TKPROF output from running a query that uses a standard join between four tables, but would probably behave about the same if it was a query with one table and the other three tables were specified in inline views.  The right side shows a query that produces identical results as the first, but uses SELECT statements in three columns of the parent SELECT statement that accesses a single table – there are three scalar (thanks Joel for looking up that term) subqueries in the SQL statement.  Each row in the above output shows what happened to the statistics output by TKPROF when the range of rows selected increased from 6,366 to 10,000,000 by changing the WHERE clause so that the upper range of the range scan in the query increased from 10 to 10,000.

By the end of the Server 1 test, where the test was executed directly on the server, the first query completed roughly 4.49 times faster than the second query, and required 102.27 times fewer consistent gets than the second query.

By the end of the Server 2 test, where the test was executed remotely across a gigabit network against a much slower server with 1/10 the number of rows in the main table, the first query completed roughly 6.27 times faster, and required 121.08 times fewer consistent gets than the second query.

The obvious conclusion is that scalar subqueries should not be used in column positions because they are much slower, and require a lot more consistent gets.

——–

I hear someone in the back shouting, “But, but… where is your evidence?  How can I verify that what you state to be correct is correct.  How can I be satisfied that my server isn’t substantially slower than your server 2?  Why were there all of those physical reads?  What were the execution plans, and did you see any unexpected Cartesian joins?  What did the SQL statements look like?  What were the table definitions”

Is it not enough to state that I achieved a 627% performance improvement by just making a small change to a SQL statement?  :-)  Of course not – switching now to the approach used by the other book that I have on pre-order.

Here is the test case that was executed for the remote test with Server 2:

CREATE TABLE T1 (
  ID NUMBER,
  DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
  CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
  'This is the long description for this number '|| TO_CHAR(CEIL(ABS(SIN(ROWNUM/9.9999)*10000)))
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000),
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000);

CREATE INDEX IND_T1 ON T1(ID);

CREATE TABLE T2 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

CREATE TABLE T3 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

CREATE TABLE T4 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

CREATE INDEX IND_T4 ON T4(C1);

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,METHOD_OPT=>'FOR ALL COLUMNS SIZE 1')
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T4',CASCADE=>TRUE)

There are just 1,000,000 rows in table T1, and I did not allow the optimizer to create a histogram on the columns in table T1.  Now, let’s try a test script using the tables to see the expected execution plans (on Oracle Database 11.1.0.7):

SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 200
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 200;

The first of the SQL statement directly joins the tables, while the second SQL statement places SELECT statements in column positions (scalar subqueries).  The output (11.1.0.7):

SQL> SELECT
  2    T1.ID,
  3    T2.C1 T2_C1,
  4    T3.C1 T3_C1,
  5    T4.C1 T4_C1
  6  FROM
  7    T1,
  8    T2,
  9    T3,
 10    T4
 11  WHERE
 12    T2.C1 BETWEEN 1 AND 200
 13    AND T2.C1=T3.C1
 14    AND T2.C1=T4.C1
 15    AND T2.C1=T1.ID;

Execution Plan
----------------------------------------------------------
Plan hash value: 3780653648

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |   457 |  7312 | 10041   (1)| 00:00:41 |
|*  1 |  HASH JOIN           |        |   457 |  7312 | 10041   (1)| 00:00:41 |
|*  2 |   HASH JOIN          |        |   198 |  2376 |    46   (5)| 00:00:01 |
|*  3 |    HASH JOIN         |        |   199 |  1592 |    43   (3)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| T2     |   200 |   800 |    21   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| T3     |   200 |   800 |    21   (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN  | IND_T4 |   200 |   800 |     2   (0)| 00:00:01 |
|*  7 |   TABLE ACCESS FULL  | T1     | 20002 | 80008 |  9994   (1)| 00:00:41 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."C1"="T1"."ID")
   2 - access("T2"."C1"="T4"."C1")
   3 - access("T2"."C1"="T3"."C1")
   4 - filter("T2"."C1"<=200 AND "T2"."C1">=1)
   5 - filter("T3"."C1"<=200 AND "T3"."C1">=1)
   6 - access("T4"."C1">=1 AND "T4"."C1"<=200)
   7 - filter("T1"."ID"<=200 AND "T1"."ID">=1)

SQL> SELECT
  2    T1.ID,
  3    (SELECT
  4      T2.C1
  5    FROM
  6      T2
  7    WHERE
  8      T1.ID=T2.C1) T2_C1,
  9    (SELECT
 10      T3.C1
 11    FROM
 12      T3
 13    WHERE
 14      T1.ID=T3.C1) T3_C1,
 15    (SELECT
 16      T4.C1
 17    FROM
 18      T4
 19    WHERE
 20      T1.ID=T4.C1) T4_C1
 21  FROM
 22    T1
 23  WHERE
 24    T1.ID BETWEEN 1 AND 200;

Execution Plan
----------------------------------------------------------
Plan hash value: 2945978589

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        | 20002 | 80008 |  9994   (1)| 00:00:41 |
|*  1 |  TABLE ACCESS FULL| T2     |     1 |     4 |    21   (0)| 00:00:01 |
|*  2 |  TABLE ACCESS FULL| T3     |     1 |     4 |    21   (0)| 00:00:01 |
|*  3 |  INDEX RANGE SCAN | IND_T4 |     1 |     4 |     1   (0)| 00:00:01 |
|*  4 |  TABLE ACCESS FULL| T1     | 20002 | 80008 |  9994   (1)| 00:00:41 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T2"."C1"=:B1)
   2 - filter("T3"."C1"=:B1)
   3 - access("T4"."C1"=:B1)
   4 - filter("T1"."ID"<=200 AND "T1"."ID">=1)

The execution plans look a bit different, and notice that the second execution plan shows bind variables in the Predicate Information section.  If we were to actually run the SQL statements, we might find that the first runs in about 15 seconds and the second in about 16-17 seconds, both with the same number of physical reads.  That is no fun, so let’s change the number 200 to 1200 to see what happens.  We will flush the buffer cache twice between executions to force physical reads for both executions (Oracle is set to use direct, asynchronous IO), and set the array fetch size to 1,000 to minimize the amount of unnecessary network traffic.  The test script follows:

SET TIMING ON
SET AUTOTRACE TRACEONLY STATISTICS;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
SET ARRAYSIZE 1000

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 1200
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 1200;

Here is the output of the above script:

SQL> SELECT
  2    T1.ID,
  3    T2.C1 T2_C1,
  4    T3.C1 T3_C1,
  5    T4.C1 T4_C1
  6  FROM
  7    T1,
  8    T2,
  9    T3,
 10    T4
 11  WHERE
 12    T2.C1 BETWEEN 1 AND 1200
 13    AND T2.C1=T3.C1
 14    AND T2.C1=T4.C1
 15    AND T2.C1=T1.ID;

76580 rows selected.

Elapsed: 00:00:15.96

Statistics
---------------------------------------------------
          1  recursive calls
          0  db block gets
      83197  consistent gets
      83110  physical reads
          0  redo size
    1288037  bytes sent via SQL*Net to client
       1217  bytes received via SQL*Net from client
         78  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      76580  rows processed

SQL> SELECT
  2    T1.ID,
  3    (SELECT
  4      T2.C1
  5    FROM
  6      T2
  7    WHERE
  8      T1.ID=T2.C1) T2_C1,
  9    (SELECT
 10      T3.C1
 11    FROM
 12      T3
 13    WHERE
 14      T1.ID=T3.C1) T3_C1,
 15    (SELECT
 16      T4.C1
 17    FROM
 18      T4
 19    WHERE
 20      T1.ID=T4.C1) T4_C1
 21  FROM
 22    T1
 23  WHERE
 24    T1.ID BETWEEN 1 AND 1200;

76580 rows selected.

Elapsed: 00:01:40.09

Statistics
---------------------------------------------------
          1  recursive calls
          0  db block gets
   10073639  consistent gets
      83110  physical reads
          0  redo size
    1288037  bytes sent via SQL*Net to client
       1217  bytes received via SQL*Net from client
         78  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      76580  rows processed

The number of consistent gets jumps significantly for the second SQL statement and so did the execution time (and CPU usage).  Here is the autotrace statistics for the second SQL statement (for comparison) when T1.ID BETWEEN 1 AND 200 was specified:
12732 rows selected.

Elapsed: 00:00:17.54

Statistics
---------------------------------------------------
          0  recursive calls
          0  db block gets
     522390  consistent gets
      83108  physical reads
          0  redo size
     196813  bytes sent via SQL*Net to client
        513  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      12732  rows processed

While the above might be interesting, a slightly different table T1 was created for the Server 1 test script:

CREATE TABLE T1 (
  ID NUMBER,
  DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
  CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
  'This is the long description for this number '|| TO_CHAR(CEIL(ABS(SIN(ROWNUM/9.9999)*10000)))
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000),
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=10000);

CREATE INDEX IND_T1 ON T1(ID);

CREATE TABLE T2 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

CREATE TABLE T3 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

CREATE TABLE T4 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

CREATE INDEX IND_T4 ON T4(C1);

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T4',CASCADE=>TRUE)

This time I loaded table T1 with 10,000,000 rows and allowed the optimizer to collect a histogram on table T1, if it determined that a histogram would help.  The full test script on Server 1 is a bit long, so I will post an abbreviated version here.  First, let’s determine the execution plans for the SQL statements (Oracle Database 11.2.0.1 with 8,000MB SGA_TARGET):

SET AUTOTRACE TRACEONLY EXPLAIN
SET LINESIZE 160
SET TRIMSPOOL ON
SET PAGESIZE 2000
SPOOL SCALAR_EXECUTION_PLANS.TXT

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 10
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 10;

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 50
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 50;

...

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 10000
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 10000;

SPOOL OFF

Then to check the performance results, the script that actually executes the statements (Oracle Database 11.2.0.1 with 8,000MB SGA_TARGET):

SET TIMING ON
SET AUTOTRACE TRACEONLY STATISTICS;
SET ARRAYSIZE 1000
SPOOL SCALAR_TEST_RESULTS.TXT
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SCALAR_TEST';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 10
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 10;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 50
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 50;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

...

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  T2.C1 T2_C1,
  T3.C1 T3_C1,
  T4.C1 T4_C1
FROM
  T1,
  T2,
  T3,
  T4
WHERE
  T2.C1 BETWEEN 1 AND 10000
  AND T2.C1=T3.C1
  AND T2.C1=T4.C1
  AND T2.C1=T1.ID;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.ID,
  (SELECT
    T2.C1
  FROM
    T2
  WHERE
    T1.ID=T2.C1) T2_C1,
  (SELECT
    T3.C1
  FROM
    T3
  WHERE
    T1.ID=T3.C1) T3_C1,
  (SELECT
    T4.C1
  FROM
    T4
  WHERE
    T1.ID=T4.C1) T4_C1
FROM
  T1
WHERE
  T1.ID BETWEEN 1 AND 10000;

SPOOL OFF
SET AUTOTRACE OFF

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

It is a bit surprising to see the number of different execution plans that appear in the TKPROF output for the first query compared with the number that appear for the second query.  For example (the Rows statistic shows the actual number of rows returned for the execution plan line, while card shows the predicted number of rows), only showing the execution plan when it changed (notice at one point we see three sort-merge joins):

Rows     Row Source Operation
-------  ---------------------------------------------------
   6366  NESTED LOOPS  (cr=371 pr=350 pw=0 time=2208 us cost=121 size=208 card=13)
     10   HASH JOIN  (cr=325 pr=318 pw=0 time=27 us cost=97 size=96 card=8)
     10    HASH JOIN  (cr=316 pr=310 pw=0 time=9 us cost=95 size=72 card=9)
     10     TABLE ACCESS FULL T2 (cr=158 pr=155 pw=0 time=0 us cost=47 size=40 card=10)
     10     TABLE ACCESS FULL T3 (cr=158 pr=155 pw=0 time=0 us cost=47 size=40 card=10)
     10    INDEX RANGE SCAN IND_T4 (cr=9 pr=8 pw=0 time=0 us cost=2 size=40 card=10)(object id 82879)
   6366   INDEX RANGE SCAN IND_T1 (cr=46 pr=32 pw=0 time=1315 us cost=3 size=8 card=2)(object id 82875)

Rows     Row Source Operation
-------  ---------------------------------------------------
  31832  HASH JOIN  (cr=415 pr=398 pw=0 time=13733 us cost=208 size=3872 card=242)
     50   HASH JOIN  (cr=318 pr=318 pw=0 time=49 us cost=97 size=576 card=48)
     50    HASH JOIN  (cr=316 pr=310 pw=0 time=49 us cost=95 size=392 card=49)
     50     TABLE ACCESS FULL T2 (cr=158 pr=155 pw=0 time=0 us cost=47 size=200 card=50)
     50     TABLE ACCESS FULL T3 (cr=158 pr=155 pw=0 time=49 us cost=47 size=200 card=50)
     50    INDEX RANGE SCAN IND_T4 (cr=2 pr=8 pw=0 time=0 us cost=2 size=200 card=50)(object id 82879)
  31832   INDEX RANGE SCAN IND_T1 (cr=97 pr=80 pw=0 time=3465 us cost=110 size=200020 card=50005)(object id 82875)

Rows     Row Source Operation
-------  ---------------------------------------------------
 127334  HASH JOIN  (cr=706 pr=606 pw=0 time=176652 us cost=533 size=63392 card=3962)
    200   INDEX RANGE SCAN IND_T4 (cr=2 pr=8 pw=0 time=0 us cost=2 size=800 card=200)(object id 82879)
 127334   HASH JOIN  (cr=704 pr=598 pw=0 time=138605 us cost=530 size=47772 card=3981)
    200    TABLE ACCESS FULL T3 (cr=158 pr=155 pw=0 time=99 us cost=47 size=800 card=200)
 127334    MERGE JOIN  (cr=546 pr=443 pw=0 time=107989 us cost=482 size=32008 card=4001)
 127334     INDEX RANGE SCAN IND_T1 (cr=388 pr=288 pw=0 time=18939 us cost=434 size=800080 card=200020)(object id 82875)
 127334     SORT JOIN (cr=158 pr=155 pw=0 time=0 us cost=48 size=800 card=200)
    200      TABLE ACCESS FULL T2 (cr=158 pr=155 pw=0 time=0 us cost=47 size=800 card=200)

Rows     Row Source Operation
-------  ---------------------------------------------------
 894205  MERGE JOIN  (cr=3077 pr=2222 pw=0 time=2023576 us cost=3124 size=3132784 card=195799)
 894205   MERGE JOIN  (cr=3073 pr=2214 pw=0 time=1414521 us cost=3119 size=2351028 card=195919)
 894205    MERGE JOIN  (cr=2915 pr=2059 pw=0 time=767100 us cost=3071 size=1568312 card=196039)
 894205     INDEX RANGE SCAN IND_T1 (cr=2757 pr=1904 pw=0 time=111615 us cost=3023 size=5600560 card=1400140)(object id 82875)
 894205     SORT JOIN (cr=158 pr=155 pw=0 time=0 us cost=48 size=5600 card=1400)
   1400      TABLE ACCESS FULL T2 (cr=158 pr=155 pw=0 time=127 us cost=47 size=5600 card=1400)
 894205    SORT JOIN (cr=158 pr=155 pw=0 time=0 us cost=48 size=5600 card=1400)
   1400     TABLE ACCESS FULL T3 (cr=158 pr=155 pw=0 time=127 us cost=47 size=5600 card=1400)
 894205   SORT JOIN (cr=4 pr=8 pw=0 time=0 us cost=5 size=5600 card=1400)
   1400    INDEX RANGE SCAN IND_T4 (cr=4 pr=8 pw=0 time=127 us cost=4 size=5600 card=1400)(object id 82879)

Rows     Row Source Operation
-------  ---------------------------------------------------
1939734  HASH JOIN  (cr=23230 pr=21260 pw=0 time=929418 us cost=6000 size=14396160 card=899760)
   3000   HASH JOIN  (cr=343 pr=333 pw=0 time=2499 us cost=102 size=35988 card=2999)
   3000    INDEX FAST FULL SCAN IND_T4 (cr=27 pr=23 pw=0 time=374 us cost=7 size=12000 card=3000)(object id 82879)
   3000    HASH JOIN  (cr=316 pr=310 pw=0 time=1249 us cost=95 size=24000 card=3000)
   3000     TABLE ACCESS FULL T2 (cr=158 pr=155 pw=0 time=499 us cost=47 size=12000 card=3000)
   3000     TABLE ACCESS FULL T3 (cr=158 pr=155 pw=0 time=249 us cost=47 size=12000 card=3000)
1939734   INDEX FAST FULL SCAN IND_T1 (cr=22887 pr=20927 pw=0 time=287619 us cost=5888 size=12001200 card=3000300)(object id 82875)

Edit May 30, 2010: The above execution plans demonstrate the cost-based optimizer’s ability to adapt the execution plan operations as the projected data volumes increase.  Not all of the tables in this test case have indexes, and that was intentional to see how the lack of indexes on certain tables affected the execution plans.

The execution plan for the second query remained a bit more unchanged (again, only showing when the execution plan changed):

Rows     Row Source Operation
-------  ---------------------------------------------------
     10  TABLE ACCESS FULL T2 (cr=1580 pr=155 pw=0 time=0 us cost=47 size=4 card=1)
     10  TABLE ACCESS FULL T3 (cr=1580 pr=155 pw=0 time=0 us cost=47 size=4 card=1)
     10  INDEX RANGE SCAN IND_T4 (cr=12 pr=8 pw=0 time=0 us cost=1 size=4 card=1)(object id 82879)
   6366  INDEX RANGE SCAN IND_T1 (cr=22 pr=32 pw=0 time=389 us cost=24 size=40004 card=10001)(object id 82875)

Rows     Row Source Operation
-------  ---------------------------------------------------
   3000  TABLE ACCESS FULL T2 (cr=474000 pr=155 pw=0 time=0 us cost=47 size=4 card=1)
   3000  TABLE ACCESS FULL T3 (cr=474000 pr=155 pw=0 time=0 us cost=47 size=4 card=1)
   3000  INDEX RANGE SCAN IND_T4 (cr=1990 pr=16 pw=0 time=0 us cost=1 size=4 card=1)(object id 82879)
1939734  INDEX FAST FULL SCAN IND_T1 (cr=22887 pr=20927 pw=0 time=313731 us cost=5888 size=12001200 card=3000300)(object id 82875)

How frequently do people use scalar subqueries in column positions rather than inline views, and then wonder why performance is slower than expected?  Every once in a while a thread will appear on a discussion forum asking about performance problems with queries that include scalar subqueries.  For example:

The test script (strip the .doc extension): scalar_test_script.sql

The test results (Oracle Database 11.2.0.1 with 8,000MB SGA_TARGET):
The spool file: scalar_test_results_spool_or112.txt
The execution plans: scalar_test_execution_plans_or112.txt
The TKPROF summary: scalar_test_or112_tkprof.txt

The test results (Oracle Database 11.1.0.7 with 8,000MB SGA_TARGET – same server as 11.2.0.1 test and same test script):
The spool file: scalar_test_results_spool_or111.txt
The execution plans: scalar_test_execution_plans_or111.txt
The TKPROF summary: scalar_test_or111_tkprof.txt





Defy Logic – the Cost-Based Optimizer does Not Select the Lowest Cost Plan – Implicit Data Type Conversion

28 05 2010

May 28, 2010

How many times have you heard someone say:

If you need to store numbers in the database, store the numbers in a column with a numeric datatype – storing the numbers in a column with a VARCHAR2 datatype invites problems.

If you need to store dates in the database, store the dates in a column with a DATE datatype – storing the dates in a column with a VARCHAR2 datatype invites problems; storing the dates in a column with a numeric datatype invites problems.

Yes, it happens in production envionments

Why is my SQL statement executing so slowly.

Why am I receiving unexpected results from this query?

The same holds true when selecting data from tables. 

  • If a number happens to exist in a column with a VARCHAR2 datatype, and it is necessary to retrieve that row by specifying that column in the WHERE clause, make certain that either the bind variable is defined as a VARCHAR/VARCHAR2 or that the constant (literal) value is wrapped in single quotes so that it is treated as a VARCHAR2, rather than as a number.  Yes, this happens in real life, as demontrated in this very recent OTN thread.
  • If the column is a VARCHAR2 datatype, do not define the bind variable with a NVARCHAR2 datatype.  Yes, this happens in real life, as demonstrated in this recent OTN thread and this slightly older OTN thread.
  • If the column is a DATE datatype, do not define the bind variable with a VARCHAR2 datatype and do not pass in a constant as a VARCHAR2 (string).  Something is bound to go wrong at some point.  I am still trying to determine why this technique was demonstrated multiple times in the “Oracle SQL Recipes” book (I have not had a chance to finish reading this book, so I have not posted a review yet that draws attention to this bad practice).

In the comments section of the most recent True or False Quiz article I showed a couple of demonstrations why numbers should not be stored in VARCHAR2 columns, if only numbers are to be stored in that column.  One of my comments showed why, when the column datatype does not match the datatype of the constant, and one of those is a numeric, why the other entity is converted to a numeric, rather than converting the numeric to a VARCHAR2.  From that comment entry, consider the following:

CREATE TABLE T3(
  CHAR_COL VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (CHAR_COL));

INSERT INTO T3 VALUES('1','A');
INSERT INTO T3 VALUES('1.0','A');
INSERT INTO T3 VALUES('1.00','A');
INSERT INTO T3 VALUES('1.000','A');
INSERT INTO T3 VALUES('1.0000','A');
INSERT INTO T3 VALUES('1.00000','A');
INSERT INTO T3 VALUES('1.000000','A');
INSERT INTO T3 VALUES('1.0000000','A');
INSERT INTO T3 VALUES('1.00000000','A');

COMMIT;

Now consider the following three queries – would the developer expect the three queries to return the same result rows?

SELECT
  *
FROM
  T3
WHERE
  CHAR_COL=1;

CHAR_COL   C2
---------- -----
1          A
1.0        A
1.00       A
1.000      A
1.0000     A
1.00000    A
1.000000   A
1.0000000  A
1.00000000 A

9 rows selected.

---

SELECT
  *
FROM
  T3
WHERE
  TO_NUMBER(CHAR_COL)=1;

CHAR_COL   C2
---------- -----
1          A
1.0        A
1.00       A
1.000      A
1.0000     A
1.00000    A
1.000000   A
1.0000000  A
1.00000000 A

9 rows selected.

---

SELECT
  *
FROM
  T3
WHERE
  CHAR_COL=TO_CHAR(1);

CHAR_COL   C2
---------- -----
1          A

Notice that the last query only returned one row, while the other two queries returned nine rows.  So, why are VARCHAR2 columns always converted to numeric values, rather than numeric values converted to VARCHAR2?  If the number was automatically converted by Oracle into a character value, Oracle might need to test a nearly infinite number of 0 characters appended to the end of the converted value after the decimal point (up to the number of characters of precision) for a matching result – this extra work is avoided by converting the character value to a number.

I might have appeared to have drifted off the topic of this blog article, so now let’s see a case where Oracle’s cost-based optimizer does the impossible – it does not pick the execution plan with the lowest calculated cost for a simple SQL statement involving a single table.  This test case can be reproduced on Oracle Database 10.2.0.4 through 11.2.0.1 (and probably a couple of other releases as well).  This test case is from another one of my comments in the recent True or False Quiz article – the bonus question.  The test case:

CREATE TABLE T1 (
  CHAR_COL VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (CHAR_COL));

INSERT INTO
  T1
SELECT
  TO_CHAR(ROWNUM),
  RPAD('A',100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL <=1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)

SET AUTOTRACE TRACEONLY EXPLAIN

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'LOWEST_COST';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /* FIND_ME */
  *
FROM
  T1
WHERE
  CHAR_COL = 10;

SELECT /*+ INDEX(T1) */  /* FIND_ME */
  *
FROM
  T1
WHERE
  CHAR_COL = 10;

SELECT /* FIND_ME */
  *
FROM
  T1
WHERE
  CHAR_COL = '10';

SET AUTOTRACE OFF
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

The test case creates a simple table with 1,000,000 rows and then executes three SELECT statements.  Page 477 of Tom Kyte’s “Expert Oracle Database Architecture” book states the following:

Case 4: We have indexed a character column. This column contains only numeric data. We query using the following syntax:

select * from t where indexed_column = 5

Note that the number 5 in the query is the constant number 5 (not a character string). The index on INDEXED_COLUMN is not used…”

Jonathan Lewis stated here (a blog article from 2006) that:

“It’s a great shame that Oracle Corp. decided to use the name “hints” for its optimizer directive mechanism.  “Hints” are not hints, they are interception points in the optimizer code path, and must be obeyed.”

So, which expert’s explanation is correct for Oracle Database 10.2.0.4 through 11.2.0.1 for this particular test case?  Neither?  Both?  Let’s take a look at the output that was written to the SQL*Plus screen:

SQL> SELECT /* FIND_ME */
  2    *
  3  FROM
  4    T1
  5  WHERE
  6    CHAR_COL = 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   108 |  4450   (2)| 00:00:54 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   108 |  4450   (2)| 00:00:54 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(TO_NUMBER("CHAR_COL")=10)

The above shows that Tom Kyte’s book is correct – the primary key index on the column CHAR_COL is not used.  Continuing with the SQL*Plus output:

SQL> SELECT /*+ INDEX(T1) */  /* FIND_ME */
  2    *
  3  FROM
  4    T1
  5  WHERE
  6    CHAR_COL = 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 458899268

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   108 |  3961   (2)| 00:00:48 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |     1 |   108 |  3961   (2)| 00:00:48 |
|*  2 |   INDEX FULL SCAN           | SYS_C0010157 |     1 |       |  3960   (2)| 00:00:48 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TO_NUMBER("CHAR_COL")=10)

The above SQL statement is the same as the first, just with an INDEX hint.  The index hint was obeyed.  Jonathan Lewis’ blog article is correct – index hints are directives, so Oracle’s optimizer selected an index access path using the only available index – the index on the primary key column.

But now we have a serious problem.  What is the problem?  For a SQL statement involving a single table a higher calculated cost execution plan (4,450) with a higher estimated time (54 seconds) was selected rather than using the obviously less expensive execution plan with a lower calculated cost (3,961) and with a lower estimated time (48 seconds).  Interesting…

The SQL*Plus output continues:

SQL> SELECT /* FIND_ME */
  2    *
  3  FROM
  4    T1
  5  WHERE
  6    CHAR_COL = '10';

Execution Plan
----------------------------------------------------------
Plan hash value: 1657849122

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   108 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |     1 |   108 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | SYS_C0010157 |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CHAR_COL"='10')

Notice that the calculated cost decreased significantly when we constructed the query correctly, and the available index access path was automatically selected.

So, why did the optimizer not select the lowest cost access path?  Fortunately, the test case created a 10053 trace file that helps explain what happened.  In the 10053 trace file, we see that the optimizer transformed the original SQL statement a bit:

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "T1"."CHAR_COL" "CHAR_COL","T1"."C2" "C2" FROM "TESTUSER"."T1" "T1" WHERE TO_NUMBER("T1"."CHAR_COL")=10

Now, with the transformed version of the SQL statement, it appears that we need a function based index on the CHAR_COL column that converts the column value to a number so that an index access path is possible.  Further down in the 10053 trace we find the following:

-----------------------------
SYSTEM STATISTICS INFORMATION
-----------------------------
  Using NOWORKLOAD Stats
  CPUSPEEDNW: 466 millions instructions/sec (default is 100)
  IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
  IOSEEKTIM: 10 milliseconds (default is 10)
  MBRC: -1 blocks (default is 8 )

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 1000000  #Blks:  16217  AvgRowLen:  108.00
Index Stats::
  Index: SYS_C0010157  Col#: 1
    LVLS: 2  #LB: 3908  #DK: 1000000  LB/K: 1.00  DB/K: 1.00  CLUF: 201245.00
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  Table: T1  Alias: T1
    Card: Original: 1000000.000000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
  Access Path: TableScan
    Cost:  4450.40  Resp: 4450.40  Degree: 0
      Cost_io: 4394.00  Cost_cpu: 315488412
      Resp_io: 4394.00  Resp_cpu: 315488412
  Best:: AccessPath: TableScan
         Cost: 4450.40  Degree: 1  Resp: 4450.40  Card: 1.00  Bytes: 0

***************************************

In the above, we see that the optimizer immediately jumped to a full table scan access path and then immediately declared that a full table scan offered the lowest cost – the optimizer did not even consider an index access path.  Now, let’s compare the above with the SQL statement having a hinted access path:

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT /*+ INDEX ("T1") */ "T1"."CHAR_COL" "CHAR_COL","T1"."C2" "C2" FROM "TESTUSER"."T1" "T1" WHERE TO_NUMBER("T1"."CHAR_COL")=10
...

-----------------------------
SYSTEM STATISTICS INFORMATION
-----------------------------
  Using NOWORKLOAD Stats
  CPUSPEEDNW: 466 millions instructions/sec (default is 100)
  IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
  IOSEEKTIM: 10 milliseconds (default is 10)
  MBRC: -1 blocks (default is 8 )

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 1000000  #Blks:  16217  AvgRowLen:  108.00
Index Stats::
  Index: SYS_C0010157  Col#: 1
    LVLS: 2  #LB: 3908  #DK: 1000000  LB/K: 1.00  DB/K: 1.00  CLUF: 201245.00
    User hint to use this index
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  Table: T1  Alias: T1
    Card: Original: 1000000.000000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
kkofmx: index filter:TO_NUMBER("T1"."CHAR_COL")=10

  Access Path: index (FullScan)
    Index: SYS_C0010157
    resc_io: 3911.00  resc_cpu: 227852122
    ix_sel: 1.000000  ix_sel_with_filters: 0.000001
 ***** Logdef predicate Adjustment ******
 Final IO cst 0.00 , CPU cst 50.00
 ***** End Logdef Adjustment ******
    Cost: 3960.67  Resp: 3960.67  Degree: 1
  Best:: AccessPath: IndexRange
  Index: SYS_C0010157
         Cost: 3960.67  Degree: 1  Resp: 3960.67  Card: 1.00  Bytes: 0

The hint provided in the SQL statement forced the optimizer to do something that is supposedly not possible.  Notice that unlike the previous output, the full table scan was not even considered because of the hint.

The short summary of the above: do things correct from the start to avoid confusion and unexplained performance problems later.

The 10053 trace files (PDF versions of the 10053 trace files):
Oracle Database 11.2.0.1 OR112_LOWEST_COST
Oracle Database 11.1.0.7 OR1117_LOWEST_COST
Oracle Database 10.2.0.4 OR1024_LOWEST_COST

——–

* Late edit May 28, 2010: In the most recent True or False quiz, Centinul provided a documentation reference that states that VARCHAR2 values are always converted to numbers when a number is implicitly compared to a VARCHAR2 value.





True or False – Why Isn’t My Index Getting Used?

25 05 2010

May 25, 2010

(Back to the Previous Post in the Series

The book “Expert One-On-One Oracle” has been on the edge of my desk for the last couple of days.  I cannot recall why I originally pulled it off the bookshelf, but possibly it was needed to satisfy a question about how things used to work in Oracle Database.  I opened the book to a random page, and ended up somewhere around page 303.  Page 303 includes a section title of “Why isn’t my Index Getting Used?”.  This still seems to be a frequent question on various Oracle forums, and volumes have been written on the subject.  A quick Google search finds the following on the topic:

The “Expert One-On-One Oracle” book lists 6 of the cases why an index would not be used.  Some things in Oracle stay the same from one release to the next, while others change.  Which of the following are true or false for Oracle Database 11.1.0.6 or higher, and indicate why the answer is true or false, and if relevant, the Oracle release when the answer changed.  Note that I might intentionally state only part of the case identified in the book, in part to make things more interesting.  Maybe the question of when is it true and when is it not true would be a more realistic request?

Reasons why an index would not be used:

1. The predicate in the WHERE clause does not specify the column at the leading edge of the index.  For example, an index exists on T1(C1, C2) and the WHERE clause in the query is simply WHERE C2=1.

2. The query is attempting to count the number of rows in the table, for example: SELECT COUNT(*) FROM T1;.

3. The query is attempting to apply a function on the indexed columns, for example: SELECT * FROM T1 WHERE TRUNC(DATE_COL) = TRUNC(SYSDATE);.

4. The query is attempting to treat a VARCHAR2/CHAR column that contains only numbers as a numeric column, for example: SELECT * FROM T1 WHERE CHAR_COL = 10;.

5. The index, if used, would be slower, as determined by the calculated cost of the access path.

6. The tables have not been analyzed in a while.  The last time the tables referenced in the WHERE clause were analyzed, the tables were quite small.  Now the tables are significantly larger.  Without up-to-date statistics, the CBO cannot make the correct decisions.

Extra credit: list three other reasons why an index might not be used.

(10053 trace for second test of table T1 in the comments section: OR11_ora_4400_WhyFullTableScanTest – Edit: this is a PDF file, WordPress is not allowing the double extention trick nor Word formatted files for some reason.)





Column Order in a Table – Does it Matter? 2

24 05 2010

May 24, 2010 (Updated May 24, 2010)

(Back to the Previous Post in the Series)

In the previous blog article in this series I created a test case with a table containing 47 columns.  The PL/SQL test precedure experienced 15% to 22% longer execution times when a column with the same definition, just buried 40 columns deeper into the table definition, was referenced by the PL/SQL script.  Could the problem be much worse than 20% – possibly performance problems in the anonymous PL/SQL testing scripts were hiding the true performance impact due to the overhead that did not change from one test to the other (moving the SQL data into a PL/SQL table, for instance).

Today’s blog article provides a demonstration what of happens, performance wise, when very wide tables (having many column) are accessed.  Oracle tables support up to 1,000 columns, however, only 255 columns will fit into a single row piece.  A 1,000 column table, therefore, will use at least 4 row pieces per row inserted into the table.  (I have not yet determined if this is still true if, for instance, the last 800 columns are all NULL, with the rest of the columns sized small enough to fit into a single block).  What performance impact, if any, will appear in the test when accessing a table’s columns that are beyond the first couple of columns?  Will we see the problem identified on page 537 of the Troubleshooting Oracle Performance book?

Creating a test script for a table with 1,000 columns potentially requires a significant amount of typing.  To automate the script writing, an Excel macro compatible (and Visual Basic 6 compatible) script is provided below to help automate the creation of the script:

Dim intFilenum As Integer
Dim strOut As String
Dim i As Integer
Dim j As Integer
Dim k As Integer
Dim intFlag As Integer

intFilenum = FreeFile

Open "C:\ColumnOrder2Script.sql" For Output As #intFilenum

strOut = "CREATE TABLE T10 (" & vbCrLf

For i = 1 To 1000
   strOut = strOut & "  C" & Format(i) & " VARCHAR2(10)," & vbCrLf
Next i

strOut = Left(strOut, Len(strOut) - 3) & ");" & vbCrLf

Print #intFilenum, strOut

Print #intFilenum, "ALTER TABLE T10 CACHE;" & vbCrLf

strOut = "INSERT INTO" & vbCrLf
strOut = strOut & "  T10" & vbCrLf
strOut = strOut & "SELECT" & vbCrLf

For i = 1 To 1000
    intFlag = False
    If i Mod 50 = 0 Then
        intFlag = True
    Else
        Select Case i
            Case 1, 255, 256, 510, 511, 765, 766, 1000
                intFlag = True
        End Select
    End If
    If intFlag = True Then
        strOut = strOut & "  RPAD(CHR(65+MOD(ROWNUM-1,20))||'" & Format(i) & "',10,'A')," & vbCrLf
    Else
        strOut = strOut & "  NULL," & vbCrLf
    End If
Next i

strOut = Left(strOut, Len(strOut) - 3) & vbCrLf
strOut = strOut & "FROM" & vbCrLf
strOut = strOut & "  DUAL" & vbCrLf
strOut = strOut & "CONNECT BY" & vbCrLf
strOut = strOut & "  LEVEL<=1000000;" & vbCrLf

Print #intFilenum, strOut

Print #intFilenum, "COMMIT;" & vbCrLf
Print #intFilenum, "EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T10',CASCADE=>TRUE)"
Print #intFilenum, ""
'Print #intFilenum, "SET AUTOTRACE TRACEONLY STATISTICS"
Print #intFilenum, ""
Print #intFilenum, "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER, LEVEL 1';"
Print #intFilenum, "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'IGNORE';"
Print #intFilenum, "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';" & vbCrLf

For j = 1 To 2
    If j = 2 Then
        Print #intFilenum, "SET TIMING ON"
        Print #intFilenum, "SPOOL COLUMN_ORDER2.TXT"
        Print #intFilenum, "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'COLUMN_ORDER_TEST2';" & vbCrLf
    End If
    For i = 1 To 1000
        intFlag = False
        If i Mod 50 = 0 Then
            intFlag = True
        Else
            Select Case i
                Case 1, 255, 256, 510, 511, 765, 766, 1000
                    intFlag = True
            End Select
        End If
        If intFlag = True Then
            For k = 1 To 10
                If (j = 2) Or (k <= 3) Then
                    'If on the second pass - the one that counts, run SQL 10 times
                    strOut = "SELECT" & vbCrLf
                    strOut = strOut & "  COUNT(C" & Format(i) & ") C" & vbCrLf
                    strOut = strOut & "FROM" & vbCrLf
                    strOut = strOut & "  T10;" & vbCrLf
                    Print #intFilenum, strOut
                End If
            Next k
        End If
    Next i
Next j

Print #intFilenum, "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';"
Print #intFilenum, "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT OFF';"
Print #intFilenum, "SET TIMING OFF"
'Print #intFilenum, "SET AUTOTRACE OFF"

Close #intFilenum

————————————————————–

For those with a Windows client computer that does not have Excel installed, the following script may be saved with a VBS extension and executed directly from Windows to generate the test script.

Dim objFSO
Dim objFile
Dim strOut
Dim i
Dim j
Dim k
Dim intFlag

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.CreateTextFile("C:\ColumnOrder2Script.sql", True)

strOut = "CREATE TABLE T10 (" & vbCrLf

For i = 1 To 1000
   strOut = strOut & "  C" & i & " VARCHAR2(10)," & vbCrLf
Next

strOut = Left(strOut, Len(strOut) - 3) & ");" & vbCrLf

objFile.Write strOut & vbCrLf

objFile.Write "ALTER TABLE T10 CACHE;" & vbCrLf & vbCrLf

strOut = "INSERT INTO" & vbCrLf
strOut = strOut & "  T10" & vbCrLf
strOut = strOut & "SELECT" & vbCrLf

For i = 1 To 1000
    intFlag = False
    If i Mod 50 = 0 Then
        intFlag = True
    Else
        Select Case i
            Case 1, 255, 256, 510, 511, 765, 766, 1000
                intFlag = True
        End Select
    End If
    If intFlag = True Then
        strOut = strOut & "  RPAD(CHR(65+MOD(ROWNUM-1,20))||'" & i & "',10,'A')," & vbCrLf
    Else
        strOut = strOut & "  NULL," & vbCrLf
    End If
Next

strOut = Left(strOut, Len(strOut) - 3) & vbCrLf
strOut = strOut & "FROM" & vbCrLf
strOut = strOut & "  DUAL" & vbCrLf
strOut = strOut & "CONNECT BY" & vbCrLf
strOut = strOut & "  LEVEL<=1000000;" & vbCrLf

objFile.Write strOut & vbCrLf

objFile.Write "COMMIT;" & vbCrLf & vbCrLf
objFile.Write "EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T10',CASCADE=>TRUE)" & vbCrLf
objFile.Write "" & vbCrLf
'objFile.Write "SET AUTOTRACE TRACEONLY STATISTICS" & vbCrLf
objFile.Write "" & vbCrLf
objFile.Write "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER, LEVEL 1';" & vbCrLf
objFile.Write "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'IGNORE';" & vbCrLf
objFile.Write "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';" & vbCrLf & vbCrLf

For j = 1 To 2
    If j = 2 Then
        objFile.Write "SET TIMING ON" & vbCrLf
        objFile.Write "SPOOL COLUMN_ORDER2.TXT" & vbCrLf
        objFile.Write "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'COLUMN_ORDER_TEST2';" & vbCrLf & vbCrLf
    End If
    For i = 1 To 1000
        intFlag = False
        If i Mod 50 = 0 Then
            intFlag = True
        Else
            Select Case i
                Case 1, 255, 256, 510, 511, 765, 766, 1000
                    intFlag = True
            End Select
        End If
        If intFlag = True Then
            For k = 1 To 10
                If (j = 2) Or (k <= 3) Then
                    'If on the second pass - the one that counts, run SQL 10 times
                    strOut = "SELECT" & vbCrLf
                    strOut = strOut & "  COUNT(C" & i & ") C" & vbCrLf
                    strOut = strOut & "FROM" & vbCrLf
                    strOut = strOut & "  T10;" & vbCrLf
                    objFile.Write strOut & vbCrLf
                End If
            Next
        End If
    Next
Next

objFile.Write "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';" & vbCrLf
objFile.Write "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT OFF';" & vbCrLf
objFile.Write "SET TIMING OFF" & vbCrLf
'objFile.Write "SET AUTOTRACE OFF" & vbCrLf

objFile.Close

set objFSO = Nothing
Set objFile = Nothing

ColumnOrder2ScriptCreator.vbs

————————————————————–

The above script might be a little complicated to understand.  What the script does:

  • Create a 1,000 column table with columns defined as VARCHAR2(10) – the columns are named C1, C2, C3, …, C999, C1000.
  • Alters the table to hopefully instruct Oracle to keep the table’s blocks in the buffer cache after those block have been read.
  • Inserts a letter between “A” and “Z” followed by the column position, padded to 10 characters, in every 50th column, as well as the columns that should mark the start column and end column of each row piece (columns 1, 255, 256, 510, 511, 765, 766, 1000).
  • Gathers statistics on the table and any indexes (none in this case).
  • Disables serial direct path read by setting event 10949 (try taking that line out of the script to see how the performance changes).
  • Creates essentially 2 stages of querying the COUNT of every 50th column, in addition to the columns that should represent the end points of each row piece (1, 255, 256, 510, 511, 765, 766, 1000).  The first stage is used to load the blocks into the buffer cache and allow Oracle’s optimizer to settle on the final execution plan (this seems to be necessary on Oracle Database 11.2.0.1), each query is executed three times in this stage.  For the second stage, the trace file identifier (for the 10046 trace) is changed, and each query is listed 10 times to help average the execution time.

Note that the original script included a bug that did not prevent the query output from being suppressed – the SET AUTOTRACE line was incorrectly specified.  I suggest leaving that line commented out in the script generating code, but change the script so that an alias is not assigned to the COUNT() column or assign a unique alias to the column so that the timing that is output on screen may be easily related to the column that is being selected.  The generated script may also be downloaded here: ColumnOrder2Script.sql (strip off the .doc extension and open with a text editor, or run directly with SQL*Plus).

If you add a /*+ Find Me 2 */ comment to each of the SQL statements and change the STATISTICS_LEVEL parameter to ALL, you can use the following to retrieve the actual execution plans and execution statistics:

SET TIMING OFF
SET PAGESIZE 1000
SET LINESIZE 160

SELECT /*+ LEADING(S) */
  T.PLAN_TABLE_OUTPUT
FROM
  (SELECT
    SQL_ID,
    CHILD_NUMBER
  FROM
    V$SQL
  WHERE
    SQL_TEXT LIKE 'SELECT /*+ Find Me 2 */%') S,
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(S.SQL_ID,S.CHILD_NUMBER,'ALLSTATS LAST +COST')) T;

SPOOL OFF

So, what were the results when the script was executed on 64 bit Oracle 11.2.0.1 on the Linux and Windows platforms (with equivalent hardware) using an 8KB block size in an ASSM autoallocate tablespace?

The spool files:

The TKPROF summaries of the 10046 trace files:

In the following summary table, the TKPROF elapsed time totals for the 10 executions of counting each of the non-NULL VARCHAR2(10) columns in the test table is displayed, as well as the elapsed time for that execution divided by the elapsed time for the count of the first column (updated May 24, 2010 to include results from 64 bit Windows running 11.1.0.7 for comparison):

The Factor of C1 column indicates how many times longer the count of that column required compared to the first column.  Counting column 1,000 on Linux required 8.2 times as much time as counting column 1.  On the Windows platform, counting column 1,000 required 7.92 times as much time as counting column C1.  If you review the TKPROF summaries for both platforms you will see an interesting, possibly unexpected, behavior with the way the number of consistent gets increased the further away from the first column our query counted.  The query of column C1 required 1,819,010 consistent gets for 10 executions, while the query of column C1000 required 31,819,010 consistent gets for 10 executions – when you consider that each row required roughly 1300 bytes, that is a significant number of consistent gets for a table whose rows never expanded in size after the initial insert.  The increase in consistent gets is not linear, and neither is the increase in execution time.





Column Order in a Table – Does it Matter? 1

22 05 2010

May 22, 2010 (Updated May 23, 2010)

(Forward to the Next Post in the Series)

This week, between periodic outages, a couple of message threads appeared on the OTN forums that caught my attention.  The first is a thread in which the original poster asked how to add a column between two existing columns in a table – a task that is fairly easy to do in a database such as Microsoft Access, but is not possible without recreating the table in an Oracle database.  When I first started working with Oracle databases, I too wanted to see the columns listed in a specific, logical order in the tables that I created.  The general response in the thread was that the column order did not matter because it is possible to select the table’s columns in whatever order is preferred.

The second thread included a question asking whether column order could potentially affect the performance of a query.  Responders in the thread provided several different answers that boiled down to:

  • Not a chance.
  • Yes with the rule based optimizer, but not with the cost-based optimizer
  • Maybe

In 2008 I read the book Troubleshooting Oracle Performance, and recently started re-reading the book.  In the book I recall reading a couple of paragraphs that suggested that the order of the columns in a table may, in fact, affect the performance of queries.  How?

  • The calculated CPU_COST for a line in an execution plan increases – accessing the first column will have a slightly lower CPU_COST than would accessing the second column in a table.  This change in the calculated CPU_COST could cause a different execution plan to be generated.
  • With variable length columns in a table, Oracle is not able to quickly determine the starting byte location of the tenth column without first determining the byte length of the first nine columns.  While this calculation may not seem significant, it can increase execution time, especially for servers with slower CPUs and higher latency memory.
  • Edit: See the first comment about row chaining, written by Chris Antognini below.

Page 529 of the book Troubleshooting Oracle Performance includes a helpful chart that shows that, for a particular query, accessing the 250th column in a table required roughly five times as long as accessing the first column.  There are of course a lot of variables at play including:

  • The speed of the server.
  • The table design – what type of columns were in the table, what were the column lengths, and how many columns were in the table.
  • The background activity in the server due to the other sessions competing for CPU, memory, and disk access.
  • The number of times the test was repeated.
  • Were all of the blocks in memory, or were physical reads required.

I commented in the second thread, stating that a couple people have reported that their testing found that the position of a column in a table could affect performance, and that is why the optimizer generates a different CPU_COST component for each column in a table.  I did not have an independent test case to verify the results that others obtained, so I posted links to the results that were provided by other people who have tested a column position’s effect on performance.  Was it a bad choice for me to provide an answer without first verifying that it could happen?

I think that we need to put together a test case to see if the column order really matters.  It is a little challenging to put together a test case where the outcome is not affected by spurious network interference, client-side delays, execution statistics calculations, Oracle Database release specific optimizations, and other similar issues.  For example, if you execute a SQL statement twice from a client on an otherwise idle server and see that the number of consistent gets increased on the second execution, that could throw off the accuracy of any intended benchmark test (this behavior was encountered on the 64 bit version of Oracle Database 11.2.0.1).

So, how to test?  We will create a test script that uses PL/SQL to repeatedly execute the same SQL statement, potentially with different WHERE clause predicates, to limit the effects of adaptive cursor sharing and the various other potential problem areas.  The server will run the 64 bit version of Oracle Database 11.2.0.1 on the Linux and Windows platforms with essentially the same hardware to help determine the effects of excessive context switching, and potential optimizations on each operating system platform – there may still be too many variables to state that one operating system platform is more efficient than the other for this specific test script.   The database instance will be configured with the SGA_TARGET set to 8000M, and the FILESYSTEMIO_OPTIONS parameter set to SETALL to enable direct, asychronous I/O (the default on the Windows platform).  As best as possible, this test will seek to factor out delays caused by physical reads, so event 10949 will be set to limit the chances of Oracle switching to physical direct path reads during the portion of the test that performs full table scans – ideally, the entire table and indexes will reside in the buffer cache.

First, we will create a test table with the same set of column definitions in the first couple of columns of the table as are present in the last columns of the table, with several VARCHAR2 columns in between.  Several indexes will be created which may be used for additional tests to determine what happens when the table columns referenced by an index are not stored in numerical/alphabetic order – not all of the indexes will be used for this test.

CREATE TABLE T11 (
  N0 NUMBER,
  N1 NUMBER,
  ND2 NUMBER,
  V3 VARCHAR2(30),
  VD4 VARCHAR2(30),
  D5 DATE,
  DD6 DATE,
  V7 VARCHAR2(50),
  V8 VARCHAR2(50),
  V9 VARCHAR2(50),
  V10 VARCHAR2(50),
  V11 VARCHAR2(50),
  V12 VARCHAR2(50),
  V13 VARCHAR2(50),
  V14 VARCHAR2(50),
  V15 VARCHAR2(50),
  V16 VARCHAR2(50),
  V17 VARCHAR2(50),
  V18 VARCHAR2(50),
  V19 VARCHAR2(50),
  V20 VARCHAR2(50),
  V21 VARCHAR2(50),
  V22 VARCHAR2(50),
  V23 VARCHAR2(50),
  V24 VARCHAR2(50),
  V25 VARCHAR2(50),
  V26 VARCHAR2(50),
  V27 VARCHAR2(50),
  V28 VARCHAR2(50),
  V29 VARCHAR2(50),
  V30 VARCHAR2(50),
  V31 VARCHAR2(50),
  V32 VARCHAR2(50),
  V33 VARCHAR2(50),
  V34 VARCHAR2(50),
  V35 VARCHAR2(50),
  V36 VARCHAR2(50),
  V37 VARCHAR2(50),
  V38 VARCHAR2(50),
  V39 VARCHAR2(50),
  V40 VARCHAR2(50),
  N41 NUMBER,
  ND42 NUMBER,
  V43 VARCHAR2(30),
  VD44 VARCHAR2(30),
  D45 DATE,
  DD46 DATE);

INSERT INTO
  T11
SELECT
  ROWNUM,
  ROWNUM,
  1000000-ROWNUM,
  CHR(MOD(ROWNUM-1,26)+65)||CHR(MOD(ROWNUM,26)+65)||CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65)||CHR(MOD(ROWNUM+3,26)+65)||CHR(MOD(ROWNUM+4,26)+65),
  REVERSE(CHR(MOD(ROWNUM-1,26)+65)||CHR(MOD(ROWNUM,26)+65)||CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65)||CHR(MOD(ROWNUM+3,26)+65)||CHR(MOD(ROWNUM+4,26)+65)),
  TRUNC(SYSDATE) + ROWNUM/1000,
  TRUNC(SYSDATE) + (1000000-ROWNUM)/1000,
  TO_CHAR(ROWNUM),
  TO_CHAR(1000000-ROWNUM),
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  TO_CHAR(ROWNUM),
  TO_CHAR(1000000-ROWNUM),
  ROWNUM,
  1000000-ROWNUM,
  CHR(MOD(ROWNUM-1,26)+65)||CHR(MOD(ROWNUM,26)+65)||CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65)||CHR(MOD(ROWNUM+3,26)+65)||CHR(MOD(ROWNUM+4,26)+65),
  REVERSE(CHR(MOD(ROWNUM-1,26)+65)||CHR(MOD(ROWNUM,26)+65)||CHR(MOD(ROWNUM+1,26)+65)||
    CHR(MOD(ROWNUM+2,26)+65)||CHR(MOD(ROWNUM+3,26)+65)||CHR(MOD(ROWNUM+4,26)+65)),
  TRUNC(SYSDATE) + ROWNUM/1000,
  TRUNC(SYSDATE) + (1000000-ROWNUM)/1000
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000) V1,
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000) V2;

COMMIT;

CREATE INDEX T11_N1 ON T11(N1);
CREATE INDEX T11_N41 ON T11(N41);
CREATE INDEX T11_ND2 ON T11(ND2);
CREATE INDEX T11_ND42 ON T11(ND42);
CREATE INDEX T11_V3 ON T11(V3);
CREATE INDEX T11_VD4 ON T11(VD4);
CREATE INDEX T11_V43 ON T11(V43);
CREATE INDEX T11_VD44 ON T11(VD44);
CREATE INDEX T11_D5 ON T11(D5);
CREATE INDEX T11_DD6 ON T11(DD6);
CREATE INDEX T11_D45 ON T11(D45);
CREATE INDEX T11_DD46 ON T11(DD46);

ALTER TABLE T11 CACHE;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T11',CASCADE=>TRUE)

In the above, notice that a NULL value was assigned to most of the columns located between the two repeating set of column definitions.  1,000,000 rows were inserted into the table.

The testing script that we will use follows.  Essentially, the script is composed of serveral anonymous PL/SQL blocks that compare the execution performance of repeatedly selecting from the first set of columns with the execution performance of repeatedly selecting from the last set of columns.  The first set of tests attempts to test performance using a large number of index range scan accesses, while the second set of tests attempts to test performance using a smaller number of full table scan accesses.

The test script will be repeated 3 times: 1) to flood the buffer cache with the blocks used by the tests (these results will be discarded), 2) with 10046 trace at level 8 enabled, 3) without a 10046 trace enabled.

SET TIMING ON
SET SERVEROUTPUT ON SIZE 1000000

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'COLUMN_POSITION_TEST';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

SPOOL COLUMN_POSITION_TEST.TXT

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor(intVALUE IN NUMBER) IS
    SELECT /*+ FIND_ME */
      N1,
      ND2
    FROM
      T11
    WHERE
      N1 BETWEEN intVALUE AND INTVALUE+10000;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100000 LOOP
    OPEN cuCursor(I);

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('N1 ND2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor(intVALUE IN NUMBER) IS
    SELECT /*+ FIND_ME */
      N41,
      ND42
    FROM
      T11
    WHERE
      N41 BETWEEN intVALUE AND INTVALUE+10000;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100000 LOOP
    OPEN cuCursor(I);

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('N41 ND42: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor(intVALUE IN NUMBER) IS
    SELECT /*+ FIND_ME */
      V3,
      VD4
    FROM
      T11
    WHERE
      N1 BETWEEN intVALUE AND INTVALUE+10000;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100000 LOOP
    OPEN cuCursor(I);

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('V3 VD4: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor(intVALUE IN NUMBER) IS
    SELECT /*+ FIND_ME */
      V43,
      VD44
    FROM
      T11
    WHERE
      N41 BETWEEN intVALUE AND INTVALUE+10000;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100000 LOOP
    OPEN cuCursor(I);

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('V43 VD44: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor(intVALUE IN NUMBER) IS
    SELECT /*+ FIND_ME */
      D5,
      DD6
    FROM
      T11
    WHERE
      N1 BETWEEN intVALUE AND INTVALUE+10000;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100000 LOOP
    OPEN cuCursor(I);

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('D5 DD6: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor(intVALUE IN NUMBER) IS
    SELECT /*+ FIND_ME */
      D45,
      DD46
    FROM
      T11
    WHERE
      N41 BETWEEN intVALUE AND INTVALUE+10000;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100000 LOOP
    OPEN cuCursor(I);

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('D45 DD46: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER, LEVEL 1';

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor IS
    SELECT /*+ FIND_ME */
      N1,
      ND2
    FROM
      T11;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100 LOOP
    OPEN cuCursor;

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('N1 ND2 2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor IS
    SELECT /*+ FIND_ME */
      N41,
      ND42
    FROM
      T11;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100 LOOP
    OPEN cuCursor;

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('N41 ND42 2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor IS
    SELECT /*+ FIND_ME */
      V3,
      VD4
    FROM
      T11;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100 LOOP
    OPEN cuCursor;

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('V3 VD4 2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor IS
    SELECT /*+ FIND_ME */
      V43,
      VD44
    FROM
      T11;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100 LOOP
    OPEN cuCursor;

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('V43 VD44 2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor IS
    SELECT /*+ FIND_ME */
      D5,
      DD6
    FROM
      T11;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100 LOOP
    OPEN cuCursor;

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('D5 DD6 2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

DECLARE
  tsStart TIMESTAMP := SYSTIMESTAMP;

  CURSOR cuCursor IS
    SELECT /*+ FIND_ME */
      D45,
      DD46
    FROM
      T11;

  TYPE TYPE_cuCursor IS TABLE OF cuCursor%ROWTYPE
    INDEX BY BINARY_INTEGER;

  T_cuCursor TYPE_cuCursor;

BEGIN
  FOR I IN 1..100 LOOP
    OPEN cuCursor;

    LOOP
      FETCH cuCursor BULK COLLECT INTO T_cuCursor LIMIT 1000;
      EXIT WHEN T_cuCursor.COUNT = 0;

      NULL;
    END LOOP;

    CLOSE cuCursor;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('D45 DD46 2: '||TO_CHAR(SYSTIMESTAMP-tsStart));
END;
/

ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT OFF';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';

SPOOL OFF

What were the test results, does column position in a table matter?  Is performance improved by locating frequently accessed columns in the first couple of column positions?  Does the operating system platform matter (there are a lot of factors contributing to any time differences)?

The script output should look something like this, which is the output from the Linux execution of the script with a 10046 trace enabled (blank spaces between lines were adjusted to group related time values):

N1 ND2: +000000000 00:12:14.750567000
PL/SQL procedure successfully completed.
Elapsed: 00:12:14.77

N41 ND42: +000000000 00:14:13.846775000
PL/SQL procedure successfully completed.
Elapsed: 00:14:13.87

V3 VD4: +000000000 00:12:28.702281000
PL/SQL procedure successfully completed.
Elapsed: 00:12:28.72

V43 VD44: +000000000 00:14:28.663870000
PL/SQL procedure successfully completed.
Elapsed: 00:14:28.68

D5 DD6: +000000000 00:11:58.169894000
PL/SQL procedure successfully completed.
Elapsed: 00:11:58.19

D45 DD46: +000000000 00:13:52.969594000
PL/SQL procedure successfully completed.
Elapsed: 00:13:52.98

Session altered.

N1 ND2 2: +000000000 00:00:41.202708000
PL/SQL procedure successfully completed.
Elapsed: 00:00:41.23

N41 ND42 2: +000000000 00:00:47.222258000
PL/SQL procedure successfully completed.
Elapsed: 00:00:47.22

V3 VD4 2: +000000000 00:00:42.662874000
PL/SQL procedure successfully completed.
Elapsed: 00:00:42.68

V43 VD44 2: +000000000 00:00:50.135338000
PL/SQL procedure successfully completed.
Elapsed: 00:00:50.13

D5 DD6 2: +000000000 00:00:38.307032000
PL/SQL procedure successfully completed.
Elapsed: 00:00:38.31

D45 DD46 2: +000000000 00:00:47.696470000
PL/SQL procedure successfully completed.
Elapsed: 00:00:47.70

In the above, the N1 and ND2 test run shows the performance when the second and third columns, both of which are numbers, were accessed using the index on the N1 column.  The N41 and ND42 test run shows the performance when the 42nd and 43rd columns, both of which are numbers, were accessed using the index on the N41 column.  Thus, the N indicates a numeric column, the V indicates a varchar2 column, and the D indicates a date column, all accessed with either the index on  the N1 column or N41 column.

Below is a summary of the test results when accessing the various columns.

The following code may be used to retrieve the execution plans just to confirm that the expected execution plan was selected, but we have something much better – a 10046 trace file for the execution.

SET TIMING OFF
SET PAGESIZE 1000
SET LINESIZE 160
SPOOL COL_POSITION_PLANS.TXT

SELECT /*+ LEADING(S) */
  T.PLAN_TABLE_OUTPUT
FROM
  (SELECT
    SQL_ID,
    CHILD_NUMBER
  FROM
    V$SQL
  WHERE
    SQL_TEXT LIKE 'SELECT /*+ FIND_ME */%') S,
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(S.SQL_ID,S.CHILD_NUMBER,'TYPICAL')) T;

SPOOL OFF

The TKPROF output for the Linux and Windows executions are attached below.  When downloading, strip off the .doc extension and open the files with a text viewer (Notepad, vi, jed, etc.).

ColumnOrder1TKPROF_LinuxX64.txt
ColumnOrder1TKPROF_WindowsX64.txt

So, how much of an impact does the column order have in your environment?  Oracle databases support up to 1000 columns per table.  How might the performance of selecting data from the second column compare to selecting data from the 902nd column?  How might the outcome change if physical block reads were required?  How might the outcome change if there was significant competition for the CPU cycles in the server?

Feel free to modify and extend the test script.





Improving Performance by Using a Cartesian Join 2

18 05 2010

May 18, 2010

(Back to the Previous Post in the Series)

For a while I have wondered why Cartesian joins might happen when all necessary join conditions are provided in a SQL statement.  Sure, an explain plan will show that a Cartesian join is used when Oracle’s optimizer believes that one of the row sources will return a single row, as seen in this example from a discussion forum:

SELECT
  TBLPERSON.SNAME,
  TBLPERSON.FNAME,
  TO_CHAR(TBLWARDSHIFTS.SHIFTSTART,'HH24:MI') AS REQSTART,
  TO_CHAR(TBLWARDSHIFTS.SHIFTEND,'HH24:MI') AS REQEND,
  TBLREQUIREMENTS.RDATE
FROM
  TBLAVAILABILITY,
  TBLREQUIREMENTS,
  TBLWARDSHIFTS,
  TBLPERSON
WHERE
  TBLAVAILABILITY.ADATE = TBLREQUIREMENTS.RDATE
  AND TBLAVAILABILITY.PERSONID = TBLPERSON.PERSONID
  AND TBLREQUIREMENTS.WARDSHIFTID = TBLWARDSHIFTS.WARDSHIFTID
  AND TO_CHAR(TBLREQUIREMENTS.RDATE,'MM')=TO_CHAR(SYSDATE,'MM')
  AND TO_NUMBER(TO_CHAR(TBLWARDSHIFTS.SHIFTSTART,'HH24'))>=15
  AND TO_NUMBER(TO_CHAR(TBLWARDSHIFTS.SHIFTSTART,'HH24'))<20
  AND TBLAVAILABILITY.ANYLATE=1;

---------------------------------------------------------------------------------------------------------
| Id | Operation                      | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)|Time      |
---------------------------------------------------------------------------------------------------------
|  1 |    NESTED LOOPS                |                 |       |       |       |            |          |
|  2 |     NESTED LOOPS               |                 |   233 | 15611 |       |  1469   (2)| 00:00:18 |
|* 3 |      HASH JOIN                 |                 |   233 | 10951 |       |  1003   (3)| 00:00:13 |
|* 4 |       TABLE ACCESS FULL        | TBLREQUIREMENTS |  5031 | 60372 |       |   980   (3)| 00:00:12 |
|  5 |       MERGE JOIN CARTESIAN     |                 |  3061 |   104K|       |    22   (0)| 00:00:01 |
|* 6 |        TABLE ACCESS FULL       | TBLWARDSHIFTS   |     1 |    20 |       |     3   (0)| 00:00:01 |
|  7 |        BUFFER SORT             |                 |  4639 | 69585 |       |    19   (0)| 00:00:01 |
|* 8 |         TABLE ACCESS FULL      | TBLAVAILABILITY |  4639 | 69585 |       |    19   (0)| 00:00:01 |
|* 9 |      INDEX UNIQUE SCAN         | PK5             |     1 |       |       |     1   (0)| 00:00:01 |
| 10 |     TABLE ACCESS BY INDEX ROWID| TBLPERSON       |     1 |    20 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - access("TBLAVAILABILITY"."ADATE"="TBLREQUIREMENTS"."RDATE" AND
      "TBLREQUIREMENTS"."WARDSHIFTID"="TBLWARDSHIFTS"."WARDSHIFTID")
  4 - filter(TO_CHAR(INTERNAL_FUNCTION("TBLREQUIREMENTS"."RDATE"),'MM')=TO_CHAR(SYSDATE@!,'MM'))
  6 - filter(TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("TBLWARDSHIFTS"."SHIFTSTART"),'HH24'))>=15 AND
      TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("TBLWARDSHIFTS"."SHIFTSTART"),'HH24'))<20)
  8 - filter("TBLAVAILABILITY"."ANYLATE"=1)
  9 - access("TBLAVAILABILITY"."PERSONID"="TBLPERSON"."PERSONID")

We can see that all of the tables are linked:

TBLAVAILABILITY ~ TBLREQUIREMENTS
TBLAVAILABILITY ~ TBLPERSON
TBLREQUIREMENTS ~ TBLWARDSHIFTS

So, what could be leading Oracle to calculate that the predicates applied to TBLWARDSHIFTS will cause only a single row to be returned?  Might it be the functions that are applied to the TBLWARDSHIFTS.SHIFTSTART column and/or the two inequalities applied to that column in the WHERE clause?  What about out of date statistics?  Certainly, a 10053 trace might help solve part of the mystery.  An interesting comment by Tanel Poder in this blog article suggests that a MERGE JOIN CARTESIAN is similar to a NESTED LOOP operation, just without filtering, so maybe Cartesian joins are not all that bad.

The above example shows a case where a Cartesian join may hurt performance, but can it also help performance?  In this blog article comment I provided a test case that included unhinted and hinted (forced) execution plans with and without Cartesian joins.  For example, in one 11.1.0.7 database, the following SQL statement using the tables T1, T2, T3, and T4 from my previous blog article comment:

SELECT
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3) 
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

Produced this execution plan:

Plan hash value: 348785823

---------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  2000K|   427M|       | 24447   (2)| 00:01:39 |
|*  1 |  HASH JOIN             |      |  2000K|   427M|  2488K| 24447   (2)| 00:01:39 |
|*  2 |   HASH JOIN            |      | 20000 |  2246K|       | 10984   (2)| 00:00:45 |
|   3 |    MERGE JOIN CARTESIAN|      |     2 |    12 |       |     5  (20)| 00:00:01 |
|   4 |     SORT UNIQUE        |      |     2 |     6 |       |     2   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL | T4   |     2 |     6 |       |     2   (0)| 00:00:01 |
|   6 |     BUFFER SORT        |      |     2 |     6 |       |            |          |
|   7 |      SORT UNIQUE       |      |     2 |     6 |       |     2   (0)| 00:00:01 |
|   8 |       TABLE ACCESS FULL| T3   |     2 |     6 |       |     2   (0)| 00:00:01 |
|   9 |    TABLE ACCESS FULL   | T2   |  1000K|   103M|       | 10967   (1)| 00:00:45 |
|  10 |   TABLE ACCESS FULL    | T1   |  1000K|   103M|       | 10967   (1)| 00:00:45 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

Interesting, a merge join Cartesian operation between two unrelated tables – the tables in the subqueries that applied to tables T1 and T2.  Another Oracle 11.1.0.7 database produced this execution plan for the same unhinted SQL statement:

Plan hash value: 1754840566

----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes |TempSpc|Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |  4000K|   854M|       |42042   (4)| 00:08:25 |
|*  1 |  HASH JOIN RIGHT SEMI          |           |  4000K|   854M|       |42042   (4)| 00:08:25 |
|   2 |   TABLE ACCESS FULL            | T3        |     2 |     6 |       |    3   (0)| 00:00:01 |
|*  3 |   HASH JOIN                    |           |   200M|    41G|  2424K|41385   (2)| 00:08:17 |
|   4 |    NESTED LOOPS                |           |       |       |       |           |          |
|   5 |     NESTED LOOPS               |           | 20000 |  2187K|       |10026   (1)| 00:02:01 |
|   6 |      SORT UNIQUE               |           |     2 |     6 |       |    3   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL        | T4        |     2 |     6 |       |    3   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN          | IND_T2_C2 | 10000 |       |       |   20   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID| T2        | 10000 |  1064K|       |10022   (1)| 00:02:01 |
|  10 |    TABLE ACCESS FULL           | T1        |  1000K|   103M|       |24864   (1)| 00:04:59 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="C1")
   3 - access("T1"."C2"="T2"."C2")
   8 - access("T2"."C2"="C1")

Notice that the MERGE JOIN CARTESIAN operation did not appear this time.  Despite the output of the Time column, the second server is considerably faster than the first.  To obtain the same execution plan on the second server I had to supply the following hint:

/*+ LEADING(T4, T3, T2, T1) USE_HASH(T1) USE_HASH(T2) */

With the above hint added to the SQL statement, the execution plan on the second server looked like this:

Plan hash value: 348785823

--------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |  4000K|   854M|       | 55716   (1)|00:11:09 |
|*  1 |  HASH JOIN             |      |  4000K|   854M|  4968K| 55716   (1)|00:11:09 |
|*  2 |   HASH JOIN            |      | 40000 |  4492K|       | 24874   (1)|00:04:59 |
|   3 |    MERGE JOIN CARTESIAN|      |     4 |    24 |       |     7  (15)|00:00:01 |
|   4 |     SORT UNIQUE        |      |     2 |     6 |       |     3   (0)|00:00:01 |
|   5 |      TABLE ACCESS FULL | T4   |     2 |     6 |       |     3   (0)|00:00:01 |
|   6 |     BUFFER SORT        |      |     2 |     6 |       |     4  (25)|00:00:01 |
|   7 |      SORT UNIQUE       |      |     2 |     6 |       |     3   (0)|00:00:01 |
|   8 |       TABLE ACCESS FULL| T3   |     2 |     6 |       |     3   (0)|00:00:01 |
|   9 |    TABLE ACCESS FULL   | T2   |  1000K|   103M|       | 24864   (1)|00:04:59 |
|  10 |   TABLE ACCESS FULL    | T1   |  1000K|   103M|       | 24864   (1)|00:04:59 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

As you can see, the predicted execution time is longer.  But, how accurate is the time prediction?  To make this a little more interesting, let’s try a couple of more hints on the second server:

/*+ LEADING(T1, T3, T4) */

Execution Plan
----------------------------------------------------------
Plan hash value: 3921125646

--------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |  2020K|   431M|       | 72263   (1)|00:14:28 |
|*  1 |  HASH JOIN            |      |  2020K|   431M|  2512K| 72263   (1)|00:14:28 |
|   2 |   MERGE JOIN CARTESIAN|      | 20202 |  2268K|       | 41547   (1)|00:08:19 |
|*  3 |    HASH JOIN SEMI     |      | 10101 |  1104K|   115M| 30593   (1)|00:06:08 |
|   4 |     TABLE ACCESS FULL | T1   |  1000K|   103M|       | 24864   (1)|00:04:59 |
|   5 |     TABLE ACCESS FULL | T3   |     2 |     6 |       |     3   (0)|00:00:01 |
|   6 |    BUFFER SORT        |      |     2 |     6 |       | 41544   (1)|00:08:19 |
|   7 |     SORT UNIQUE       |      |     2 |     6 |       |     1   (0)|00:00:01 |
|   8 |      TABLE ACCESS FULL| T4   |     2 |     6 |       |     1   (0)|00:00:01 |
|   9 |   TABLE ACCESS FULL   | T2   |  1000K|   103M|       | 24864   (1)|00:04:59 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T2"."C2"="C1")
   3 - access("T1"."C2"="C1")

/*+ LEADING(T1, T3, T2) */

Plan hash value: 1368374064

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  2020K|   431M|       | 56120   (2)|00:11:14 |
|*  1 |  HASH JOIN RIGHT SEMI|      |  2020K|   431M|       | 56120   (2)|00:11:14 |
|   2 |   TABLE ACCESS FULL  | T4   |     2 |     6 |       |     3   (0)|00:00:01 |
|*  3 |   HASH JOIN          |      |   101M|    20G|       | 55787   (1)|00:11:10 |
|*  4 |    HASH JOIN SEMI    |      | 10101 |  1104K|   115M| 30593   (1)|00:06:08 |
|   5 |     TABLE ACCESS FULL| T1   |  1000K|   103M|       | 24864   (1)|00:04:59 |
|   6 |     TABLE ACCESS FULL| T3   |     2 |     6 |       |     3   (0)|00:00:01 |
|   7 |    TABLE ACCESS FULL | T2   |  1000K|   103M|       | 24864   (1)|00:04:59 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."C2"="C1")
   3 - access("T1"."C2"="T2"."C2")
   4 - access("T1"."C2"="C1")

Note that there was no Cartesian join in the above execution plan – that was the first hinted execution plan on the first server that did not yield a Cartesian join.

/*+ LEADING(T3, T4) */

Execution Plan
----------------------------------------------------------
Plan hash value: 2581882832

--------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes |TempSpc|Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |  2000K|   427M|       |50767   (1)| 00:10:10 |
|*  1 |  HASH JOIN                    |           |  2000K|   427M|  2488K|50767   (1)| 00:10:10 |
|   2 |   NESTED LOOPS                |           |       |       |       |           |          |
|   3 |    NESTED LOOPS               |           | 20000 |  2246K|       |20052   (1)| 00:04:01 |
|   4 |     MERGE JOIN CARTESIAN      |           |     2 |    12 |       |    7  (15)| 00:00:01 |
|   5 |      SORT UNIQUE              |           |     2 |     6 |       |    3   (0)| 00:00:01 |
|   6 |       TABLE ACCESS FULL       | T3        |     2 |     6 |       |    3   (0)| 00:00:01 |
|   7 |      BUFFER SORT              |           |     2 |     6 |       |           |          |
|   8 |       SORT UNIQUE             |           |     2 |     6 |       |    3   (0)| 00:00:01 |
|   9 |        TABLE ACCESS FULL      | T4        |     2 |     6 |       |    3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | IND_T2_C2 | 10000 |       |       |   20   (0)| 00:00:01 |
|  11 |    TABLE ACCESS BY INDEX ROWID| T2        | 10000 |  1064K|       |10022   (1)| 00:02:01 |
|  12 |   TABLE ACCESS FULL           | T1        |  1000K|   103M|       |24864   (1)| 00:04:59 |
--------------------------------------------------------------------------------------------------

Unfortunately, for the last execution plan I did not capture the predicate information section.

We need a script to determine which plan is fastest – this script will be run directly on the second server so that we eliminate the effects of network traffic.

SET AUTOTRACE TRACEONLY STATISTICS
SET ARRAYSIZE 1000
ALTER SESSION SET STATISTICS_LEVEL='ALL';
SET TIMING ON

SPOOL SMALLEST_TEST.TXT

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T1, T3, T4) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T1, T3, T2) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T3, T4) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT /*+ LEADING(T4, T3, T2, T1) USE_HASH(T1) USE_HASH(T2) */
  T1.C1,
  T2.C1,
  T1.C3,1,10 T1_C3,
  T2.C3 T2_C3
FROM
  T1,
  T2
WHERE
  T1.C2=T2.C2
  AND T1.C2 IN (
    SELECT
      C1
    FROM
      T3)
  AND T2.C2 IN (
    SELECT
      C1
    FROM
      T4);

SPOOL OFF

So, what was the output of the script?

SQL> SELECT
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:09:15.90

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     129202  consistent gets
     101906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

---

SQL> SELECT /*+ LEADING(T1, T3, T4) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:09:59.15

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     200126  consistent gets
     181906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

---

SQL> SELECT /*+ LEADING(T1, T3, T2) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:10:18.07

---

SQL> SELECT /*+ LEADING(T3, T4) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:09:51.72

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     149245  consistent gets
     101906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

---

SQL> SELECT /*+ LEADING(T4, T3, T2, T1) USE_HASH(T1) USE_HASH(T2) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 00:08:47.07

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     200126  consistent gets
     181906  physical reads
          0  redo size
 3032560885  bytes sent via SQL*Net to client
    2200489  bytes received via SQL*Net from client
     200001  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

The result?  The heavily hinted execution plan that reproduced the execution plan containing the Cartesian join found on the first server completed roughly 29 seconds faster than the unhinted execution plan that did not use a Cartesian join.  Not all Cartesian joins are bad.

You are probably wondering what the output from the first server looked like.  I think that the best answer is that the results are “in-doubt”.  The first server was queried over the network with the default array fetch size of 15.  The results follow (note that the execution plans may be different than those found on the second server):

SQL> SELECT /*+ LEADING(T1, T3, T4) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 01:02:34.25

Statistics
----------------------------------------------------------
        114  recursive calls
          0  db block gets
     200110  consistent gets
     189480  physical reads
          0  redo size
SP2-0642: SQL*Plus internal error state 1075, context 1:5:4294967295
Unsafe to proceed
  146667044  bytes received via SQL*Net from client
   13333335  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

That output does not look good.  Should we continue?

SQL> SELECT /*+ LEADING(T1, T3, T2) */
  2    T1.C1,
  3    T2.C1,
  4    T1.C3,1,10 T1_C3,
  5    T2.C3 T2_C3
  6  FROM
  7    T1,
  8    T2
  9  WHERE
 10    T1.C2=T2.C2
 11    AND T1.C2 IN (
 12      SELECT
 13        C1
 14      FROM
 15        T3)
 16    AND T2.C2 IN (
 17      SELECT
 18        C1
 19      FROM
 20        T4);

200000000 rows selected.

Elapsed: 01:04:18.54

Statistics
----------------------------------------------------------
        115  recursive calls
          0  db block gets
     200110  consistent gets
     189480  physical reads
          0  redo size
SP2-0642: SQL*Plus internal error state 1075, context 1:5:4294967295
Unsafe to proceed
  146667044  bytes received via SQL*Net from client
   13333335  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
  200000000  rows processed

Unsafe to proceed?  OK, enough pain.  Let’s check the execution plans from the first server:

SET AUTOTRACE OFF
SET PAGESIZE 1000

SET LINESIZE 160

SELECT /*+ LEADING(S) */
  T.PLAN_TABLE_OUTPUT
FROM
  (SELECT
    SQL_ID,
    CHILD_NUMBER
  FROM
    V$SQL
  WHERE
    SQL_TEXT LIKE 'SELECT%'
    AND SQL_TEXT LIKE '%T1.C2=T2.C2%'
    AND SQL_TEXT LIKE '%T4)') S,
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(S.SQL_ID,S.CHILD_NUMBER,'ALLSTATS LAST +COST')) T;

The output (note that the STATISTICS_LEVEL was changed from TYPICAL to ALL when the queries were actually executed, hence the multiple child cursors for some of the SQL statements, and the warning that appears in the Note section for some of the execution plans):

SQL_ID  b0pk35tnkh80n, child number 0
-------------------------------------
SELECT /*+ LEADING(T1, T3, T2) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
  T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 1368374064

--------------------------------------------------------------------------------------
| Id  | Operation            | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |        | 29186 (100)|       |       |          |
|*  1 |  HASH JOIN RIGHT SEMI|      |   2020K| 29186   (9)|   935K|   935K|          |
|   2 |   TABLE ACCESS FULL  | T4   |      2 |     2   (0)|       |       |          |
|*  3 |   HASH JOIN          |      |    101M| 27997   (6)|  1949K|   945K|          |
|*  4 |    HASH JOIN SEMI    |      |  10101 | 13397   (1)|   133M|  7573K|          |
|   5 |     TABLE ACCESS FULL| T1   |   1000K| 10967   (1)|       |       |          |
|   6 |     TABLE ACCESS FULL| T3   |      2 |     2   (0)|       |       |          |
|   7 |    TABLE ACCESS FULL | T2   |   1000K| 10967   (1)|       |       |          |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."C2"="C1")
   3 - access("T1"."C2"="T2"."C2")
   4 - access("T1"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID  bkdkvvppuhta3, child number 0
-------------------------------------
SELECT   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,   T2.C3 T2_C3 FROM   T1,
T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (     SELECT       C1     FROM
   T3)   AND T2.C2 IN (     SELECT       C1     FROM       T4)

Plan hash value: 348785823

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |        | 24447 (100)|       |       |          |
|*  1 |  HASH JOIN             |      |   2000K| 24447   (2)|  3361K|   944K|          |
|*  2 |   HASH JOIN            |      |  20000 | 10984   (2)|   878K|   878K|          |
|   3 |    MERGE JOIN CARTESIAN|      |      2 |     5  (20)|       |       |          |
|   4 |     SORT UNIQUE        |      |      2 |     2   (0)| 73728 | 73728 |          |
|   5 |      TABLE ACCESS FULL | T4   |      2 |     2   (0)|       |       |          |
|   6 |     BUFFER SORT        |      |      2 |            | 73728 | 73728 |          |
|   7 |      SORT UNIQUE       |      |      2 |     2   (0)| 73728 | 73728 |          |
|   8 |       TABLE ACCESS FULL| T3   |      2 |     2   (0)|       |       |          |
|   9 |    TABLE ACCESS FULL   | T2   |   1000K| 10967   (1)|       |       |          |
|  10 |   TABLE ACCESS FULL    | T1   |   1000K| 10967   (1)|       |       |          |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID  bkdkvvppuhta3, child number 1
-------------------------------------
SELECT   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,   T2.C3 T2_C3 FROM   T1,
T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (     SELECT       C1     FROM
   T3)   AND T2.C2 IN (     SELECT       C1     FROM       T4)

Plan hash value: 348785823

------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        | 24447 (100)|    200M|00:00:17.51 |     200K|    181K|       |       |          |
|*  1 |  HASH JOIN             |      |      1 |   2000K| 24447   (2)|    200M|00:00:17.51 |     200K|    181K|  6223K|  1897K| 6684K (0)|
|*  2 |   HASH JOIN            |      |      1 |  20000 | 10984   (2)|  40000 |00:00:15.50 |   90967 |  90960 |   968K|   968K|  464K (0)|
|   3 |    MERGE JOIN CARTESIAN|      |      1 |      2 |     5  (20)|      4 |00:00:00.02 |       6 |      4 |       |       |          |
|   4 |     SORT UNIQUE        |      |      1 |      2 |     2   (0)|      2 |00:00:00.02 |       3 |      2 |  2048 |  2048 | 2048  (0)|
|   5 |      TABLE ACCESS FULL | T4   |      1 |      2 |     2   (0)|      2 |00:00:00.02 |       3 |      2 |       |       |          |
|   6 |     BUFFER SORT        |      |      2 |      2 |            |      4 |00:00:00.01 |       3 |      2 |  2048 |  2048 | 2048  (0)|
|   7 |      SORT UNIQUE       |      |      1 |      2 |     2   (0)|      2 |00:00:00.01 |       3 |      2 |  2048 |  2048 | 2048  (0)|
|   8 |       TABLE ACCESS FULL| T3   |      1 |      2 |     2   (0)|      2 |00:00:00.01 |       3 |      2 |       |       |          |
|   9 |    TABLE ACCESS FULL   | T2   |      1 |   1000K| 10967   (1)|   1000K|00:00:15.02 |   90961 |  90956 |       |       |          |
|  10 |   TABLE ACCESS FULL    | T1   |      1 |   1000K| 10967   (1)|   1000K|00:00:01.02 |     109K|  90956 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

SQL_ID  2drtx01hd21jb, child number 0
-------------------------------------
SELECT /*+ LEADING(T1, T3, T4) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
  T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 3921125646

---------------------------------------------------------------------------------------
| Id  | Operation             | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |        | 31686 (100)|       |       |          |
|*  1 |  HASH JOIN            |      |   2020K| 31686   (1)|  3420K|   944K|          |
|   2 |   MERGE JOIN CARTESIAN|      |  20202 | 18221   (1)|       |       |          |
|*  3 |    HASH JOIN SEMI     |      |  10101 | 13397   (1)|   133M|  7573K|          |
|   4 |     TABLE ACCESS FULL | T1   |   1000K| 10967   (1)|       |       |          |
|   5 |     TABLE ACCESS FULL | T3   |      2 |     2   (0)|       |       |          |
|   6 |    BUFFER SORT        |      |      2 | 18219   (1)| 73728 | 73728 |          |
|   7 |     SORT UNIQUE       |      |      2 |     0   (0)| 73728 | 73728 |          |
|   8 |      TABLE ACCESS FULL| T4   |      2 |     0   (0)|       |       |          |
|   9 |   TABLE ACCESS FULL   | T2   |   1000K| 10967   (1)|       |       |          |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T2"."C2"="C1")
   3 - access("T1"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

SQL_ID  2drtx01hd21jb, child number 1
-------------------------------------
SELECT /*+ LEADING(T1, T3, T4) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
  T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 3921125646

------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        | 31686 (100)|    200M|00:00:29.65 |     200K|    189K|  13702 |       |       |          |         |
|*  1 |  HASH JOIN            |      |      1 |   2020K| 31686   (1)|    200M|00:00:29.65 |     200K|    189K|  13702 |  6223K|  1897K| 6659K (0)|         |
|   2 |   MERGE JOIN CARTESIAN|      |      1 |  20202 | 18221   (1)|  40000 |00:00:27.35 |   90967 |  98524 |  13702 |       |       |          |         |
|*  3 |    HASH JOIN SEMI     |      |      1 |  10101 | 13397   (1)|  20000 |00:00:27.31 |   90964 |  98522 |  13702 |   126M|  7603K|   83  (1)|     114K|
|   4 |     TABLE ACCESS FULL | T1   |      1 |   1000K| 10967   (1)|   1000K|00:00:17.00 |   90961 |  90956 |      0 |       |       |          |         |
|   5 |     TABLE ACCESS FULL | T3   |      1 |      2 |     2   (0)|      2 |00:00:00.16 |       3 |      2 |      0 |       |       |          |         |
|   6 |    BUFFER SORT        |      |  20000 |      2 | 18219   (1)|  40000 |00:00:00.04 |       3 |      2 |      0 |  2048 |  2048 | 2048  (0)|         |
|   7 |     SORT UNIQUE       |      |      1 |      2 |     0   (0)|      2 |00:00:00.04 |       3 |      2 |      0 |  2048 |  2048 | 2048  (0)|         |
|   8 |      TABLE ACCESS FULL| T4   |      1 |      2 |     0   (0)|      2 |00:00:00.04 |       3 |      2 |      0 |       |       |          |         |
|   9 |   TABLE ACCESS FULL   | T2   |      1 |   1000K| 10967   (1)|   1000K|00:00:01.02 |     109K|  90956 |      0 |       |       |          |         |
------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T2"."C2"="C1")
   3 - access("T1"."C2"="C1")

SQL_ID  91wazrzfau2wq, child number 0
-------------------------------------
SELECT /*+ LEADING(T3, T4) */   T1.C1,   T2.C1,   T1.C3,1,10 T1_C3,
T2.C3 T2_C3 FROM   T1,   T2 WHERE   T1.C2=T2.C2   AND T1.C2 IN (
SELECT       C1     FROM       T3)   AND T2.C2 IN (     SELECT       C1
    FROM       T4)

Plan hash value: 331368001

----------------------------------------------------------------------------------------
| Id  | Operation              | Name | E-Rows | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |        | 24447 (100)|       |       |          |
|*  1 |  HASH JOIN             |      |   2000K| 24447   (2)|  3361K|   944K|          |
|*  2 |   HASH JOIN            |      |  20000 | 10984   (2)|   878K|   878K|          |
|   3 |    MERGE JOIN CARTESIAN|      |      2 |     5  (20)|       |       |          |
|   4 |     SORT UNIQUE        |      |      2 |     2   (0)| 73728 | 73728 |          |
|   5 |      TABLE ACCESS FULL | T3   |      2 |     2   (0)|       |       |          |
|   6 |     BUFFER SORT        |      |      2 |            | 73728 | 73728 |          |
|   7 |      SORT UNIQUE       |      |      2 |     2   (0)| 73728 | 73728 |          |
|   8 |       TABLE ACCESS FULL| T4   |      2 |     2   (0)|       |       |          |
|   9 |    TABLE ACCESS FULL   | T2   |   1000K| 10967   (1)|       |       |          |
|  10 |   TABLE ACCESS FULL    | T1   |   1000K| 10967   (1)|       |       |          |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."C2"="T2"."C2" AND "T1"."C2"="C1")
   2 - access("T2"."C2"="C1")

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level

I will repeat – not all Cartesian joins are bad for performance.  But, of course, it is not always obvious which Cartesian joins are good, which are bad, and which are just OK without testing.





Network Monitoring Experimentations 6

14 05 2010

May 14, 2010

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

The last Network Monitoring Experimentations article encountered an unexpected snag – a SQL*Net optimization decreased the average number of bytes needed to transmit each row from about 889 bytes to roughly 86 bytes.  While the SQL*Net optimization is great from a performance standpoint (see the summary at the end of this article), it caused problems for some of the intended analysis in the previous article.  So, for a proper comparison, we will slightly change the inserted data when creating table T6 so that column C2 stores unique values (two random letters followed by 98 letter A characters):

CREATE TABLE T6 (
  C1 VARCHAR2(10),
  C2 VARCHAR2(100),
  PRIMARY KEY (C1));

INSERT INTO T6 NOLOGGING
SELECT
  'A'||TO_CHAR(ROWNUM,'0000000'),
  RPAD(DBMS_RANDOM.STRING('A',2),100,'A')
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T6',CASCADE=>TRUE)

We will take the script used in the previous Network Monitoring blog article and modify it to point to the new table T6, and then append the original script onto the end of the modified script.  The modified portion of the script that accesses table T6 looks like this:

SET TIMING ON

SET AUTOTRACE TRACEONLY STATISTICS

SET ARRAYSIZE 10

SELECT /*+ USE_NL(T61 T62 T63 T64 T65 T66 T67 T68) */
  T61.C1,
  T61.C2,
  T62.C1,
  T62.C2,
  T63.C1,
  T63.C2,
  T64.C1,
  T64.C2,
  T65.C1,
  T65.C2,
  T66.C1,
  T66.C2,
  T67.C1,
  T67.C2,
  T68.C1,
  T68.C2
FROM
  T6 T61,
  T6 T62,
  T6 T63,
  T6 T64,
  T6 T65,
  T6 T66,
  T6 T67,
  T6 T68
WHERE
  T61.C1=T62.C1
  AND T61.C1=T63.C1
  AND T61.C1=T64.C1
  AND T61.C1=T65.C1
  AND T61.C1=T66.C1
  AND T61.C1=T67.C1
  AND T61.C1=T68.C1
  AND T61.C1 BETWEEN 'A 0001000' AND 'A 1000000';

host sleep 5

SET ARRAYSIZE 50

SELECT /*+ USE_NL(T61 T62 T63 T64 T65 T66 T67 T68) */
  T61.C1,
  T61.C2,
  T62.C1,
  T62.C2,
  T63.C1,
  T63.C2,
  T64.C1,
  T64.C2,
  T65.C1,
  T65.C2,
  T66.C1,
  T66.C2,
  T67.C1,
  T67.C2,
  T68.C1,
  T68.C2
FROM
  T6 T61,
  T6 T62,
  T6 T63,
  T6 T64,
  T6 T65,
  T6 T66,
  T6 T67,
  T6 T68
WHERE
  T61.C1=T62.C1
  AND T61.C1=T63.C1
  AND T61.C1=T64.C1
  AND T61.C1=T65.C1
  AND T61.C1=T66.C1
  AND T61.C1=T67.C1
  AND T61.C1=T68.C1
  AND T61.C1 BETWEEN 'A 0001000' AND 'A 1000000';

host sleep 5

SET ARRAYSIZE 100

SELECT /*+ USE_NL(T61 T62 T63 T64 T65 T66 T67 T68) */
  T61.C1,
  T61.C2,
  T62.C1,
  T62.C2,
  T63.C1,
  T63.C2,
  T64.C1,
  T64.C2,
  T65.C1,
  T65.C2,
  T66.C1,
  T66.C2,
  T67.C1,
  T67.C2,
  T68.C1,
  T68.C2
FROM
  T6 T61,
  T6 T62,
  T6 T63,
  T6 T64,
  T6 T65,
  T6 T66,
  T6 T67,
  T6 T68
WHERE
  T61.C1=T62.C1
  AND T61.C1=T63.C1
  AND T61.C1=T64.C1
  AND T61.C1=T65.C1
  AND T61.C1=T66.C1
  AND T61.C1=T67.C1
  AND T61.C1=T68.C1
  AND T61.C1 BETWEEN 'A 0001000' AND 'A 1000000';

host sleep 5

SET ARRAYSIZE 500

SELECT /*+ USE_NL(T61 T62 T63 T64 T65 T66 T67 T68) */
  T61.C1,
  T61.C2,
  T62.C1,
  T62.C2,
  T63.C1,
  T63.C2,
  T64.C1,
  T64.C2,
  T65.C1,
  T65.C2,
  T66.C1,
  T66.C2,
  T67.C1,
  T67.C2,
  T68.C1,
  T68.C2
FROM
  T6 T61,
  T6 T62,
  T6 T63,
  T6 T64,
  T6 T65,
  T6 T66,
  T6 T67,
  T6 T68
WHERE
  T61.C1=T62.C1
  AND T61.C1=T63.C1
  AND T61.C1=T64.C1
  AND T61.C1=T65.C1
  AND T61.C1=T66.C1
  AND T61.C1=T67.C1
  AND T61.C1=T68.C1
  AND T61.C1 BETWEEN 'A 0001000' AND 'A 1000000';

host sleep 5

SET ARRAYSIZE 1000

SELECT /*+ USE_NL(T61 T62 T63 T64 T65 T66 T67 T68) */
  T61.C1,
  T61.C2,
  T62.C1,
  T62.C2,
  T63.C1,
  T63.C2,
  T64.C1,
  T64.C2,
  T65.C1,
  T65.C2,
  T66.C1,
  T66.C2,
  T67.C1,
  T67.C2,
  T68.C1,
  T68.C2
FROM
  T6 T61,
  T6 T62,
  T6 T63,
  T6 T64,
  T6 T65,
  T6 T66,
  T6 T67,
  T6 T68
WHERE
  T61.C1=T62.C1
  AND T61.C1=T63.C1
  AND T61.C1=T64.C1
  AND T61.C1=T65.C1
  AND T61.C1=T66.C1
  AND T61.C1=T67.C1
  AND T61.C1=T68.C1
  AND T61.C1 BETWEEN 'A 0001000' AND 'A 1000000';

host sleep 5

SET ARRAYSIZE 5000

SELECT /*+ USE_NL(T61 T62 T63 T64 T65 T66 T67 T68) */
  T61.C1,
  T61.C2,
  T62.C1,
  T62.C2,
  T63.C1,
  T63.C2,
  T64.C1,
  T64.C2,
  T65.C1,
  T65.C2,
  T66.C1,
  T66.C2,
  T67.C1,
  T67.C2,
  T68.C1,
  T68.C2
FROM
  T6 T61,
  T6 T62,
  T6 T63,
  T6 T64,
  T6 T65,
  T6 T66,
  T6 T67,
  T6 T68
WHERE
  T61.C1=T62.C1
  AND T61.C1=T63.C1
  AND T61.C1=T64.C1
  AND T61.C1=T65.C1
  AND T61.C1=T66.C1
  AND T61.C1=T67.C1
  AND T61.C1=T68.C1
  AND T61.C1 BETWEEN 'A 0001000' AND 'A 1000000';

Just to keep things interesting, the laptop used in the previous article is now connected through an 802.11N wireless connection (gigabit uplink to the server), rather than an 802.11G wireless connection.  So, what is the output of the new combined script?  Once again, the server is running Oracle Database 11.2.0.1 and the client is a dual core 2.0GHz Core 2 Duo laptop (the client CPU may be a critical factor).

Table T6:

SQL> SET ARRAYSIZE 10

Elapsed: 00:04:49.97

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  896904186  bytes sent via SQL*Net to client                                  
    1099259  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50

Elapsed: 00:02:48.73

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35563  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  889871426  bytes sent via SQL*Net to client                                  
     220139  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:02:43.94

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888992556  bytes sent via SQL*Net to client                                  
     110249  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:02:34.81

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888291260  bytes sent via SQL*Net to client                                  
      22337  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:02:33.64

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888205848  bytes sent via SQL*Net to client                                  
      11348  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:02:32.59

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888153854  bytes sent via SQL*Net to client                                  
       2548  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

At an array fetch size of 10 the client is receiving about 855.35MB, and at an array fetch size of 5,000 the client is receiving about 847.01MB.

Let’s contrast the above with the test for table T5.

Table T5:

SQL> SET ARRAYSIZE 10

Elapsed: 00:02:01.55

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   94508146  bytes sent via SQL*Net to client                                  
    1099259  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50

Elapsed: 00:00:56.37

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35564  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   87635026  bytes sent via SQL*Net to client                                  
     220139  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:00:40.77

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86775886  bytes sent via SQL*Net to client                                  
     110249  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:00:27.31

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86088574  bytes sent via SQL*Net to client                                  
      22337  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:00:27.03

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86002660  bytes sent via SQL*Net to client                                  
      11348  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:00:24.64

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   85933880  bytes sent via SQL*Net to client                                  
       2548  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

At an array fetch size of 10, for table T5 the client is receiving about 90.13MB (855.35MB for table T6), and at an array fetch size of 5,000 the client is receiving about 81.95MB (847.01MB for table T6).  Incidentally, the client’s CPU utilization fluctuated between 50% and 99% utilization, indicating that the client’s dual core CPU may have played a limiting factor in the transfer speed.

—-

You must be wondering what kind of an impact the 802.11N wireless had on the overall transfer time.  During several of the transfers the Windows Vista Resource Meter indicated that the transfer speed was between 45Mb/s and 48Mb/s with the maximum network transfer capacity indicated as 33% utilized (this upper limit of maximum network transfer capacity did not change when connected to an 802.11G network).

The following execution statistics show the output when the same laptop was connected to the network on a 100Mb/s connection (the maximum supported by the laptop):

Table T6:

SQL> SET ARRAYSIZE 10

Elapsed: 00:02:09.54

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  896904186  bytes sent via SQL*Net to client                                  
    1099260  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50

Elapsed: 00:01:33.00

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35563  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  889871426  bytes sent via SQL*Net to client                                  
     220140  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:01:29.23

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888992556  bytes sent via SQL*Net to client                                  
     110250  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:01:25.08

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888291260  bytes sent via SQL*Net to client                                  
      22338  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:01:24.84

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888205848  bytes sent via SQL*Net to client                                  
      11349  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:01:24.83

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888153854  bytes sent via SQL*Net to client                                  
       2549  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

Table T5:

SQL> SET ARRAYSIZE 10

Elapsed: 00:00:56.28

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   94508146  bytes sent via SQL*Net to client                                  
    1099260  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50

Elapsed: 00:00:28.36

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35564  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   87635026  bytes sent via SQL*Net to client                                  
     220140  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:00:24.19

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86775886  bytes sent via SQL*Net to client                                  
     110250  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:00:18.37

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86088574  bytes sent via SQL*Net to client                                  
      22338  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:00:17.95

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86002660  bytes sent via SQL*Net to client                                  
      11349  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:00:17.67

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   85933880  bytes sent via SQL*Net to client                                  
       2549  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

The above output indicates that the lower latency 100Mb/s wired connection did decrease the transfer time so that the transfer of result set from table T6 (no SQL*Net compression) – the array fetch setting of 10 required roughly the same amount of time as the transfer of the result set from table T5 (with SQL*Net compression) when the laptop was connected wirelessly.  The CPU utilization of the client also dropped slightly – possibly because the laptop no longer needed to perform the WPA2 encryption/decryption of the network packets (the wireless network is protected by WPA2 encryption). 

—-

But what about a gigabit connection – would that help?  Unfortunately, the laptop does not support gigabit network speeds.  With the wired connection topping out at about 83Mb/s to 87Mb/s and the CPU utilization running as high as it was, could we actually achieve a faster transfer speed?  To find out, we will switch to a slower desktop computer that is running a 3.8GHz Pentium 4 CPU with hyperthreading enabled.  This client, like the server, has a gigabit network card and probably supports CPU offloading of the TCP/IP checksums.  So, how well will this second client perform with our test script?

Table T6:

SQL> SET ARRAYSIZE 10

Elapsed: 00:00:56.29

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  896904186  bytes sent via SQL*Net to client                                  
    1099260  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50 

Elapsed: 00:00:30.12

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35563  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  889871426  bytes sent via SQL*Net to client                                  
     220140  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:00:27.75

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888992556  bytes sent via SQL*Net to client                                  
     110250  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:00:26.21

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888291260  bytes sent via SQL*Net to client                                  
      22338  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:00:25.87

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888205848  bytes sent via SQL*Net to client                                  
      11349  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:00:26.71

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888153854  bytes sent via SQL*Net to client                                  
       2549  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

The above showed that the gigabit connection helped significantly.  The gigabit performance results for table T5 follow.

Table T5:

SQL> SET ARRAYSIZE 10

Elapsed: 00:00:47.17

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   94508146  bytes sent via SQL*Net to client                                  
    1099260  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---

SQL> SET ARRAYSIZE 50

Elapsed: 00:00:19.45

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35564  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   87635026  bytes sent via SQL*Net to client                                  
     220140  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:00:16.28

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86775886  bytes sent via SQL*Net to client                                  
     110250  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:00:12.98

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86088574  bytes sent via SQL*Net to client                                  
      22338  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:00:12.68

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86002660  bytes sent via SQL*Net to client                                  
      11349  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:00:13.29

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   85933880  bytes sent via SQL*Net to client                                  
       2549  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

The above shows that the older client computer with the gigabit network card continued to decrease the transfer time up to the array fetch size of 1,000, and then the time increased slightly with the jump to the array fetch size of 5,000.

Just how fast was the transfer speed on the slower client computer?  I monitored the performance using a program called NetPerSec which was developed by PC Magazine several years ago.  Keep an eye on the right side of the graphs for the current transfer speeds.

ARRAYSIZE 10 – roughly 135.3Mb/s receive, 3.4Mb/s send:

ARRAYSIZE 100 – roughly 275.0Mb/s receive, 5.8Mb/s send:

ARRAYSIZE 500 – roughly 295.6Mb/s receive, 5.8Mb/s send:

ARRAYSIZE 1000 – roughly 295.9Mb/s receive, 5.8Mb/s send:

ARRAYSIZE 5000 – roughly 294.9Mb/s receive, 5.7Mb/s send:

Will it make any difference if we completely eliminate the network – running the SQL statement directly on the server?

Table T6:

SQL> SET ARRAYSIZE 10

Elapsed: 00:00:21.73

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  906694810  bytes sent via SQL*Net to client                                  
    1099420  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50

Elapsed: 00:00:14.92

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35563  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  891829890  bytes sent via SQL*Net to client                                  
     220300  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:00:14.25

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  889972000  bytes sent via SQL*Net to client                                  
     110410  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:00:13.47

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888487488  bytes sent via SQL*Net to client                                  
      22498  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:00:13.43

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888304174  bytes sent via SQL*Net to client                                  
      11509  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:00:13.46

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  888173808  bytes sent via SQL*Net to client                                  
       2709  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

Table T5:

SQL> SET ARRAYSIZE 10

Elapsed: 00:00:15.60

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
     112906  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
  104198871  bytes sent via SQL*Net to client                                  
    1099420  bytes received via SQL*Net from client                            
      99902  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 50

Elapsed: 00:00:09.40

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      35564  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   89573511  bytes sent via SQL*Net to client                                  
     220300  bytes received via SQL*Net from client                            
      19982  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 100

Elapsed: 00:00:10.06

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      25896  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   87745341  bytes sent via SQL*Net to client                                  
     110410  bytes received via SQL*Net from client                            
       9992  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 500

Elapsed: 00:00:07.14

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      18162  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86282805  bytes sent via SQL*Net to client                                  
      22498  bytes received via SQL*Net from client                            
       2000  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 1000

Elapsed: 00:00:06.77

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      17195  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   86099988  bytes sent via SQL*Net to client                                  
      11509  bytes received via SQL*Net from client                            
       1001  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    
---
SQL> SET ARRAYSIZE 5000

Elapsed: 00:00:06.66

Statistics
----------------------------------------------------------                     
          0  recursive calls                                                   
          0  db block gets                                                     
      16421  consistent gets                                                   
          0  physical reads                                                    
          0  redo size                                                         
   85953635  bytes sent via SQL*Net to client                                  
       2709  bytes received via SQL*Net from client                            
        201  SQL*Net roundtrips to/from client                                 
          0  sorts (memory)                                                    
          0  sorts (disk)                                                      
     999001  rows processed                                                    

Running the test directly on the server had a significant impact on the execution time.

Summarizing the performance results:

802.11G – Laptop and wireless access point set to 802.11G (54Mb/s)
802.11N – Laptop and wireless access point set to 802.11N speed
Netbook N – Netbook with Atom N450 CPU and 802.11N speed
100Mb/s – Laptop with wired 100Mb/s network connection
1000Mb/s – Desktop with wired gigabit network connection
On Server – Executed directly on the server’s console (no network)

The Windows Vista client has a built in Resource Monitor that helps when it is necessary to associate CPU usage, disk performance, network transfer speed, and memory utilization.  The following screen capture was generated during the 802.11G transfer test with the array fetch size set to 500:

The Windows 7 client also has a built in Resource Monitor that helps when it is necessary to associate CPU usage, disk performance, network transfer speed, and memory utilization.  The following screen capture was generated during the Netbook N transfer test with the array fetch size set to 500 – notice the CPU usage for the System Interrupts is shown as consuming 30% of the available CPU time.  However, because the netbook has a single core hyperthreaded CPU, the actual utilization for System Interrupts might actually be twice as high (60%) due to hyperthreading simulating the existence of a second CPU:

While the above tests may have limited value for your specific project, the tests do show the potential impact of SQL*Net compression, the performance impact of different network communication links, and the potential impact the network communication has on the client’s CPU utilization.








Follow

Get every new post delivered to your Inbox.

Join 144 other followers