Disabling AUTOTRACE Causes “SP2-0575: Use of Oracle SQL feature not in SQL92 Entry Level” on 11.2.0.1

2 10 2010

October 2, 2010

I encountered a fun problem today while performing a couple of tests with the AUTOTRACE feature in SQL*Plus.  I am using a 32 bit 11.2.0.1 Oracle client connected to a 64 bit 11.2.0.1 Oracle database.  It seems that 50% of the time when I issue “SET AUTOTRACE OFF” or “set autotrace off” I receive the SP2-0575 error and my session is disconnected from the database, although it seems to happen randomly when disabling AUTOTRACE.  Interesting problem I think.  Here is a partial output from the script:

SQL> ALTER SYSTEM SET OPTIMIZER_INDEX_COST_ADJ=10 SCOPE=MEMORY;

System altered.

SQL> ALTER SYSTEM FLUSH SHARED_POOL;

System altered.

SQL> HOST sleep 10
SQL>
SQL>SPOOL testresults11201CostAdj10.txt
SQL>
SQL> SHOW PARAMETER OPTIMIZER

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
optimizer_capture_sql_plan_baselines boolean     FALSE
optimizer_dynamic_sampling           integer     2
optimizer_features_enable            string      11.2.0.1
optimizer_index_caching              integer     0
optimizer_index_cost_adj             integer     10
optimizer_mode                       string      ALL_ROWS
optimizer_secure_view_merging        boolean     TRUE
optimizer_use_invisible_indexes      boolean     FALSE
optimizer_use_pending_statistics     boolean     FALSE
optimizer_use_sql_plan_baselines     boolean     TRUE
SQL> SET AUTOTRACE TRACEONLY EXPLAIN
SQL>
(several select statements similar to the following were executed, see yesterday's article for the complete list)
SQL> SELECT
  2    T1.COL1,
  3    T1.COL2,
  4    T1.COL3,
  5    T2.COL1,
  6    T2.COL2,
  7    T2.COL3
  8  FROM
  9    T1,
 10    T2
 11  WHERE
 12    T1.COL1=T2.COL1
 13    AND T1.COL1 BETWEEN 1 AND 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 1945774515

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     9 |  3816 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |       |       |            |          |
|   2 |   NESTED LOOPS                |              |     9 |  3816 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1           |    10 |  2120 |     1   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0018298 |    10 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN          | SYS_C0018300 |     1 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T2           |     1 |   212 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."COL1">=1 AND "T1"."COL1"<=10)
   5 - access("T1"."COL1"="T2"."COL1")
       filter("T2"."COL1"<=10 AND "T2"."COL1">=1)

SQL>
SQL> SPOOL OFF
SQL> set autotrace off
SP2-0575: Use of Oracle SQL feature not in SQL92 Entry Level.
SQL>
SQL> ALTER SYSTEM SET OPTIMIZER_INDEX_COST_ADJ=1 SCOPE=MEMORY;
ERROR:
ORA-03114: not connected to ORACLE

The problem is not specific to the modification of the OPTIMIZER_INDEX_COST_ADJ parameter, and the error does not always happen after changing a specific parameter. 

Metalink (MOS) Doc ID 563269.1 does not apply:

SQL> select owner, object_type from all_objects where object_name='DUAL';

OWNER      OBJECT_TYPE
---------- -------------------
SYS        TABLE
PUBLIC     SYNONYM

FIPS compliance checking (FLAGGER) (reference 2) should not apply:

SQL> show flagger
flagger OFF

As I mentioned, an interesting problem – yet a bit of a headache inducer when trying to use AUTOTRACE.





First Table is 550MB, Second Table is 26GB – Nested Loops or Full Table Scan?

1 10 2010

October 1, 2010

(Forward to the Next Post in the Series)

A couple of days ago I wrote a blog article about creating test case scripts to test ideas.  In a recent OTN thread the original poster asked about why his value for the DB_FILE_MULTIBLOCK_READ_COUNT parameter is not being honored.  The OP stated that his DB_FILE_MULTIBLOCK_READ_COUNT parameter was set to 128, yet when reviewing a 10046 trace at level 8 he was seeing that a maximum of 16 blocks were read from disk at a time while perfoming full table scans.  So, what might be limiting the multi-block reads to 16 blocks (I believe that one of my previous blog article mentioned the limiting factors)?  This would be a great use for a test case script, although with 26GB and 550MB tables you might want to consider reducing the size of the tables when building the test case script.

Interesting, however the unexpectedly low multi-block read size is not the subject of this blog article.  Someone mentioned in the OTN thread that when joining the 550MB table to the 26GB table, a nested loops join should be used because hashing could take a lot of time.  I was not expecting to see a comment like that, so I started to wonder, could it be true?  I wonder if one of my earlier blog articles touched on this idea?  Maybe I overlooked something in the documentation:

“The database uses hash joins to join large data sets. The optimizer uses the smaller of two tables or data sources to build a hash table on the join key in memory. It then scans the larger table, probing the hash table to find the joined rows.

This method is best when the smaller table fits in available memory. The cost is then limited to a single read pass over the data for the two tables.

11.3.4.1 When the Optimizer Uses Hash Joins
The optimizer uses a hash join to join two tables if they are joined using an equijoin and if either of the following conditions are true:

  • A large amount of data must be joined.
  • A large fraction of a small table must be joined.”

Does the situation change if there is no index on the join columns in either table?  Does the situation change if there is no index on any of the predicates in the WHERE clause?  Should Oracle Database repeatedly full scan the 550MB or the 26GB table while performing a nested loops join?  What is a large fraction of a small table, 75% of the rows, 50% of the rows, 10% of the rows, 1% of the rows?  Maybe the use of a nested loops join in such a case needs a bit more consideration.  Of course we could create a test case script to determine if this statement in the OTN thread is correct, although we may need to hint the join method in the SELECT statement to test the performance with both join methods.

The same person later suggested that if the first table is small (I guess that the 550MB table is the small one) and the driven table (I suspect that the 26GB table matches this description) contains an index on the join column, Oracle’s optimizer will favor nested loops joins over hash joins.  It might be interesting to construct a test case to see if this is true (Oracle Database 11.1.0.7 used for the test case results displayed below).  To save time I will create tables which are a bit smaller than the original 550MB and 26GB tables, which might throw off the test case a little in favor of nested loops joins, but I will keep an eye on the output for problems.  In my test case script, table T1 will be my “large” table and table T2 will be my “small” table.

CREATE TABLE T1 (
  COL1 NUMBER,
  COL2 NUMBER,
  COL3 VARCHAR2(200),
  PRIMARY KEY(COL1));

INSERT INTO
  T1
SELECT
  ROWNUM,
  10000000-ROWNUM,
  RPAD(TO_CHAR(ROWNUM),200,'A')
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000) V1,
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000) V2;

COMMIT;

ALTER TABLE T1 MODIFY(COL2 NOT NULL);

CREATE INDEX IND_T1_COL2 ON T1(COL2);

CREATE TABLE T2 (
  COL1 NUMBER,
  COL2 NUMBER,
  COL3 VARCHAR2(200),
  PRIMARY KEY(COL1));

INSERT INTO
  T2
SELECT
  ROWNUM,
  10000000-ROWNUM,
  RPAD(TO_CHAR(ROWNUM),200,'A')
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=100000);

COMMIT;

ALTER TABLE T2 MODIFY(COL2 NOT NULL);

CREATE INDEX IND_T2_COL2 ON T2(COL2);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)

We now have two simple tables with a column having an ascending number sequence, a column have a descending number sequence, and a 200 byte padding column to discourage full table scans.  Now let’s perform a quick test:

SET AUTOTRACE TRACEONLY EXPLAIN
SET LINESIZE 140
SET PAGESIZE 1000
SET TRIMSPOOL ON

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2,
  T2.COL3
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100K|    40M|       | 28696   (1)| 00:01:57 |
|*  1 |  HASH JOIN         |      |   100K|    40M|    21M| 28696   (1)| 00:01:57 |
|   2 |   TABLE ACCESS FULL| T2   |   100K|    20M|       |   379   (2)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| T1   |  1000K|   201M|       | 23406   (1)| 00:01:35 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."COL1"="T2"."COL1")

Two full table scans and a hash join.  If we retrieve all of the columns and all of the rows, the optimizer apparently will not favor nested loops joins.  Let’s try again with a slightly different query, this time using the column with the descending number sequence as the join column:

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2,
  T2.COL3
FROM
  T1,
  T2
WHERE
  T1.COL2=T2.COL2;

Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835

-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100K|    40M|       | 28696   (1)| 00:01:57 |
|*  1 |  HASH JOIN         |      |   100K|    40M|    21M| 28696   (1)| 00:01:57 |
|   2 |   TABLE ACCESS FULL| T2   |   100K|    20M|       |   379   (2)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| T1   |  1000K|   201M|       | 23406   (1)| 00:01:35 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."COL2"="T2"."COL2")

Still a hash join and two full table scans.  Again, the optimizer did not favor nested loops joins.  Let’s try another query that specifies a range of values for the column with the ascending values:

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2,
  T2.COL3
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL1 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1754270684

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |   999 |   411K|    72   (2)| 00:00:01 |
|*  1 |  HASH JOIN                   |              |   999 |   411K|    72   (2)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1           |  1000 |   206K|    37   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0017027 |  1000 |       |     4   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2           |  1000 |   206K|    34   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | SYS_C0017028 |  1000 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."COL1"="T2"."COL1")
   3 - access("T1"."COL1">=1 AND "T1"."COL1"<=1000)
   5 - access("T2"."COL1">=1 AND "T2"."COL1"<=1000)

Still a hash join, although the indexes on both tables were used to reduce the number of rows entering the hash join.  So, in this case when selecting 1% (1,000 / 100,000) of the rows from tables, the optimizer still selected to perform a hash join.  Let’s try another SQL statement, this time putting the restriction on the column with the descending values (note that this query, if executed, will not return any rows because the values in column T2.COL2 do not fall into the range between 1 to 1,000):

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2,
  T2.COL3
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL2 BETWEEN 1 AND 1000;

Execution Plan
----------------------------------------------------------
Plan hash value: 596667293

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     1 |   422 |     5   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |       |       |            |          |
|   2 |   NESTED LOOPS                |              |     1 |   422 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1           |     1 |   211 |     4   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IND_T1_COL2  |     1 |       |     3   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN          | SYS_C0017028 |     1 |       |     0   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T2           |     1 |   211 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."COL2">=1 AND "T1"."COL2"<=1000)
   5 - access("T1"."COL1"="T2"."COL1")

As you can probably tell from the above plan, the optimizer believes that 1 or fewer rows will be returned, so the optimizer changed from performing a single hash join to performing two nested loops joins.  Let’s try another query, this time we will not retrieve column T2.COL3 (the large padding column) and change the WHERE clause to pick up all COL1 values less than 10,000:

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL1 < 10000;

Execution Plan
----------------------------------------------------------
Plan hash value: 218283169

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |  9998 |  2167K|   666   (1)| 00:00:03 |
|*  1 |  HASH JOIN                   |                  |  9998 |  2167K|   666   (1)| 00:00:03 |
|*  2 |   VIEW                       | index$_join$_002 |  9999 |   107K|   323   (2)| 00:00:02 |
|*  3 |    HASH JOIN                 |                  |       |       |            |          |
|*  4 |     INDEX RANGE SCAN         | SYS_C0017028     |  9999 |   107K|    22  (10)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN     | IND_T2_COL2      |  9999 |   107K|   301   (1)| 00:00:02 |
|   6 |   TABLE ACCESS BY INDEX ROWID| T1               |  9999 |  2060K|   343   (1)| 00:00:02 |
|*  7 |    INDEX RANGE SCAN          | SYS_C0017027     |  9999 |       |    21   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."COL1"="T2"."COL1")
   2 - filter("T2"."COL1"<10000)
   3 - access(ROWID=ROWID)
   4 - access("T2"."COL1"<10000)
   7 - access("T1"."COL1"<10000)

The optimizer again did not select to perform nested loop joins, instead performing two hash joins.

Let’s put the restriction on column C2 and try again:

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL2 < 10000;

Execution Plan
----------------------------------------------------------
Plan hash value: 596667293

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     1 |   222 |     5   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |       |       |            |          |
|   2 |   NESTED LOOPS                |              |     1 |   222 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1           |     1 |   211 |     4   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IND_T1_COL2  |     1 |       |     3   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN          | SYS_C0017028 |     1 |       |     0   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T2           |     1 |    11 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."COL2"<10000)
   5 - access("T1"."COL1"="T2"."COL1")

The optimizer predicted that a single row would be returned, so it decided to use two nested loops joins rather than a hash join.  Let’s try again, but this time fix the SQL statement so that it will return rows if executed:

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL2 > (10000000-10000);

Execution Plan
----------------------------------------------------------
Plan hash value: 570481587

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |  9999 |  2167K|       |   804   (2)| 00:00:04 |
|*  1 |  HASH JOIN                   |             |  9999 |  2167K|  2184K|   804   (2)| 00:00:04 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1          |  9999 |  2060K|       |   331   (1)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | IND_T1_COL2 |  9999 |       |       |    26   (0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL          | T2          |   100K|  1074K|       |   379   (2)| 00:00:02 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."COL1"="T2"."COL1")
   3 - access("T1"."COL2">9990000)

Back to a hash join again, so nothing magic about having a column with a descending sequence of numbers causing the nested loops join.  Let’s try again, this time retrieving only 100 rows, rather than 1,000 or 9,999:

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2,
  T2.COL3
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL1 BETWEEN 1 AND 100;

Plan hash value: 1754270684

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |    99 | 41778 |    14   (8)| 00:00:01 |
|*  1 |  HASH JOIN                   |              |    99 | 41778 |    14   (8)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1           |   100 | 21100 |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0017027 |   100 |       |     3   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2           |   100 | 21100 |     6   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | SYS_C0017028 |   100 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."COL1"="T2"."COL1")
   3 - access("T1"."COL1">=1 AND "T1"."COL1"<=100)
   5 - access("T2"."COL1">=1 AND "T2"."COL1"<=100)

A hash join again.  What if we decrease the number of rows from 100 to 10?

SELECT
  T1.COL1,
  T1.COL2,
  T1.COL3,
  T2.COL1,
  T2.COL2,
  T2.COL3
FROM
  T1,
  T2
WHERE
  T1.COL1=T2.COL1
  AND T1.COL1 BETWEEN 1 AND 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 2528765105

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     9 |  3798 |     5   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |              |       |       |            |          |
|   2 |   NESTED LOOPS                |              |     9 |  3798 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1           |    10 |  2110 |     4   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SYS_C0017027 |    10 |       |     3   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN          | SYS_C0017028 |     1 |       |     0   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T2           |     1 |   211 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."COL1">=1 AND "T1"."COL1"<=10)
   5 - access("T1"."COL1"="T2"."COL1")
       filter("T2"."COL1"<=10 AND "T2"."COL1">=1)

Apparently, dropping the number of rows to 10 (0.01% of the rows in table T2, if my calculations are correct) was sufficient to make the optimizer switch from a hash join to two nested loops joins.

I will now ask the question, “If the first table is small and the driven table contains an index on the join column, will Oracle’s optimizer favor nested loops joins over hash joins?”  Could someone generate a test case that shows just the opposite of the results found in my test case?  Does the answer change if I were to repeat the test case script on Oracle Database 8.1.7.4 or 11.2.0.2?  What if instead of one 200 byte padding column there were fifty 10 byte padding columns, and the query only selected one of those padding columns from each table?  What if instead of one 200 byte padding column there was only a single 10 byte padding column?  Is it safe to make a generalization at this point?