December 4, 2009
Note: the Oracle cost-based query optimizer is able to automatically transform queries submitted by applications to produce an equivalent SQL statement that will hopefully improve SQL execution performance. The possible transformations are limited by the Oracle release version, the value of the OPTIMIZER_FEATURES_ENABLE parameter (and various other hidden parameters), the potential presence of NULL values, and hints embedded in the SQL statement.
Some time ago the following question appeared in a forum:
There are three tables A B C with 18 Million, 6 M and 8M records respectivly.select count(1) from ( Select * from A where not exists ( select 'x' from B where join on Keys ) and not exists ( select 'x' from C where join on keys ) )
The explain plan shows that it is using all key indexes for all tables and cost is 7366 for table A Index full scan. And other tables have a index Range scan 3 and 4. What could be a reason for this discrepency.
We are using oracle 9i , Linux operating system. For confidentiality I have renamed the tables
Query:select /*+ parallel */ * from A where not exists (select /*+ parallel */ 'X' from b where stud_id = A.stud_id and cpnt_typ_id = A.cpnt_typ_id and cpnt_id = A.cpnt_id and rev_dte = A.rev_dte and cmpl_stat_id = A.cmpl_stat_id and compl_dte = a.compl_dte) and not exists (select /*+ parallel */ 'X' from c where stud_id = A.stud_id and cpnt_typ_id =A.cpnt_typ_id and cpnt_id = A.cpnt_id and rev_dte = A.rev_dte and cmpl_stat_id = A.cmpl_stat_id and compl_dte = A.compl_dte)
SELECT STATEMENT, GOAL = CHOOSE Cost=7400 Cardinality= 1 Bytes= 62 IO cost=7400 SORT AGGREGATE Cardinality= 1 Bytes= 62 FILTER INDEX FAST FULL SCAN PK_A Cost=7393 Cardinality=46754 Bytes=2898748 IO cost=7393 INDEX RANGE SCAN IDX_B_1 Cost= 3 Cardinality= 1 Bytes= 58 IO cost= 3 TABLE ACCESS BY INDEX ROWID PA_C Cost= 4 Cardinality= 1 Bytes= 60 IO cost= 4 INDEX RANGE SCAN Object IX_C Cost= 3 Cardinality= 1 IO cost= 3
With the information provided, the best that I was able to offer is the following, with a manually transformed version of the SQL statement from the NOT EXISTS syntax into inline views:
The Oracle version in part determines how Oracle will optimize and execute the SQL statement that you submit. Certain transformations are performed automatically in 9i for NOT EXISTS queries, regardless of whether or not the cost will increase due to the transformation. The cardinality numbers reported in the explain plan (a DBMS Xplan would be better) do not appear to be consistent with three tables having 18 million, 6 million and 8 million rows. When collecting statistics, make certain that the DBMS_STATS routine is called with CASCADE=>TRUE specified.
Jonathan Lewis has commented on his blog a couple times about using PARALLEL, which only resulted in even longer execution times. You might try reformatting the SQL statement into an equivalent form, and removing the PARALLEL hints. Something like this:
SELECT A.* FROM A, (SELECT DISTINCT STUD_ID, CPNT_TYP_ID, CPNT_ID, REV_DTE, CMPL_STAT_ID, COMPL_DTE FROM B) B, (SELECT DISTINCT STUD_ID, CPNT_TYP_ID, CPNT_ID, REV_DTE, CMPL_STAT_ID, COMPL_DTE FROM C) C WHERE A.STUD_ID = B.STUD_ID(+) AND A.CPNT_TYP_ID = B.CPNT_TYP_ID(+) AND A.CPNT_ID = B.CPNT_ID(+) AND A.REV_DTE = B.REV_DTE(+) AND A.CMPL_STAT_ID = B.CMPL_STAT_ID(+) AND A.COMPL_DTE = B.COMPL_DTE(+) AND B.STUD_ID IS NULL AND A.STUD_ID = C.STUD_ID(+) AND A.CPNT_TYP_ID = C.CPNT_TYP_ID(+) AND A.CPNT_ID = C.CPNT_ID(+) AND A.REV_DTE = C.REV_DTE(+) AND A.CMPL_STAT_ID = C.CMPL_STAT_ID(+) AND A.COMPL_DTE = C.COMPL_DTE(+) AND C.STUD_ID IS NULL;
Compare the performance of the above with the performance of your original SQL statement using a 10046 trace at level 8. The above will likely result in 2 sorting operations in 9i, so the SORT_AREA_SIZE parameter may need to be adjusted.
Any other ideas?