January 7, 2011
In a recent OTN thread a developer described their difficulty in working with ANSI SQL in Oracle Database 8i (quick show of hands, how many people know why?). The OP eventually asked:
“I don’t suppose there are any normal-query-sytax-to-freakishly-old-syntax converters out there?”
And John Spencer jokingly replied back:
“Personally, I’m looking for a freakishly-new-syntax-to-normal-query-sytax converter.”
l thought that this might be an interesting challenge. A developer is struggling to adjust to Oracle Database, having come from a Microsoft SQL Server development background, and he is looking for a translator that will convert his ANSI SQL to SQL that will work with Oracle Database 8i. My reply indicated that Oracle provides a free ANSI SQL to Oracle specific SQL translator and tutoring tool in the free Oracle XE. A free translator and tutor? Sure, below is an example using Oracle Database 11.2.0.1. First, let’s create a couple of tables and collect the statistics for those tables:
CREATE TABLE T1 AS SELECT ROWNUM C1, LPAD('A',100,'A') PADDING FROM DUAL CONNECT BY LEVEL<=100; CREATE TABLE T2 AS SELECT ROWNUM C0, CEIL(ROWNUM/20) C1, LPAD('A',100,'A') PADDING FROM DUAL CONNECT BY LEVEL<=100*20; CREATE TABLE T3 AS SELECT ROWNUM C0, TRUNC((MOD(ROWNUM-1,80)+1)*1.2) C1, LPAD('A',100,'A') PADDING FROM DUAL CONNECT BY LEVEL<=1000; ALTER TABLE T1 MODIFY (C1 NOT NULL); ALTER TABLE T2 MODIFY (C1 NOT NULL); ALTER TABLE T3 MODIFY (C1 NOT NULL); EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1') EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2') EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3')
We now have three tables with a NOT NULL constraint on a column. Let’s assume that the following two ANSI SQL statements are crafted to access the three tables:
SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 IN ( SELECT T3.C1 FROM T3); SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 NOT IN ( SELECT T3.C1*2 FROM T3);
Note that in the above, one of the SQL statements contains an IN clause, and the other a NOT IN clause. That IN clause could also be written as an EXISTS clause (because of the NOT NULL constraint on column C1), or as an inline view. Oracle Database 8i will NOT automatically transform an IN clause into an EXISTS clause, even when it is much more efficient to execute the SQL statement with the EXISTS clause (more recent release versions of Oracle Database might perform the transformation). So let’s try an experiment (if you are running Oracle Database 8i, will have to manually convert the above ANSI joins into Oracle specific syntax to try this on Oracle Database 8i):
ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM'; ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1'; SET AUTOTRACE TRACEONLY STATISTICS SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 IN ( SELECT T3.C1 FROM T3); ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM2'; SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 NOT IN ( SELECT T3.C1*2 FROM T3);
Now take a look inside the 10053 trace files, you might find equivalent, transformed versions of the SQL statements like these (reformatted to add extra whitespace) in the WATCH_TRANSFORM trace file (note that in the thread Yasu mentioned that he received a different final transformation when connected as the SYS user than he did when connected as a normal user):
SELECT "T1"."C1" "QCSJ_C000000000300000", "T1"."PADDING" "QCSJ_C000000000300002", "from$_subquery$_005"."C1_1" "QCSJ_C000000000300001", "from$_subquery$_005"."PADDING_0" "QCSJ_C000000000300003" FROM "TESTUSER"."T1" "T1", LATERAL( (SELECT "T2"."PADDING" "PADDING_0", "T2"."C1" "C1_1" FROM "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005" SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"=ANY ( SELECT "T3"."C1" "C1" FROM "TESTUSER"."T3" "T3") AND "T1"."C1"="T2"."C1"(+) SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T3" "T3", "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"(+) AND "T1"."C1"="T3"."C1"
If you are looking at the above, trying to learn alternate SQL syntax that may be more efficient than what you have used in the past, you should note that the last of the above SQL statements is technically not logically equivalent to the submitted SQL statement due to the potential for duplicate values in column C1 in table T3 (also note that the first SQL statement is missing the IN clause). However, the optimizer in 11.2.0.1 understands how to handle this join as a semi-join, and will stop the join to the second table when the first matching row is found.
In the WATCH_TRANSFORM2 trace file, we see the NOT IN version of the SQL statement:
SELECT "T1"."C1" "QCSJ_C000000000300000", "T1"."PADDING" "QCSJ_C000000000300002", "from$_subquery$_005"."C1_1" "QCSJ_C000000000300001", "from$_subquery$_005"."PADDING_0" "QCSJ_C000000000300003" FROM "TESTUSER"."T1" "T1", LATERAL( (SELECT "T2"."PADDING" "PADDING_0", "T2"."C1" "C1_1" FROM "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005" SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"<>ALL ( SELECT "T3"."C1"*2 "T3.C1*2" FROM "TESTUSER"."T3" "T3") AND "T1"."C1"="T2"."C1"(+) SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T3" "T3", "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T3"."C1"*2 AND "T1"."C1"="T2"."C1"(+)
If you are looking at the above, trying to learn alternate SQL syntax that may be more efficient than what you have used in the past, you should note that the last of the above SQL statements is technically not logically equivalent to the submitted SQL statement due to the potential for duplicate values in column C1 in table T3. However, the optimizer in 11.2.0.1 understands how to handle this join as a anti-join, and will stop the join to the second table when the first matching row is found.
Let’s try again with a change to the OPTIMIZER_FEATURES_ENABLE parameter:
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='8.1.7'; ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM3'; SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 IN ( SELECT T3.C1 FROM T3); ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM4'; SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 NOT IN ( SELECT T3.C1*2 FROM T3);
Now you might see something like this in the WATCH_TRANSFORM3 trace file (the IN syntax):
SELECT "T1"."C1" "QCSJ_C000000000300000", "T1"."PADDING" "QCSJ_C000000000300002", "T2"."C1" "C1", "T2"."PADDING" "PADDING" FROM "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"(+) SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"=ANY ( SELECT "T3"."C1" "C1" FROM "TESTUSER"."T3" "T3") AND "T1"."C1"="T2"."C1"(+) SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM (SELECT DISTINCT "T3"."C1" "C1" FROM "TESTUSER"."T3" "T3") "VW_NSO_1", "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"="VW_NSO_1"."C1" AND "T1"."C1"="T2"."C1"(+)
Notice in the above that final version of the SQL statement shows the IN clause transformed into an inline view.
In the WATCH_TRANSFORM4 trace file you might see something like this (the NOT IN syntax):
SELECT "T1"."C1" "QCSJ_C000000000300000", "T1"."PADDING" "QCSJ_C000000000300002", "T2"."C1" "C1", "T2"."PADDING" "PADDING" FROM "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"(+) SELECT "T1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T1" "T1", "TESTUSER"."T2" "T2" WHERE "T1"."C1"<>ALL ( SELECT "T3"."C1"*2 "T3.C1*2" FROM "TESTUSER"."T3" "T3") AND "T1"."C1"="T2"."C1"(+) SELECT "SYS_ALIAS_1"."PADDING" "T1_PADDING", "T2"."PADDING" "T2_PADDING" FROM "TESTUSER"."T1" "SYS_ALIAS_1", "TESTUSER"."T2" "T2" WHERE NOT EXISTS ( SELECT 0 FROM "TESTUSER"."T3" "T3" WHERE "T3"."C1"*2="SYS_ALIAS_1"."C1") AND "SYS_ALIAS_1"."C1"="T2"."C1"(+)
Notice in the above, the odd last transformation. Oracle Database 8i is not supposed to be able to transform a SQL statement using a NOT IN clause into a SQL statement using a NOT EXISTS clause, even though it might be more efficient.
Just for fun, let’s see what happens when we add a NO_QUERY_TRANSFORMATION hint after the SELECT in the SQL statements while leaving the modified OPTIMIZER_FEATURES_ENABLE parameter set:
ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM5'; SELECT /*+ NO_QUERY_TRANSFORMATION */ T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 IN ( SELECT T3.C1 FROM T3); ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM6'; SELECT /*+ NO_QUERY_TRANSFORMATION */ T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1 NOT IN ( SELECT T3.C1*2 FROM T3);
The WATCH_TRANSFORM5 trace file (IN syntax) might contain SQL statements that look something like this:
SELECT "T1"."C1" "QCSJ_C000000000300000", "T1"."PADDING" "QCSJ_C000000000300002", "from$_subquery$_005"."C1_1" "C1", "from$_subquery$_005"."PADDING_0" "PADDING" FROM "TESTUSER"."T1" "T1", LATERAL( (SELECT "T2"."PADDING" "PADDING_0", "T2"."C1" "C1_1" FROM "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005" SELECT "SYS_ALIAS_1"."QCSJ_C000000000300002_1" "T1_PADDING", "SYS_ALIAS_1"."PADDING_3" "T2_PADDING" FROM (SELECT "T1"."C1" "QCSJ_C000000000300000_0", "T1"."PADDING" "QCSJ_C000000000300002_1", "from$_subquery$_005"."C1_1" "C1", "from$_subquery$_005"."PADDING_0" "PADDING_3" FROM "TESTUSER"."T1" "T1", LATERAL( (SELECT "T2"."PADDING" "PADDING_0", "T2"."C1" "C1_1" FROM "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005") "SYS_ALIAS_1" WHERE EXISTS ( SELECT 0 FROM "TESTUSER"."T3" "T3" WHERE "T3"."C1"="SYS_ALIAS_1"."QCSJ_C000000000300000_0")
The last of the above SQL statements might be manually cleaned up to look like this:
SELECT T1.PADDING T1_PADDING, T2.PADDING T2_PADDING FROM T1, T2 WHERE T1.C1=T2.C1(+) AND EXISTS ( SELECT 0 FROM T3 WHERE T3.C1=T1.C1);
Notice in the above, that again the optimizer has transformed the IN syntax into EXISTS syntax, even though Oracle Database 8i did not support that automatic transformation (and we did specifically request no query transformations through the use of the hint).
The WATCH_TRANSFORM6 trace file (NOT IN syntax) might contain SQL statements that look something like this:
SELECT "T1"."C1" "QCSJ_C000000000300000", "T1"."PADDING" "QCSJ_C000000000300002", "from$_subquery$_005"."C1_1" "C1", "from$_subquery$_005"."PADDING_0" "PADDING" FROM "TESTUSER"."T1" "T1", LATERAL( (SELECT "T2"."PADDING" "PADDING_0", "T2"."C1" "C1_1" FROM "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005" SELECT "SYS_ALIAS_1"."QCSJ_C000000000300002_1" "T1_PADDING", "SYS_ALIAS_1"."PADDING_3" "T2_PADDING" FROM (SELECT "T1"."C1" "QCSJ_C000000000300000_0", "T1"."PADDING" "QCSJ_C000000000300002_1", "from$_subquery$_005"."C1_1" "C1", "from$_subquery$_005"."PADDING_0" "PADDING_3" FROM "TESTUSER"."T1" "T1", LATERAL( (SELECT "T2"."PADDING" "PADDING_0", "T2"."C1" "C1_1" FROM "TESTUSER"."T2" "T2" WHERE "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005") "SYS_ALIAS_1" WHERE NOT EXISTS ( SELECT 0 FROM "TESTUSER"."T3" "T3" WHERE "T3"."C1"*2="SYS_ALIAS_1"."QCSJ_C000000000300000_0")
Another NOT IN to NOT EXISTS transformation.
—-
In all of the above cases, the first transformed SQL statement printed in the 10053 trace file excluded the IN and the NOT IN clauses, but did include the ANSI to Oracle specific join syntax conversion.
It is nice to know that Oracle Corp. throws in something free with Oracle Database: a free ANSI SQL to Oracle Specific SQL translator, and a free SQL tutor that teaches rewriting SQL statements into alternate SQL syntax that just might be more efficient than the original SQL statement (you might be able to use the more efficient version with older Oracle Database releases, as was the case for the NOT IN to TO EXISTS transformation).
Nice idea! Now if someone would just write us a little external table definition so we could query it easily instead of having to dig through the trace file that would be awesome.
Kerry
Kerry,
Thank you for the compliment. I wonder if an external table dfinition is possible? The format of the 10053 trace file seems to have changed quite a bit between releases, so it might be challenging to even write a utility to extract the SQL statements, but it might be a fun exercise.
—
In this article I mentioned that someone in the OTN thread by the name of “Yasu” noticed a different final execution plan when the SQL statements were executed as the SYS user. I thought that it might be interesting to test Yasu’s statement, so here are the SQL statements found in the first trace file ‘WATCH_TRANSFORM’ from a 11.2.0.2 database for the SYS user and a normal user:
WATCH_TRANSFORM for SYS user:
Final query for SYS:
—————————————-
WATCH_TRANSFORM for a normal user:
Final query for SYS:
If the output is examined closely, the third SQL statement listed for the two users is very different. This little test might explain why the SYS user could see different performance characteristics than would a normal database user.
What is the meaning of “QCSJ_C000000000300002” ?? I did a join using Oracle server / SQL, But it replaced one of my columns titled “EMPLID” with a column titled “QCSJ_C000000000300002”. I have no idea why! Thanks for any help…
Kalin,
It has been several years since I worked with this type of information extensively. A Google search also does not seem to provide an answer, and a scan through my copy of “Cost-Based Oracle Fundamentals” also did not find a hint of a suggestion.
This article utilizes Oracle’s cost based query optimizer to generate the Oracle specific syntax. As such, I am thinking that the columns were named as part of the semi-join optimization step, which suggests that the QCSJ prefix is short for “query column semi-join”. The optimizer had to have a valid and unique name for the column that was generated in the transformation, so it created one with the QCSJ prefix. Jonathan Lewis describes semi-joins (in brief: used for exists, not exists, in, not in blocks in the query if my notes are correct) in his “Cost-Based Oracle Fundamentals” book. You may read a few sections through Google books by using the following link:
https://books.google.com/books?id=RyhD1xfENXYC&printsec=frontcover&q=semi-join&f=false