Free ANSI SQL to Oracle Specific SQL Translator and SQL Tutor

7 01 2011

January 7, 2011

In a recent OTN thread a developer described their difficulty in working with ANSI SQL in Oracle Database 8i (quick show of hands, how many people know why?).  The OP eventually asked:

“I don’t suppose there are any normal-query-sytax-to-freakishly-old-syntax converters out there?”

And John Spencer jokingly replied back:

“Personally, I’m looking for a freakishly-new-syntax-to-normal-query-sytax converter.”

l thought that this might be an interesting challenge.  A developer is struggling to adjust to Oracle Database, having come from a Microsoft SQL Server development background, and he is looking for a translator that will convert his ANSI SQL to SQL that will work with Oracle Database 8i.  My reply indicated that Oracle provides a free ANSI SQL to Oracle specific SQL translator and tutoring tool in the free Oracle XE.  A free translator and tutor?  Sure, below is an example using Oracle Database 11.2.0.1.  First, let’s create a couple of tables and collect the statistics for those tables:

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  LPAD('A',100,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=100;

CREATE TABLE T2 AS
SELECT
  ROWNUM C0,
  CEIL(ROWNUM/20) C1,
  LPAD('A',100,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=100*20;

CREATE TABLE T3 AS
SELECT
  ROWNUM C0,
  TRUNC((MOD(ROWNUM-1,80)+1)*1.2) C1,
  LPAD('A',100,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=1000;

ALTER TABLE T1 MODIFY (C1 NOT NULL);
ALTER TABLE T2 MODIFY (C1 NOT NULL);
ALTER TABLE T3 MODIFY (C1 NOT NULL);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1')
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2')
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3') 

We now have three tables with a NOT NULL constraint on a column. Let’s assume that the following two ANSI SQL statements are crafted to access the three tables:

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 IN (
    SELECT
      T3.C1
    FROM
      T3);

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 NOT IN (
    SELECT
      T3.C1*2
    FROM
      T3); 

Note that in the above, one of the SQL statements contains an IN clause, and the other a NOT IN clause.  That IN clause could also be written as an EXISTS clause (because of the NOT NULL constraint on column C1), or as an inline view. Oracle Database 8i will NOT automatically transform an IN clause into an EXISTS clause, even when it is much more efficient to execute the SQL statement with the EXISTS clause (more recent release versions of Oracle Database might perform the transformation).  So let’s try an experiment (if you are running Oracle Database 8i, will have to manually convert the above ANSI joins into Oracle specific syntax to try this on Oracle Database 8i):

ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM'; 
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
SET AUTOTRACE TRACEONLY STATISTICS

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 IN (
    SELECT
      T3.C1
    FROM
      T3);

ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM2';

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 NOT IN (
    SELECT
      T3.C1*2
    FROM
      T3); 

Now take a look inside the 10053 trace files, you might find equivalent, transformed versions of the SQL statements like these (reformatted to add extra whitespace) in the WATCH_TRANSFORM trace file (note that in the thread Yasu mentioned that he received a different final transformation when connected as the SYS user than he did when connected as a normal user):

SELECT
  "T1"."C1" "QCSJ_C000000000300000",
  "T1"."PADDING" "QCSJ_C000000000300002",
  "from$_subquery$_005"."C1_1" "QCSJ_C000000000300001",
  "from$_subquery$_005"."PADDING_0" "QCSJ_C000000000300003"
FROM
  "TESTUSER"."T1" "T1",
  LATERAL(
    (SELECT
       "T2"."PADDING" "PADDING_0",
       "T2"."C1" "C1_1"
     FROM
       "TESTUSER"."T2" "T2"
     WHERE
       "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005"

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"=ANY (
    SELECT
      "T3"."C1" "C1"
    FROM
      "TESTUSER"."T3" "T3")
  AND "T1"."C1"="T2"."C1"(+)

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T3" "T3",
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"="T2"."C1"(+)
  AND "T1"."C1"="T3"."C1" 

If you are looking at the above, trying to learn alternate SQL syntax that may be more efficient than what you have used in the past, you should note that the last of the above SQL statements is technically not logically equivalent to the submitted SQL statement due to the potential for duplicate values in column C1 in table T3 (also note that the first SQL statement is missing the IN clause).  However, the optimizer in 11.2.0.1 understands how to handle this join as a semi-join, and will stop the join to the second table when the first matching row is found.

In the WATCH_TRANSFORM2 trace file, we see the NOT IN version of the SQL statement:

SELECT
  "T1"."C1" "QCSJ_C000000000300000",
  "T1"."PADDING" "QCSJ_C000000000300002",
  "from$_subquery$_005"."C1_1" "QCSJ_C000000000300001",
  "from$_subquery$_005"."PADDING_0" "QCSJ_C000000000300003"
FROM
  "TESTUSER"."T1" "T1",
  LATERAL(
    (SELECT
       "T2"."PADDING" "PADDING_0",
       "T2"."C1" "C1_1"
     FROM
       "TESTUSER"."T2" "T2"
     WHERE
       "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005"

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"<>ALL (
    SELECT
      "T3"."C1"*2 "T3.C1*2"
    FROM "TESTUSER"."T3" "T3")
  AND "T1"."C1"="T2"."C1"(+)

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T3" "T3",
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"="T3"."C1"*2
  AND "T1"."C1"="T2"."C1"(+) 

If you are looking at the above, trying to learn alternate SQL syntax that may be more efficient than what you have used in the past, you should note that the last of the above SQL statements is technically not logically equivalent to the submitted SQL statement due to the potential for duplicate values in column C1 in table T3.  However, the optimizer in 11.2.0.1 understands how to handle this join as a anti-join, and will stop the join to the second table when the first matching row is found.

Let’s try again with a change to the OPTIMIZER_FEATURES_ENABLE parameter:

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='8.1.7';
ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM3';

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 IN (
    SELECT
      T3.C1
    FROM
      T3);

ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM4';

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 NOT IN (
    SELECT
      T3.C1*2
    FROM
      T3); 

Now you might see something like this in the WATCH_TRANSFORM3 trace file (the IN syntax):

SELECT
  "T1"."C1" "QCSJ_C000000000300000",
  "T1"."PADDING" "QCSJ_C000000000300002",
  "T2"."C1" "C1",
  "T2"."PADDING" "PADDING"
FROM
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"="T2"."C1"(+)

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"=ANY (
    SELECT
      "T3"."C1" "C1"
    FROM
      "TESTUSER"."T3" "T3")
      AND "T1"."C1"="T2"."C1"(+)

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  (SELECT DISTINCT
     "T3"."C1" "C1"
   FROM
     "TESTUSER"."T3" "T3") "VW_NSO_1",
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"="VW_NSO_1"."C1"
  AND "T1"."C1"="T2"."C1"(+) 

Notice in the above that final version of the SQL statement shows the IN clause transformed into an inline view.

In the WATCH_TRANSFORM4 trace file you might see something like this (the NOT IN syntax):

SELECT
  "T1"."C1" "QCSJ_C000000000300000",
  "T1"."PADDING" "QCSJ_C000000000300002",
  "T2"."C1" "C1",
  "T2"."PADDING" "PADDING"
FROM
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"="T2"."C1"(+)

SELECT
  "T1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T1" "T1",
  "TESTUSER"."T2" "T2"
WHERE
  "T1"."C1"<>ALL (
    SELECT
      "T3"."C1"*2 "T3.C1*2"
    FROM
      "TESTUSER"."T3" "T3")
  AND "T1"."C1"="T2"."C1"(+)

SELECT
  "SYS_ALIAS_1"."PADDING" "T1_PADDING",
  "T2"."PADDING" "T2_PADDING"
FROM
  "TESTUSER"."T1" "SYS_ALIAS_1",
  "TESTUSER"."T2" "T2"
WHERE
  NOT EXISTS (
    SELECT
      0
    FROM
      "TESTUSER"."T3" "T3"
    WHERE
      "T3"."C1"*2="SYS_ALIAS_1"."C1")
  AND "SYS_ALIAS_1"."C1"="T2"."C1"(+) 

Notice in the above, the odd last transformation.  Oracle Database 8i is not supposed to be able to transform a SQL statement using a NOT IN clause into a SQL statement using a NOT EXISTS clause, even though it might be more efficient.

Just for fun, let’s see what happens when we add a NO_QUERY_TRANSFORMATION hint after the SELECT in the SQL statements while leaving the modified OPTIMIZER_FEATURES_ENABLE parameter set:

ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM5';

SELECT /*+ NO_QUERY_TRANSFORMATION */
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 IN (
    SELECT
      T3.C1
    FROM
      T3);

ALTER SESSION SET TRACEFILE_IDENTIFIER='WATCH_TRANSFORM6';

SELECT /*+ NO_QUERY_TRANSFORMATION */
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1 LEFT JOIN T2
    ON T1.C1=T2.C1
WHERE
  T1.C1 NOT IN (
    SELECT
      T3.C1*2
    FROM
      T3); 

The WATCH_TRANSFORM5 trace file (IN syntax) might contain SQL statements that look something like this:

SELECT
  "T1"."C1" "QCSJ_C000000000300000",
  "T1"."PADDING" "QCSJ_C000000000300002",
  "from$_subquery$_005"."C1_1" "C1",
  "from$_subquery$_005"."PADDING_0" "PADDING"
FROM
  "TESTUSER"."T1" "T1",
   LATERAL(
     (SELECT
        "T2"."PADDING" "PADDING_0",
        "T2"."C1" "C1_1"
      FROM
        "TESTUSER"."T2" "T2"
      WHERE
        "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005"

SELECT
  "SYS_ALIAS_1"."QCSJ_C000000000300002_1" "T1_PADDING",
  "SYS_ALIAS_1"."PADDING_3" "T2_PADDING"
FROM
  (SELECT
     "T1"."C1" "QCSJ_C000000000300000_0",
     "T1"."PADDING" "QCSJ_C000000000300002_1",
     "from$_subquery$_005"."C1_1" "C1",
     "from$_subquery$_005"."PADDING_0" "PADDING_3"
   FROM
     "TESTUSER"."T1" "T1",
     LATERAL(
       (SELECT
          "T2"."PADDING" "PADDING_0",
          "T2"."C1" "C1_1"
        FROM
          "TESTUSER"."T2" "T2"
        WHERE
          "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005") "SYS_ALIAS_1"
   WHERE
     EXISTS (
       SELECT
         0
       FROM
         "TESTUSER"."T3" "T3"
       WHERE
         "T3"."C1"="SYS_ALIAS_1"."QCSJ_C000000000300000_0") 

The last of the above SQL statements might be manually cleaned up to look like this:

SELECT
  T1.PADDING T1_PADDING,
  T2.PADDING T2_PADDING
FROM
  T1,
  T2
WHERE
  T1.C1=T2.C1(+)
  AND EXISTS (
    SELECT
      0
    FROM
      T3
    WHERE
      T3.C1=T1.C1); 

Notice in the above, that again the optimizer has transformed the IN syntax into EXISTS syntax, even though Oracle Database 8i did not support that automatic transformation (and we did specifically request no query transformations through the use of the hint).

The WATCH_TRANSFORM6 trace file (NOT IN syntax) might contain SQL statements that look something like this:

SELECT
  "T1"."C1" "QCSJ_C000000000300000",
  "T1"."PADDING" "QCSJ_C000000000300002",
  "from$_subquery$_005"."C1_1" "C1",
  "from$_subquery$_005"."PADDING_0" "PADDING"
FROM
  "TESTUSER"."T1" "T1",
  LATERAL(
    (SELECT
       "T2"."PADDING" "PADDING_0",
       "T2"."C1" "C1_1"
     FROM
       "TESTUSER"."T2" "T2"
     WHERE
       "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005"

SELECT
  "SYS_ALIAS_1"."QCSJ_C000000000300002_1" "T1_PADDING",
  "SYS_ALIAS_1"."PADDING_3" "T2_PADDING"
FROM
  (SELECT
     "T1"."C1" "QCSJ_C000000000300000_0",
     "T1"."PADDING" "QCSJ_C000000000300002_1",
     "from$_subquery$_005"."C1_1" "C1",
     "from$_subquery$_005"."PADDING_0" "PADDING_3"
   FROM
     "TESTUSER"."T1" "T1",
     LATERAL(
       (SELECT
          "T2"."PADDING" "PADDING_0",
          "T2"."C1" "C1_1"
        FROM
          "TESTUSER"."T2" "T2"
        WHERE
          "T1"."C1"="T2"."C1"))(+) "from$_subquery$_005") "SYS_ALIAS_1"
WHERE
  NOT EXISTS (
    SELECT
      0
    FROM
      "TESTUSER"."T3" "T3"
    WHERE
      "T3"."C1"*2="SYS_ALIAS_1"."QCSJ_C000000000300000_0") 

Another NOT IN to NOT EXISTS transformation.

—-

In all of the above cases, the first transformed SQL statement printed in the 10053 trace file excluded the IN and the NOT IN clauses, but did include the ANSI to Oracle specific join syntax conversion.

It is nice to know that Oracle Corp. throws in something free with Oracle Database: a free ANSI SQL to Oracle Specific SQL translator, and a free SQL tutor that teaches rewriting SQL statements into alternate SQL syntax that just might be more efficient than the original SQL statement (you might be able to use the more efficient version with older Oracle Database releases, as was the case for the NOT IN to TO EXISTS transformation).








Follow

Get every new post delivered to your Inbox.

Join 144 other followers