Feeling ANSI About Oracle Join Syntax? 2

7 02 2013

February 7, 2013

(Back to the Previous Post in the Series)

As I have mentioned a couple of times previously, I am not much of a fan of ANSI style joins – I prefer using the classical Oracle join syntax when possible.  I try to keep up with an ERP mailing list, and try to assist with providing answers to questions when time permits.  A SQL statement was recently shared with the ERP mailing list, demonstrating a solution that was put together to solve a particular problem.  A portion of the SQL statement follows:

...
 FROM
 dbo.WORK_ORDER RIGHT OUTER JOIN dbo.OPERATION ON
     dbo.OPERATION.WORKORDER_TYPE = dbo.WORK_ORDER.TYPE AND
     dbo.OPERATION.WORKORDER_BASE_ID = dbo.WORK_ORDER.BASE_ID AND
     dbo.OPERATION.WORKORDER_LOT_ID = dbo.WORK_ORDER.LOT_ID AND
     dbo.OPERATION.WORKORDER_SPLIT_ID = dbo.WORK_ORDER.SPLIT_ID AND
     dbo.OPERATION.WORKORDER_SUB_ID = dbo.WORK_ORDER.SUB_ID
   left join LABOR_TICKET CUR on
 ...

An ANSI join… I really wish that the person who wrote that SQL statement used Oracle’s classical (+) notation for the outer join declaration… and it probably would have helped if the OP was running with an Oracle Database backend rather than a SQL Server backend.  When I saw that ANSI outer join, I immediately started thinking about pig outer join… a response that I submitted to an Internet forum in 2006, and republished here.

After thinking about the SQL statement for a couple of minutes, I decided that the above ANSI RIGHT OUTER JOIN is equivalent to the following using Oracle’s classical (+) notation for outer joins:

WORK_ORDER.TYPE(+) = OPERATION.WORKORDER_TYPE
AND WORK_ORDER.BASE_ID(+) = OPERATION.WORKORDER_BASE_ID
AND WORK_ORDER.LOT_ID(+) = OPERATION.WORKORDER_LOT_ID
AND WORK_ORDER.SPLIT_ID(+) = OPERATION.WORKORDER_SPLIT_ID
AND WORK_ORDER.SUB_ID(+) = OPERATION.WORKORDER_SUB_ID

If you look closely at the above, you might have noticed that not only did I remove the dbo. in front of the table names, but I also switched which columns appear on the left and right side of the equal sign.  Why switch which column is on the left and which is on the right of the equal sign?  So that the columns belonging to each table were listed in the same order from left to right as the join declaration: dbo.WORK_ORDER RIGHT OUTER JOIN dbo.OPERATION

With that change made, the rules of pig outer join apply.  With the ANSI join type of RIGHT OUTER JOIN, the (+) notation is placed after the column that is to the left of the equal sign.  That column with the (+) notation appended is permitted to return a NULL value when there is no matching row in that column’s table for that join condition.  The (+) notation follows all of the columns from the WORK_ORDER table that appear in the join.  To the casual observer, one might think that there is a parent-child relationship between the two tables, maybe even a delared foreign key relationship with the OPERATION table as the parent and the WORK_ORDER table as the child.

There is in fact a declared foreign key relationship between the two tables.  For Oracle Database backends, that declared foreign key relationship was created using a command similar to the following:

ALTER TABLE OPERATION ADD(
  CONSTRAINT CONFUSE_ME_NOT FOREIGN KEY (
    WORKORDER_TYPE,
    WORKORDER_BASE_ID,
    WORKORDER_LOT_ID,
    WORKORDER_SPLIT_ID,
    WORKORDER_SUB_ID)
  REFERENCES WORK_ORDER ON DELETE CASCADE);

So yes, a delared foreign key relationship exists between the two tables.  But wait, doesn’t the above constraint require that for a row to be present in the OPERATION table, a corresponding row must already exist in the WORK_ORDER table?

Maybe I should not have flip-flopped which columns appear on the left and right side of the equal sign?  Did I mention that I am not much of a fan of ANSI style joins?  There might be a few bugs in Oracle Database related to its automatic conversion of ANSI style joins to classical Oracle joins, but let’s try a test anyway.

I will use the autotrace functionality in SQL*Plus to output the execution plan – there are times when autotrace outputs the wrong execution plan for a SQL statement, but we will ignore that quirk for now.  I will add a NO_QUERY_TRANSFORMATION hint to a much shortened version of the original poster’s (OP’s) original query – this hint was an attempt to keep Oracle’s query optimizer from recognizing that there is a declared foreign key relationship between the two tables, and automatically converting the outer join into an inner join (oddly, the Oracle query optimizer did not alter the join to an inner join when the hint was removed).

SET AUTOTRACE TRACEONLY EXPLAIN
SET LINESIZE 200
SET PAGESIZE 1000

SELECT /*+ NO_QUERY_TRANSFORMATION */
  *
FROM
  WORK_ORDER,
  OPERATION
WHERE
  WORK_ORDER.TYPE(+) = OPERATION.WORKORDER_TYPE
  AND WORK_ORDER.BASE_ID(+) = OPERATION.WORKORDER_BASE_ID
  AND WORK_ORDER.LOT_ID(+) = OPERATION.WORKORDER_LOT_ID
  AND WORK_ORDER.SPLIT_ID(+) = OPERATION.WORKORDER_SPLIT_ID
  AND WORK_ORDER.SUB_ID(+) = OPERATION.WORKORDER_SUB_ID
  AND OPERATION.WORKORDER_TYPE = 'W'
  AND OPERATION.WORKORDER_BASE_ID = '13000';

Execution Plan
----------------------------------------------------------
Plan hash value: 4262719396

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |    31 | 12772 |    21   (5)| 00:00:01 |
|*  1 |  HASH JOIN OUTER             |              |    31 | 12772 |    21   (5)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| OPERATION    |    31 |  6944 |    12   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0021734 |    31 |       |     3   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| WORK_ORDER   |    11 |  2068 |     8   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | SYS_C0021989 |    11 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("WORK_ORDER"."TYPE"(+)="OPERATION"."WORKORDER_TYPE" AND
              "WORK_ORDER"."BASE_ID"(+)="OPERATION"."WORKORDER_BASE_ID" AND
              "WORK_ORDER"."LOT_ID"(+)="OPERATION"."WORKORDER_LOT_ID" AND
              "WORK_ORDER"."SPLIT_ID"(+)="OPERATION"."WORKORDER_SPLIT_ID" AND
              "WORK_ORDER"."SUB_ID"(+)="OPERATION"."WORKORDER_SUB_ID")
   3 - access("OPERATION"."WORKORDER_TYPE"='W' AND
              "OPERATION"."WORKORDER_BASE_ID"='13000')
   5 - access("WORK_ORDER"."TYPE"(+)='W' AND "WORK_ORDER"."BASE_ID"(+)='13000')

For now, just focus on the Predicate Information section of the generated execution plan, in particular the information found in the “1 – access” section – that section shows how the two tables are joined together during the hash join operation.  Much of the WHERE clause from the shortened version of the query is found in the “1 – access” section (and there is an indication of an automatically generated predicate in the “5-access” section).

Let’s take a look at an ANSI join version of the shortened SQL statement, with the WORK_ORDER table listed first in the join syntax, the OPERATION table listed second in the join syntax (as it was in the OP’s SQL statement), and with the columns in the ON clause flip-flopped on each side of the = signs (the opposite order in which the columns were listed in the OP’s original SQL statement):

SELECT
  *
FROM
  WORK_ORDER RIGHT OUTER JOIN OPERATION
ON (WORK_ORDER.TYPE = OPERATION.WORKORDER_TYPE
    AND WORK_ORDER.BASE_ID = OPERATION.WORKORDER_BASE_ID
    AND WORK_ORDER.LOT_ID = OPERATION.WORKORDER_LOT_ID
    AND WORK_ORDER.SPLIT_ID = OPERATION.WORKORDER_SPLIT_ID
    AND WORK_ORDER.SUB_ID = OPERATION.WORKORDER_SUB_ID)
WHERE
  OPERATION.WORKORDER_TYPE = 'W'
  AND OPERATION.WORKORDER_BASE_ID = '13000';

Execution Plan
----------------------------------------------------------
Plan hash value: 4262719396

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |    31 | 12772 |    21   (5)| 00:00:01 |
|*  1 |  HASH JOIN OUTER             |              |    31 | 12772 |    21   (5)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| OPERATION    |    31 |  6944 |    12   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0021734 |    31 |       |     3   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| WORK_ORDER   |    11 |  2068 |     8   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | SYS_C0021989 |    11 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("WORK_ORDER"."SUB_ID"(+)="OPERATION"."WORKORDER_SUB_ID" AND
              "WORK_ORDER"."SPLIT_ID"(+)="OPERATION"."WORKORDER_SPLIT_ID" AND
              "WORK_ORDER"."LOT_ID"(+)="OPERATION"."WORKORDER_LOT_ID" AND
              "WORK_ORDER"."BASE_ID"(+)="OPERATION"."WORKORDER_BASE_ID" AND
              "WORK_ORDER"."TYPE"(+)="OPERATION"."WORKORDER_TYPE")
   3 - access("OPERATION"."WORKORDER_TYPE"='W' AND
              "OPERATION"."WORKORDER_BASE_ID"='13000')
   5 - access("WORK_ORDER"."TYPE"(+)='W' AND "WORK_ORDER"."BASE_ID"(+)='13000')

In the above, just focus on the Predicate Information section for a moment, in particular the information found in the “1 – access” section – that section shows how the two tables are joined together during the hash join operation.  Notice the location of the (+) symbols, immediately after the columns from the WORK_ORDER table – that arrangement exactly matches my Oracle style join syntax.  You might have also noticed that the plan hash value is 4262719396 in both of the execution plans (that plan hash value consistency does NOT verify that the Predicate Information section is identical for the two execution plans, but it does verify that otherwise the execution plans are identical).

So, the Oracle query optimizer esentially transformed the ANSI style join version of the SQL statement into the same (optimizer transformed) SQL statement as I submitted using the classical Oracle outer join syntax.  Nice, however, that still leaves a question.  Does it matter in ANSI style joins which column is to the left of the equal sign in the ON clause?

Let’s try the modified ANSI SQL statement again, this time with the tables listed in the same order of the join clause as in the OP’s SQL statement, and the columns in the same order of the ON clause as in the OP’s SQL statement:

SELECT
  *
FROM
  WORK_ORDER RIGHT OUTER JOIN OPERATION
ON (OPERATION.WORKORDER_TYPE = WORK_ORDER.TYPE
    AND OPERATION.WORKORDER_BASE_ID = WORK_ORDER.BASE_ID
    AND OPERATION.WORKORDER_LOT_ID = WORK_ORDER.LOT_ID
    AND OPERATION.WORKORDER_SPLIT_ID = WORK_ORDER.SPLIT_ID
    AND OPERATION.WORKORDER_SUB_ID = WORK_ORDER.SUB_ID)
WHERE
  OPERATION.WORKORDER_TYPE = 'W'
  AND OPERATION.WORKORDER_BASE_ID = '13000';

Execution Plan
----------------------------------------------------------
Plan hash value: 4262719396

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |    31 | 12772 |    21   (5)| 00:00:01 |
|*  1 |  HASH JOIN OUTER             |              |    31 | 12772 |    21   (5)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| OPERATION    |    31 |  6944 |    12   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0021734 |    31 |       |     3   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| WORK_ORDER   |    11 |  2068 |     8   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | SYS_C0021989 |    11 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("OPERATION"."WORKORDER_SUB_ID"="WORK_ORDER"."SUB_ID"(+) AND
              "OPERATION"."WORKORDER_SPLIT_ID"="WORK_ORDER"."SPLIT_ID"(+) AND
              "OPERATION"."WORKORDER_LOT_ID"="WORK_ORDER"."LOT_ID"(+) AND
              "OPERATION"."WORKORDER_BASE_ID"="WORK_ORDER"."BASE_ID"(+) AND
              "OPERATION"."WORKORDER_TYPE"="WORK_ORDER"."TYPE"(+))
   3 - access("OPERATION"."WORKORDER_TYPE"='W' AND
              "OPERATION"."WORKORDER_BASE_ID"='13000')
   5 - access("WORK_ORDER"."TYPE"(+)='W' AND "WORK_ORDER"."BASE_ID"(+)='13000')

You might have noticed that once again, the plan hash value is 4262719396, just as it was for the previous two execution plans.  What does that plan hash value consistency indicate again?  In the above, focus on the Predicate Information section for a moment, in particular the “1 – access” section.  Notice again that the (+) symbols appear immediately after the columns from the WORK_ORDER table, even though the column order, in respect to the = sign, mirrors that of the submitted SQL statement (Oracle’s query optimizer is permitted to flip-flop the columns that appear on each side of the = sign, however that rearrangement did not happen during this demonstration).

So, what have we learned from the above information?  Are you feeling ANSI yet?


Actions

Information

5 responses

7 02 2013
mgogala

Excellent article, Charles!
Thanks!

7 02 2013
Charles Hooper

Mladen,

Thanks for the comment. I thought that someone might raise a concern with me using using Oracle Database, SQL*Plus, and autotrace to investigate a SQL Server join issue, that in this particular case is not an actual issue. 😉

I did find it slightly interesting that Oracle Database 11.2.0.2’s optimizer did not transform the outer join into an equijoin due to the declared foreign key. There is probably a good reason for the missing opportunity for an additional optimization, but for right now I am OK with just considering it interesting.

8 02 2013
Martin Preiss

Hi Charles,

thank you for the interesting analysis. I thought about adding some comments, but then I saw that all I could say about ANSI join syntax (and readability and bugs) was already included in the first article (from 2010) of your series (and in the comments of Tim Hall and William Robertson)…

Regards

Martin

P.S. for FULL OUTER JOINs I prefer the ANSI version, but in most cases I am quite indifferent …

8 02 2013
Charles Hooper

Martin,

Thank you for pointing out the discussion in the earlier article. I agree, for full outer joins, the ANSI syntax is much more clear than the equivalent SQL statement using Oracle’s classical outer joins.

I attempted to make a couple of points in this article… identifying an apparent ANSI bug as actually not an ANSI bug, but instead a programming bug is just one of those points (does the “RIGHT OUTER” apply to the.order of the columns in the “ON” clause or the order of the row sources in the “JOIN” clause?). I am sure that this is an easy mistake to make, especially when trying to manually translate Oracle’s classical outer joins to work on another database platform, such as SQL Server.

8 02 2013
Latest data Industry news round up, Log Buffer #306

[…] Charles Hooper has a feeling ANSI About Oracle Join Syntax? […]

Leave a comment