SQL Precedence UNION and MINUS, Converting to Full Outer Join Syntax

26 02 2011

February 26, 2011

In a recent OTN thread in the SQL forum I found a SQL statement (see the thread for the original SQL statement) that the original poster (OP) wanted to transform into full outer join syntax, but was unsure how to start the transformation.  To get started with helping the OP, you first need to determine what is the intended result of the SQL statement.  Second, you need to determine if the query really is producing the intended result.  Third, you need to determine why the OP wants to change a functioning query into a different syntax – is the OP facing a performance problem, or is the OP expected to provide an answer as part of a test question for a job interview or classroom project ;-).  Fourth, you need to determine the Oracle Database release version that is available to the OP – can the OP use the FULL OUTER JOIN syntax, or is it necessary to use the Oracle specific join syntax?

The OP’s query essentially had the following format:

SELECT
  C1
FROM
  T1@SOMEDBLINK
WHERE
  C2='26-FEB-2011'
MINUS
SELECT
  C1
FROM
  T1
WHERE
  C2='26-FEB-2011'

UNION

SELECT
  C1
FROM
  T1
WHERE
  C2='26-FEB-2011'
MINUS
SELECT
  C1
FROM
  T1@SOMEDBLINK
WHERE
  C2='26-FEB-2011'; 

Before testing, it appeared that the OP wanted all of the rows in each of the two tables, if and only if, the rows do not exist in both tables.  So, this is a little like a UNION ALL minus the intersection of the two row sources (assuming that there are no duplicate values in either table).  Seems simple enough, until you test it.  Let’s build a simple model to explain what is happening.  We will create two tables with 10 rows each.  The first table has C1 values that increase by 2, while the second has C1 values that increase by 3:

CREATE TABLE T1 AS
SELECT
  ROWNUM*2 C1,
  TRUNC(SYSDATE)+ROWNUM*2 C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10;

CREATE TABLE T2 AS
SELECT
  ROWNUM*3 C1,
  TRUNC(SYSDATE)+ROWNUM*3 C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10; 

Now let’s look at the simulated two halves of the OP’s query:

SELECT
  C1,
  C2
FROM
  T1
MINUS
SELECT
  C1,
  C2
FROM
  T2;

C1 C2
-- ---------
 2 28-FEB-11
 4 02-MAR-11
 8 06-MAR-11
10 08-MAR-11
14 12-MAR-11
16 14-MAR-11
20 18-MAR-11

SELECT
  C1,
  C2
FROM
  T2
MINUS
SELECT
  C1,
  C2
FROM
  T1;

C1 C2
-- ---------
 3 01-MAR-11
 9 07-MAR-11
15 13-MAR-11
21 19-MAR-11
24 22-MAR-11
27 25-MAR-11
30 28-MAR-11 

As can be seen by the above, each half returned 7 rows – there are 7 rows in each table that is not in the other table.  Common sense would state that if we UNION these two results (assuming no duplicate values in each table), we would see 14 rows:

SELECT
  C1,
  C2
FROM
  T1
MINUS
SELECT
  C1,
  C2
FROM
  T2
UNION
SELECT
  C1,
  C2
FROM
  T2
MINUS
SELECT
  C1,
  C2
FROM
  T1;

C1 C2
-- ---------
 3 01-MAR-11
 9 07-MAR-11
15 13-MAR-11
21 19-MAR-11
24 22-MAR-11
27 25-MAR-11
30 28-MAR-11 

Only 7 rows?  Let’s try again with the help of inline views to control the order in which the MINUS and UNION operators are processed:

SELECT
  *
FROM
  (SELECT
    C1,
    C2
  FROM
    T1
  MINUS
  SELECT
    C1,
    C2
  FROM
    T2)
UNION
SELECT
  *
FROM
  (SELECT
    C1,
    C2
  FROM
    T2
  MINUS
  SELECT
    C1,
    C2
  FROM
    T1);

C1 C2
-- ---------
 2 28-FEB-11
 3 01-MAR-11
 4 02-MAR-11
 8 06-MAR-11
 9 07-MAR-11
10 08-MAR-11
14 12-MAR-11
15 13-MAR-11
16 14-MAR-11
20 18-MAR-11
21 19-MAR-11
24 22-MAR-11
27 25-MAR-11
30 28-MAR-11 

Note that the above returned 14 rows, as common sense seemed to imply the previous query would return.  We can return the same result set using just two outer joins:

SELECT
  T1.C1,
  T1.C2
FROM
  T1,
  T2
WHERE
  T1.C1=T2.C1(+)
  AND T2.C1 IS NULL
UNION
SELECT
  T2.C1,
  T2.C2
FROM
  T1,
  T2
WHERE
  T2.C1=T1.C1(+)
  AND T1.C1 IS NULL;

C1 C2
-- ---------
 2 28-FEB-11
 3 01-MAR-11
 4 02-MAR-11
 8 06-MAR-11
 9 07-MAR-11
10 08-MAR-11
14 12-MAR-11
15 13-MAR-11
16 14-MAR-11
20 18-MAR-11
21 19-MAR-11
24 22-MAR-11
27 25-MAR-11
30 28-MAR-11 

Or we can produce the same output with a FULL OUTER JOIN and a MINUS operator:

SELECT
  NVL(T1.C1,T2.C1) C1,
  NVL2(T1.C1,T1.C2,T2.C2) C2
FROM
  T1 FULL OUTER JOIN T2
    ON T1.C1=T2.C1
MINUS
SELECT
  T1.C1,
  T1.C2
FROM
  T1,
  T2
WHERE
  T1.C1=T2.C1;

C1 C2
-- ---------
 2 28-FEB-11
 3 01-MAR-11
 4 02-MAR-11
 8 06-MAR-11
 9 07-MAR-11
10 08-MAR-11
14 12-MAR-11
15 13-MAR-11
16 14-MAR-11
20 18-MAR-11
21 19-MAR-11
24 22-MAR-11
27 25-MAR-11
30 28-MAR-11 

We are also able to produce the output with a FULL OUTER JOIN with a WHERE clause:

SELECT
  NVL(T1.C1,T2.C1) C1,
  NVL2(T1.C1,T1.C2,T2.C2) C2
FROM
  T1 FULL OUTER JOIN T2
    ON T1.C1=T2.C1
WHERE
  (T1.C1 IS NULL
    OR T2.C1 IS NULL);

C1 C2
-- ---------
 3 01-MAR-11
 9 07-MAR-11
15 13-MAR-11
21 19-MAR-11
24 22-MAR-11
27 25-MAR-11
30 28-MAR-11
 8 06-MAR-11
20 18-MAR-11
 2 28-FEB-11
10 08-MAR-11
 4 02-MAR-11
14 12-MAR-11
16 14-MAR-11 

We still have not determined everything that is outlined at the start of this article.  If the OP is looking for an exact answer so that he may complete a test question for a job interview or classroom project, directly fixing his posted SQL statement may prove unhelpful in the long run.  Considering that we do not know the OP’s Oracle Database version, it is very difficult to say “use this for your full outer join requirement“.  In a recent article I linked to the SQL and PL/SQL FAQ on the OTN forum – it probably would have been helpful had the OP read that FAQ in full before posting a question to the forum, so that the OP would have known that the Oracle Database release version is occasionally a very important detail when seeking help. 

The fact that the OP’s query is apparently relying on implicit data type conversions filecreatedt=’18-feb-2011′ is a little troubling, especially if this question is found on some sort of test.








Follow

Get every new post delivered to your Inbox.

Join 143 other followers