February 26, 2011
In a recent OTN thread in the SQL forum I found a SQL statement (see the thread for the original SQL statement) that the original poster (OP) wanted to transform into full outer join syntax, but was unsure how to start the transformation. To get started with helping the OP, you first need to determine what is the intended result of the SQL statement. Second, you need to determine if the query really is producing the intended result. Third, you need to determine why the OP wants to change a functioning query into a different syntax – is the OP facing a performance problem, or is the OP expected to provide an answer as part of a test question for a job interview or classroom project ;-). Fourth, you need to determine the Oracle Database release version that is available to the OP – can the OP use the FULL OUTER JOIN syntax, or is it necessary to use the Oracle specific join syntax?
The OP’s query essentially had the following format:
SELECT C1 FROM T1@SOMEDBLINK WHERE C2='26-FEB-2011' MINUS SELECT C1 FROM T1 WHERE C2='26-FEB-2011' UNION SELECT C1 FROM T1 WHERE C2='26-FEB-2011' MINUS SELECT C1 FROM T1@SOMEDBLINK WHERE C2='26-FEB-2011';
Before testing, it appeared that the OP wanted all of the rows in each of the two tables, if and only if, the rows do not exist in both tables. So, this is a little like a UNION ALL minus the intersection of the two row sources (assuming that there are no duplicate values in either table). Seems simple enough, until you test it. Let’s build a simple model to explain what is happening. We will create two tables with 10 rows each. The first table has C1 values that increase by 2, while the second has C1 values that increase by 3:
CREATE TABLE T1 AS SELECT ROWNUM*2 C1, TRUNC(SYSDATE)+ROWNUM*2 C2 FROM DUAL CONNECT BY LEVEL<=10; CREATE TABLE T2 AS SELECT ROWNUM*3 C1, TRUNC(SYSDATE)+ROWNUM*3 C2 FROM DUAL CONNECT BY LEVEL<=10;
Now let’s look at the simulated two halves of the OP’s query:
SELECT C1, C2 FROM T1 MINUS SELECT C1, C2 FROM T2; C1 C2 -- --------- 2 28-FEB-11 4 02-MAR-11 8 06-MAR-11 10 08-MAR-11 14 12-MAR-11 16 14-MAR-11 20 18-MAR-11 SELECT C1, C2 FROM T2 MINUS SELECT C1, C2 FROM T1; C1 C2 -- --------- 3 01-MAR-11 9 07-MAR-11 15 13-MAR-11 21 19-MAR-11 24 22-MAR-11 27 25-MAR-11 30 28-MAR-11
As can be seen by the above, each half returned 7 rows – there are 7 rows in each table that is not in the other table. Common sense would state that if we UNION these two results (assuming no duplicate values in each table), we would see 14 rows:
SELECT C1, C2 FROM T1 MINUS SELECT C1, C2 FROM T2 UNION SELECT C1, C2 FROM T2 MINUS SELECT C1, C2 FROM T1; C1 C2 -- --------- 3 01-MAR-11 9 07-MAR-11 15 13-MAR-11 21 19-MAR-11 24 22-MAR-11 27 25-MAR-11 30 28-MAR-11
Only 7 rows? Let’s try again with the help of inline views to control the order in which the MINUS and UNION operators are processed:
SELECT * FROM (SELECT C1, C2 FROM T1 MINUS SELECT C1, C2 FROM T2) UNION SELECT * FROM (SELECT C1, C2 FROM T2 MINUS SELECT C1, C2 FROM T1); C1 C2 -- --------- 2 28-FEB-11 3 01-MAR-11 4 02-MAR-11 8 06-MAR-11 9 07-MAR-11 10 08-MAR-11 14 12-MAR-11 15 13-MAR-11 16 14-MAR-11 20 18-MAR-11 21 19-MAR-11 24 22-MAR-11 27 25-MAR-11 30 28-MAR-11
Note that the above returned 14 rows, as common sense seemed to imply the previous query would return. We can return the same result set using just two outer joins:
SELECT T1.C1, T1.C2 FROM T1, T2 WHERE T1.C1=T2.C1(+) AND T2.C1 IS NULL UNION SELECT T2.C1, T2.C2 FROM T1, T2 WHERE T2.C1=T1.C1(+) AND T1.C1 IS NULL; C1 C2 -- --------- 2 28-FEB-11 3 01-MAR-11 4 02-MAR-11 8 06-MAR-11 9 07-MAR-11 10 08-MAR-11 14 12-MAR-11 15 13-MAR-11 16 14-MAR-11 20 18-MAR-11 21 19-MAR-11 24 22-MAR-11 27 25-MAR-11 30 28-MAR-11
Or we can produce the same output with a FULL OUTER JOIN and a MINUS operator:
SELECT NVL(T1.C1,T2.C1) C1, NVL2(T1.C1,T1.C2,T2.C2) C2 FROM T1 FULL OUTER JOIN T2 ON T1.C1=T2.C1 MINUS SELECT T1.C1, T1.C2 FROM T1, T2 WHERE T1.C1=T2.C1; C1 C2 -- --------- 2 28-FEB-11 3 01-MAR-11 4 02-MAR-11 8 06-MAR-11 9 07-MAR-11 10 08-MAR-11 14 12-MAR-11 15 13-MAR-11 16 14-MAR-11 20 18-MAR-11 21 19-MAR-11 24 22-MAR-11 27 25-MAR-11 30 28-MAR-11
We are also able to produce the output with a FULL OUTER JOIN with a WHERE clause:
SELECT NVL(T1.C1,T2.C1) C1, NVL2(T1.C1,T1.C2,T2.C2) C2 FROM T1 FULL OUTER JOIN T2 ON T1.C1=T2.C1 WHERE (T1.C1 IS NULL OR T2.C1 IS NULL); C1 C2 -- --------- 3 01-MAR-11 9 07-MAR-11 15 13-MAR-11 21 19-MAR-11 24 22-MAR-11 27 25-MAR-11 30 28-MAR-11 8 06-MAR-11 20 18-MAR-11 2 28-FEB-11 10 08-MAR-11 4 02-MAR-11 14 12-MAR-11 16 14-MAR-11
We still have not determined everything that is outlined at the start of this article. If the OP is looking for an exact answer so that he may complete a test question for a job interview or classroom project, directly fixing his posted SQL statement may prove unhelpful in the long run. Considering that we do not know the OP’s Oracle Database version, it is very difficult to say “use this for your full outer join requirement“. In a recent article I linked to the SQL and PL/SQL FAQ on the OTN forum – it probably would have been helpful had the OP read that FAQ in full before posting a question to the forum, so that the OP would have known that the Oracle Database release version is occasionally a very important detail when seeking help.
The fact that the OP’s query is apparently relying on implicit data type conversions filecreatedt=’18-feb-2011′ is a little troubling, especially if this question is found on some sort of test.
Leave a Reply