The Unique Result Oracle Database Coding Challenge

28 07 2011

July 28, 2011

I must say that I am impressed with the number of unique solutions that were developed for the previous coding challenge (FizzBuzz).  While not all solutions were extremely efficient (a couple were intentionally designed to be as inefficient as possible), the various techniques provide views of different approaches to solving a problem that was not well defined (even though at first glance it appeared to be well defined).  While not all of the solutions presented are optimal for the FizzBuzz challenge, derivatives of those solutions might be perfect for real-world problems (for instance, side-tracking a pesky DBA or developer with a performance challenge).

Time for another coding challenge.  This challenge is adapted from one that was posed in a Usenet thread several years ago.  Remember that if a specification is not well defined, feel free to interpret the specification – one of the goals of this blog article is to see a variety of solutions, but a secondary goal is to determine what might happen when weak specifications are provided in a request for a solution.

Assume that there are two tables, T1 and T2, as designed and populated below.  The two tables contain the results from experiments performed by two different analysts.

DROP TABLE T1 PURGE;
DROP TABLE T2 PURGE;

CREATE TABLE T1 (
  C1 VARCHAR2(10),
  C2 VARCHAR2(10),
  C3 VARCHAR2(10),
  C4 VARCHAR2(10));

CREATE TABLE T2 (
  C1 VARCHAR2(10),
  C2 VARCHAR2(10),
  C3 VARCHAR2(10),
  C4 VARCHAR2(10));

INSERT INTO T1 VALUES ('TEST1A','TEST1B','TEST1C','TEST1D');
INSERT INTO T1 VALUES ('TEST2A','TEST2B','TEST2C','TEST2D');
INSERT INTO T1 VALUES ('TEST3A','TEST3B','TEST3C','TEST3D');
INSERT INTO T1 VALUES ('TEST4A','TEST4B','TEST4C','TEST4D');
INSERT INTO T1 VALUES ('TEST5A','TEST5B','TEST5C','TEST5D');

INSERT INTO T2 VALUES ('TEST1A','TEST1B','TEST1C','TEST1D');
INSERT INTO T2 VALUES ('TEST2A','TEST1B','TEST1C','TEST2D');
INSERT INTO T2 VALUES ('TEST3A','TEST3B','TEST3C','TEST3D');
INSERT INTO T2 VALUES ('TEST5A','TEST5B','TEST5C','TEST5D');
INSERT INTO T2 VALUES ('TEST6A','TEST5B','TEST5C','TEST6D');  

This is a three part challenge.

Part 1: Return all rows from table T1 where the C2 value is not found in table T2’s column C2 values, and return all rows from table T2 where the C2 value is not found in table T1’s column C2 values.  In short, we want the rows that have unique results found in column C2 so that we may identify the differences found by the two analysts.

Part 2: Using a single select statement, return all rows found in table T1 that are not duplicated in table T2, and all rows found in table T2 that are not duplicated in table T1.  The resulting output must identify the table from which the row was found.

Part 3: A third analyst performed experiments and entered his results in table T3, as designed and populated below.  Using a single SQL statement, retrieve the rows from tables T1, T2, and T3 that are not duplicated in the other tables – find the unique rows from the three tables.  Each row should identify its source table (T1, T2, or T3).

DROP TABLE T3 PURGE;

CREATE TABLE T3 (
  C1 VARCHAR2(10),
  C2 VARCHAR2(10),
  C3 VARCHAR2(10),
  C4 VARCHAR2(10));

INSERT INTO T3 VALUES ('TEST1A','TEST1B','TEST1C','TEST1D');
INSERT INTO T3 VALUES ('TEST2A','TEST1B','TEST1C','TEST2D');
INSERT INTO T3 VALUES ('TEST6A','TEST6B','TEST6C','TEST6D'); 

When you post your solutions, please identify Part 1, Part 2, or Part 3 next to the solution.

General tips for comments:

  • Include your code portion of the solution in a monospaced font (Courier) to retain the alignment spaces in the solution.  To do this, use a <pre> tag just before the code portion of the solution and a </pre> tag just after the solution ([s ourcecode] and [/s ourcecode] tags (without the space between the first two letters) should have the same result, just with a smaller font size and a non-scrolling code area).
  • Less than (<) and greater than (>) signs have a special meaning in HTML web pages.  As such, specify &lt; for a less than sign and &gt; for a greater than sign in code sections to avoid having portions of your code sections magically disappear.

Think about how you would build the solutions before scrolling down to the comments section.

Part 2:

SELECT  NVL(T1.C1,T2.C1) C1, NVL(T1.C2,T2.C2) C2, NVL(T1.C3,T2.C3) C3, NVL(T1.C4,T2.C4) C4, NVL2(T1.C1,'T1','T2') FROM_TABLE FROM T1 FULL OUTER JOIN T2 ON T1.C1=T2.C1 AND T1.C2=T2.C2 AND T1.C3=T2.C3 AND T1.C4=T2.C4 WHERE T1.C1 IS NULL OR T2.C1 IS NULL;