How Many Ways to Solve this SQL Problem?

6 07 2011

July 6, 2011

Since there were so many unique solutions to the last blog article that posed a SQL challenge, I thought that I would try another blog article that asks a similar type of question.  Assume that someone showed you the following output:

 C2   D
--- ---
100   0
150  50
200  50
201   1
300  99
350  50
400  50
500 100 

You have the following table definition, and rows in the table:

CREATE TABLE T2 (
  C1 NUMBER,
  C2 NUMBER);

INSERT INTO T2 VALUES (1,100);
INSERT INTO T2 VALUES (4,150);
INSERT INTO T2 VALUES (7,200);
INSERT INTO T2 VALUES (8,201);
INSERT INTO T2 VALUES (10,300);
INSERT INTO T2 VALUES (14,350);
INSERT INTO T2 VALUES (18,400);
INSERT INTO T2 VALUES (24,500);

COMMIT;

Assume that you know nothing other than the fact that the C2 values are listed in ascending order when sorted by column C1.  How many different ways can this particular problem be solved?  Yes, there is an easy way, but assume that you were trying to “help educate” the person who provided the requested output.

My least-shortest-path solution follows:

SELECT
  C2,
  0 D
FROM
  T2
WHERE
  C1=(SELECT
        MIN(C1)
      FROM
        T2)
UNION ALL
SELECT
  V2.C2,
  V2.C2-MAX(T2.C2) D
FROM
  T2,
  (SELECT
    C1,
    C2
  FROM
    T2) V2
WHERE
  T2.C1<V2.C1
GROUP BY
  V2.C2
ORDER BY
  C2;

  C2    D
---- ----
 100    0
 150   50
 200   50
 201    1
 300   99
 350   50
 400   50
 500  100

8 rows selected. 

In the above, the row with the 0 in the D column was the hardest part of the solution.  Why would I use UNION ALL and not UNION – what was not in the specification?
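
The “easy way” mentioned earlier is presumably an analytic function.  A minimal sketch, assuming that the previous row is defined by the C1 sort order and that the database release supports analytic functions: the three-argument form of LAG supplies the current C2 value as the default when no previous row exists, which produces the 0 in the first row without a separate UNION ALL branch.

```sql
SELECT
  C2,
  -- LAG(C2,1,C2): C2 of the prior row by C1; defaults to the current C2 on the first row
  C2 - LAG(C2,1,C2) OVER (ORDER BY C1) D
FROM
  T2
ORDER BY
  C1;
```

With the eight sample rows, this should reproduce the requested output.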

This blog article was inspired by an old question found in a Usenet group from 1998 – if you were answering the question in 1998, would your answer be any different?  Be creative with your solution.  While you are thinking about a solution, take a look at this old Usenet thread and consider how difficult it was to find the “50 highest paid workers” in the last century.
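
For a release without analytic functions, one possible sketch relies only on a self-join.  It assumes that Oracle's (+) outer join operator is acceptable, and it still depends on the stated fact that C2 ascends with C1 (so that the MAX of the earlier C2 values is the previous row's value).  The outer join keeps the row with the smallest C1 value even though no earlier row exists, and NVL then turns the missing previous value into a difference of 0:

```sql
SELECT
  V2.C2,
  -- MAX(V1.C2) is NULL for the first row, so NVL substitutes V2.C2 and D becomes 0
  V2.C2 - NVL(MAX(V1.C2), V2.C2) D
FROM
  T2 V1,
  T2 V2
WHERE
  V1.C1(+) < V2.C1
GROUP BY
  V2.C1,
  V2.C2
ORDER BY
  V2.C1;
```

Grouping by V2.C1 as well as V2.C2 keeps rows apart even if two rows happen to share the same C2 value.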





The CURSOR_SHARING Parameter is a Silver Bullet – What is Wrong with this Quote?

3 07 2011

July 3, 2011 (Modified July 5, 2011)

Define Yah-but: Almost like yeah but, but with one missing letter.

While reading the alpha edition of the “Oracle Database 11g Performance Tuning Recipes” book, specifically recipe 5-19, I found a couple of interesting comments about the CURSOR_SHARING parameter, specifically the FORCE and SIMILAR parameter values.  The quotes (as usual, trying to minimize the amount of material that is quoted without losing the context of the quotes):

“The CURSOR_SHARING parameter is one of the few Oracle Silver bullets that’ll improve database performance immediately by eliminating latch contention. Use it with confidence when dealing with library cache latch contention.”

“Thus, the SIMILAR setting seems a safer way to go about forcing the database to use bind variables instead of literals. Although there are some concerns about the safety of setting of the CURSOR_SHARING parameter to FORCE, we haven’t seen any real issues with using this setting.”

I would have expected to find the phrase silver bullet in another book, such as the book “Expert Oracle Database Architecture: Oracle Database Programming 9i, 10g, and 11g Techniques and Solutions, Second Edition”, specifically page 48:

“Note: There are no silver bullets, none. If there were any, they would be the default behavior and you would never hear about them.”

Or possibly in a presentation titled “Playing Russian Roulette with Silver Bullets”, specifically page 14 (slide 27).

The phrase might also appear in the book “Oracle Tuning the Definitive Reference, Second Edition”, specifically page 105 (the phrase silver bullet actually appears on page 102; this quote is related to the second half of the quote at the start of this article):

“Note that in Oracle 11g, cursor_sharing=similar has been debugged, and it is now possible to use cursor sharing with bind variable peeking.”

Strangely, the phrase silver bullet might also appear on a page found in the toadworld.com domain (Edit: July 5, 2011: I originally suspected that the same advice might also be found in that author’s recent book; per the book author’s comment attached below, that assumption is incorrect):

“The CURSOR_SHARING is one of the few Oracle parameters that can have a ‘silver bullet’ effect – instantly and dramatically increasing throughput on a parse-constrained database.”

I still have fond memories of the problems caused by the October 2006 patch for Oracle Database 10.2.0.2 that were related to setting the CURSOR_SHARING parameter to FORCE.  That problem seems to be related to this article: Metalink (MOS) Doc ID 7272297.8, Bug 7272297 – “Memory corruption / OERI[17114] / OERI[17125] with literal replacement”.

Of course there are plenty of other resources that suggest utilizing the CURSOR_SHARING parameter to tune performance, including this article that suggests changing that parameter value to SIMILAR. 

Is changing the CURSOR_SHARING parameter from EXACT to either FORCE or SIMILAR a good idea, much less a silver bullet?  Was it only a problem with Oracle Database release versions prior to 11.1? 
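
One way to explore the question on a test system is to watch what literal replacement does to the SQL text.  The following is only a sketch: it assumes a small test table such as T2 (C1 NUMBER, C2 NUMBER), a session that is permitted to query V$SQL, and a non-production database.  With CURSOR_SHARING set to FORCE, the literals in the two statements are expected to be replaced with system-generated bind variables (named like :"SYS_B_0"), so both executions may end up sharing a single cursor:

```sql
ALTER SESSION SET CURSOR_SHARING=FORCE;

-- Two statements that differ only in the literal value
SELECT C2 FROM T2 WHERE C1=1;
SELECT C2 FROM T2 WHERE C1=4;

-- Look for the rewritten statement text in the library cache
SELECT
  SQL_ID,
  CHILD_NUMBER,
  EXECUTIONS,
  SQL_TEXT
FROM
  V$SQL
WHERE
  SQL_TEXT LIKE 'SELECT C2 FROM T2 WHERE C1=%';

-- Return the session to the default behavior
ALTER SESSION SET CURSOR_SHARING=EXACT;
```

Repeating the test with CURSOR_SHARING left at EXACT should show one cursor per distinct literal, which is the behavior that the quoted advice is trying to avoid.  Whether the trade-off is safe is the question posed above.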

A couple of additional resources to help you decide:

  • Carol Dacko reports that the SIMILAR parameter value for the CURSOR_SHARING parameter is deprecated (obsolete) as of Oracle Database 11.1 (and it will be removed in 12.1) per Metalink (MOS) Doc ID: 1169017.1
  • The Oracle Optimizer Group answers the question: “Why do I have hundreds of child cursors when cursor_sharing is set to similar in 10g”
  • The Oracle Optimizer Group explains adaptive cursor sharing behavior with cursor_sharing = similar and force
  • Kyle Hailey and Randolf Geist discuss the CURSOR_SHARING parameter in a blog article titled “Cursor_sharing: a picture is worth a 1000 words”




Finding Rows with Common Attributes – Roman to Find a Solution in Something New

1 07 2011

July 1, 2011

A recent thread in the comp.databases.oracle.server Usenet group (actually two threads) asked an interesting question.  Assume that you had a detail table that contained several attributes for each of the unique key values.  How would one go about finding all of the unique key values that share the same set of attributes?  The sample set provided by the OP looks like this:

COL1   COL2
----  -----
I         a
I         b
I         c
II        a
II        b
III       a
III       b
III       c 

For the above, assume that the OP was interested in the attributes of “I”: a,b,c.  “II” lacks a “c” attribute, while “III” has the required “a”, “b”, and “c” attributes.  So, the OP would like to return the COL1 value “III” but not “II”.  I wonder if there is a simple solution for the OP?

First, let’s create our test data.  COL1 appears to contain Roman numbers – if we go beyond the number 3, those could be tricky to generate (unless of course you find the RN format parameter for the TO_CHAR function).  Let’s first create a temporary work table that contains the Roman numbers from 1 to 100 and a random number between 1 and 10:

CREATE TABLE T1_TEMP AS
SELECT
  TRIM(TO_CHAR(ROWNUM,'RN')) C1,
  TRUNC(DBMS_RANDOM.VALUE(1,11)) C2,
  ROWNUM C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100; 

Let’s see what is in the T1_TEMP table:

COLUMN C1 FORMAT A10

SELECT
  *
FROM
  T1_TEMP
ORDER BY
  C3;

C1                 C2         C3
---------- ---------- ----------
I                  10          1
II                  4          2
III                 7          3
IV                  9          4
V                   8          5
VI                 10          6
VII                 9          7
VIII                4          8
IX                  4          9
X                  10         10
...
XCV                 5         95
XCVI                4         96
XCVII               8         97
XCVIII              7         98
XCIX               10         99
C                   4        100

100 rows selected. 

The row with the value “I” in column C1 has the number 10 in column C2, but that number might be a bit different in your temporary work table.  Column C2 will determine the number of attributes that are added for each of the values found in column C1 when we create the table T1 (note that we could have defined column C2 with the function CHR(96 + COUNTER) to place lowercase letters in that column, rather than numbers, to help reproduce the OP’s dataset):

CREATE TABLE T1 AS
SELECT
  T1_TEMP.C1,
  V1.COUNTER C2
FROM
  T1_TEMP,
  (SELECT
     ROWNUM COUNTER
   FROM
     DUAL
   CONNECT BY
     LEVEL<=10) V1
WHERE
  T1_TEMP.C2>=V1.COUNTER; 
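
The CHR(96 + COUNTER) variation mentioned above can be sketched as follows.  The table name T1_LETTERS is hypothetical (it is not used elsewhere in this article); the only change from the previous statement is that the counter values 1 through 10 are mapped to the letters a through j:

```sql
CREATE TABLE T1_LETTERS AS
SELECT
  T1_TEMP.C1,
  -- CHR(97) is 'a', so COUNTER values 1..10 become 'a'..'j'
  CHR(96 + V1.COUNTER) C2
FROM
  T1_TEMP,
  (SELECT
     ROWNUM COUNTER
   FROM
     DUAL
   CONNECT BY
     LEVEL<=10) V1
WHERE
  T1_TEMP.C2>=V1.COUNTER;
```

The rest of this article continues with the numeric version in table T1.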

Let’s see what is in table T1:

SELECT
  *
FROM
  T1
ORDER BY
  C1,
  C2;

C1                 C2
---------- ----------
C                   1
C                   2
C                   3
C                   4
I                   1
I                   2
I                   3
I                   4
I                   5
I                   6
I                   7
I                   8
I                   9
I                  10
...
XXXVII              1
XXXVII              2
XXXVII              3
XXXVIII             1
XXXVIII             2
XXXVIII             3
XXXVIII             4
XXXVIII             5
XXXVIII             6

634 rows selected.  

From the above output, you can see that table T1 now contains, for each distinct value of C1, the number of rows that was specified in column C2 of table T1_TEMP.  An interesting side note: the Roman number 100 (C) sorts before the Roman number 1 (I) – I guess that explains why computers do not natively use Roman numbers for calculations.  🙂
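
A quick sanity check of that claim, offered as a sketch rather than a required step: join the two tables, compare the requested number of attributes with the number of rows actually created, and sort by the numeric C3 column so that the Roman numbers appear in their natural order.  The column aliases EXPECTED_ROWS and ACTUAL_ROWS are names invented for this example.

```sql
SELECT
  T1_TEMP.C1,
  T1_TEMP.C2 EXPECTED_ROWS,
  COUNT(*) ACTUAL_ROWS
FROM
  T1_TEMP,
  T1
WHERE
  T1_TEMP.C1=T1.C1
GROUP BY
  T1_TEMP.C1,
  T1_TEMP.C2,
  T1_TEMP.C3
ORDER BY
  T1_TEMP.C3;
```

If the tables were built as described, the two count columns should match on every row.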

For the next step, we need to collapse the different C2 values for each of the unique C1 values into a single row.  Oracle Database 11.2.0.1 introduced the LISTAGG function that makes easy work of this task, as shown in this earlier blog article.

COLUMN C2_LISTING FORMAT A22

SELECT
  C1,
  LISTAGG(TO_CHAR(C2), ',') WITHIN GROUP (ORDER BY C2) C2_LISTING
FROM
  T1
GROUP BY
  C1
ORDER BY
  C1;

C1         C2_LISTING
---------- --------------------
C          1,2,3,4
I          1,2,3,4,5,6,7,8,9,10
II         1,2,3,4
III        1,2,3,4,5,6,7
IV         1,2,3,4,5,6,7,8,9
IX         1,2,3,4
...
XXXV       1,2,3,4,5,6,7,8,9
XXXVI      1,2,3,4,5
XXXVII     1,2,3
XXXVIII    1,2,3,4,5,6

100 rows selected. 

The question remains, how can I find all of the unique C1 values that have all of the same attributes as the C1 value “I” – in this case 1,2,3,4,5,6,7,8,9,10?  One method slides the above query into a WITH block and then the WITH block is referenced twice in the main query:

WITH MY_VIEW AS
(SELECT
  C1,
  LISTAGG(TO_CHAR(C2), ',') WITHIN GROUP (ORDER BY C2) C2_LISTING
FROM
  T1
GROUP BY
  C1)
SELECT
  V2.C1,
  V2.C2_LISTING
FROM
  MY_VIEW V1,
  MY_VIEW V2
WHERE
  V1.C1='I'
  AND V1.C1<>V2.C1
  AND V1.C2_LISTING=V2.C2_LISTING
ORDER BY
  V2.C1;

C1         C2_LISTING
---------- --------------------
LVII       1,2,3,4,5,6,7,8,9,10
LXXI       1,2,3,4,5,6,7,8,9,10
LXXIII     1,2,3,4,5,6,7,8,9,10
VI         1,2,3,4,5,6,7,8,9,10
X          1,2,3,4,5,6,7,8,9,10
XCIX       1,2,3,4,5,6,7,8,9,10
XV         1,2,3,4,5,6,7,8,9,10
XXIX       1,2,3,4,5,6,7,8,9,10
XXXI       1,2,3,4,5,6,7,8,9,10

9 rows selected. 

How else might you solve the problem posted by the OP in the Usenet thread?
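
One possible answer, sketched here without LISTAGG so that it does not depend on Oracle Database 11.2: count the attributes.  The sketch assumes that each C1, C2 combination appears only once in table T1.  A C1 value has the same attribute set as “I” if it has the same total number of attributes as “I” and every one of those attributes matches an attribute of “I”:

```sql
SELECT
  T.C1
FROM
  (SELECT
     C1,
     C2,
     COUNT(*) OVER (PARTITION BY C1) CNT   -- attributes per C1 value
   FROM
     T1) T,
  (SELECT
     C2,
     COUNT(*) OVER () CNT                  -- attributes of the reference value
   FROM
     T1
   WHERE
     C1='I') R
WHERE
  T.C1<>'I'
  AND T.C2=R.C2
  AND T.CNT=R.CNT
GROUP BY
  T.C1,
  R.CNT
HAVING
  COUNT(*)=R.CNT                           -- every attribute found a match
ORDER BY
  T.C1;
```

Under the stated uniqueness assumption, this should return the same C1 values as the LISTAGG solution above.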