Matching the Expected Output – Analytic RANK, ROW_NUMBER, DENSE_RANK, or Something Different?

16 11 2011

November 16, 2011

I noticed an interesting request on the comp.databases.oracle.misc usenet group, and I thought that I would make an attempt at solving the request.  The original poster (OP) stated that he had a table with data similar to the following:

TX   ID   DEPT   LOCATION   LOAD
1    99    A        NY       12
2    99    A        LA       10
3    99    B        LA       05
4    77    B        LA       15
5    77    C        NY       12
6    77    D        LA       11  

He would like to obtain the following output:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11  

The rankings are to be determined as follows:

DEPT_RANK for ID 99 is 1 for A because sum(LOAD) = 22 is the max
LOC_RANK for ID 99 is 1 for LA because sum(LOAD) = 15 is the max

At first glance, that request seems to be reasonably easy to accomplish.  Let’s start by creating a table with the sample data (ideally, the OP should have provided the DDL and DML to create and populate this table):

DROP TABLE T1 PURGE;

CREATE TABLE T1 (
  TX NUMBER,
  ID NUMBER,
  DEPT VARCHAR2(1),
  LOCATION VARCHAR2(2),
  LOAD NUMBER);

INSERT INTO T1 VALUES (1,99,'A','NY',12);
INSERT INTO T1 VALUES (2,99,'A','LA',10);
INSERT INTO T1 VALUES (3,99,'B','LA',05);
INSERT INTO T1 VALUES (4,77,'B','LA',15);
INSERT INTO T1 VALUES (5,77,'C','NY',12);
INSERT INTO T1 VALUES (6,77,'D','LA',11); 

The first step, if we were to think about creating the solution in logical steps, is to find a way to calculate the SUM values that were mentioned by the OP.  So, as a starting point, we might try this:

SELECT
  TX,
  ID,
  DEPT,
  SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
  LOCATION,
  SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
  LOAD
FROM
  T1
ORDER BY
  TX;

 TX  ID D SUM_LOAD_ID LO SUM_LOAD_LOCATION  LOAD
--- --- - ----------- -- ----------------- -----
  1  99 A          22 NY                12    12
  2  99 A          22 LA                15    10
  3  99 B           5 LA                15     5
  4  77 B          15 LA                26    15
  5  77 C          12 NY                12    12
  6  77 D          11 LA                26    11 

If I am understanding the OP’s request correctly, the above is a good starting point (even though the alias for the first analytic function could have been better selected).

We are then able to take the above SQL statement and push it into an inline view to hopefully produce the output that is expected by the OP  (note that the PARTITION clause differs for the LOC_RANK column from what is specified in the inline view for the function that is used to create that column):

SELECT
  TX,
  ID,
  DEPT,
  RANK() OVER (PARTITION BY ID,DEPT ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  RANK() OVER (PARTITION BY ID,DEPT ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          2    12
  2  99 A          1 LA          1    10
  3  99 B          1 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          1 NY          1    12
  6  77 D          1 LA          1    11 

Let’s compare the above output with what the OP requested:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11

Almost for the LOC_RANK column (not even close for the DEPT_RANK column), but not quite right.  The problem is that when attempting to calculate the RANK columns in the above output, we should only PARTITION on the ID column, not the ID column and some other column, as was the case when we used the SUM analytic function.

Let’s fix the PARTITION clause and try again:

SELECT
  TX,
  ID,
  DEPT,
  RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          3    12
  2  99 A          1 LA          1    10
  3  99 B          3 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          2 NY          3    12
  6  77 D          3 LA          1    11 

Let’s compare the above output with what the OP requested:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11

Notice in the above that when two rows have the same SUM_LOAD_ value, the displayed rank is correct, but that repeated rank value then causes a rank value to be skipped (compare the DEPT_RANK column value on row 3, the LOC_RANK column value on row 1, and the LOC_RANK column value on row 5).

Now what?  The ROW_NUMBER function could be used to produce sequential rank numbers without gaps, for example:

SELECT
  TX,
  ID,
  DEPT,
  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          3    12
  2  99 A          2 LA          2    10
  3  99 B          3 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          2 NY          3    12
  6  77 D          3 LA          2    11 

The above output, as mentioned, does not match the output requested by the OP, since the OP’s requested output specifes that equal values for different rows should show the same rank value.

One more try using the DENSE_RANK analytic function:

SELECT
  TX,
  ID,
  DEPT,
  DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          2    12
  2  99 A          1 LA          1    10
  3  99 B          2 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          2 NY          2    12
  6  77 D          3 LA          1    11 

Let’s compare the above output with what the OP requested:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11

As best as I am able to determine, the above SQL statement will satisfy the OP’s request.

—–

Part 2 of the Challenge

If the OP has the following SQL statement:

SELECT
  LOAD_YEAR,
  ORG_UNIT_CODE,
  RANK () OVER (PARTITION BY LOAD_YEAR, CLASSIF_CODE ORDER BY SUM (FTE_DAYS) DESC) ORG_RANK,
  CLASSIF_CODE,
  RANK () OVER (PARTITION BY LOAD_YEAR, ORG_UNIT_CODE ORDER BY SUM (FTE_DAYS) DESC) CLASSIF_RANK,
  SUM (FTE_DAYS) FTE
FROM
  STAFF_LOAD
GROUP BY
  LOAD_YEAR,
  ORG_UNIT_CODE,
  CLASSIF_CODE;  

And the above SQL statement produces the following output:

YEAR ORG_UNIT_CODE ORG_RANK CLASSIF_CODE CLASSIF_RANK FTE
2010 A46                  1 HEW3                    1  59
2010 A42                  2 HEW3                    1  13
2010 A42                  1 HEW4                    1  13
2010 A46                  2 HEW4                    2  12

And the OP wants the output to look like this:

YEAR ORG_UNIT_CODE ORG_RANK CLASSIF_CODE CLASSIF_RANK FTE
2010 A46                  1 HEW3                    1  59
2010 A42                  2 HEW3                    1  13
2010 A42                  2 HEW4                    2  13
2010 A46                  1 HEW4                    2  12  

Write the DDL and DML statements to create the source table and populate it with the non-aggregated original data, and then produce the output requested by the OP (without looking at the updated usenet thread).





Simple SQL with and without Inline Views

21 10 2011

October 21, 2011

(Forward to the Next Post in the Series)

Sometimes it is interesting to take a step back from some of the more difficult to construct SQL statements, and just put together something simple.  The following request recently arrived through an ERP mailing list:

“I want to find the last ship date for all the open part numbers in Customer Order Entry. The last ship date is not necessarily from the Open order in Customer Order Entry…”

The original poster (OP) provided more information, and even provided a description of what was already tried as a solution.  It is not exceptionally clear what the OP wanted, but because I have several years of experience working with this particular ERP system, I had a feeling that the OP might be headed in the wrong direction.  This ERP system uses tables that are well normalized, which would make it a bit more difficult to locate the most recent ship date for a part number that is recorded with one customer order’s tables’ rows when another customer order’s tables’ rows are reported.  Unless, of course, we remember that there is also a transaction table that records every inventory movement of part numbers through the system – we just need to know what those rows in the transaction table look like.  In this case, the rows have the following characteristics:

  • The CUST_ORDER_ID column value must not be NULL
  • The TYPE column must have a value of ‘O’
  • The CLASS column must have a value of ’I’

With that knowledge, we could then find the most recent date that any part number shipped with a SQL statement similar to the following:

SELECT
  PART_ID,
  MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS
WHERE
  CUST_ORDER_ID IS NOT NULL
  AND TYPE='O'
  AND CLASS='I'
GROUP BY
  PART_ID;

In the above, the AS keyword is optional on an Oracle Database platform, but it is not optional on some other database platforms.

With an Oracle Database backend, inline views may be used to permit the above SQL statement to be used to retrieve additional information for a parent SQL statement.  If we wrap the above SQL statement in ( ) and place it in the FROM clause of the SQL statement, Oracle will treat the data returned by the above SQL statement quite similar to how it would handle a regular table (the Oracle query optimizer might decide to rewrite the combined SQL statement into an equivalent form that no longer contains an inline view).  There is always a chance that a part number that is on order may have never shipped to a customer, so we will need to outer join (in this case using a left outer join) to the newly created inline view:

SELECT
  COL.CUST_ORDER_ID,
  COL.PART_ID,
  LS.LAST_TRANSACTION_DATE
FROM
  CUSTOMER_ORDER CO,
  CUST_ORDER_LINE COL,
  (SELECT
    PART_ID,
    MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
  FROM
    INVENTORY_TRANS
  WHERE
    CUST_ORDER_ID IS NOT NULL
    AND TYPE='O'
    AND CLASS='I'
  GROUP BY
    PART_ID) LS
WHERE
  CO.ID=COL.CUST_ORDER_ID
  AND CO.STATUS IN ('U','F','R')
  AND COL.LINE_STATUS='A'
  AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY
  AND COL.PART_ID=LS.PART_ID(+);

In the above, I joined the CUSTOMER_ORDER and CUST_ORDER_LINE tables to the inline view that I aliased as LS.  A third table, CUST_LINE_DEL, that optionally contains the delivery schedule for some of the rows in the CUST_ORDER_LINE table, could have also been outer joined to the CUST_ORDER_LINE table.

Running SQL Server, or just feeling ANSI?  If so, the above may be rewritten as follows:

SELECT
  COL.CUST_ORDER_ID,
  COL.PART_ID,
  LS.LAST_TRANSACTION_DATE
FROM
  CUSTOMER_ORDER CO
JOIN
  CUST_ORDER_LINE COL
ON
  CO.ID=COL.CUST_ORDER_ID
LEFT OUTER JOIN
  (SELECT
    PART_ID,
    MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
  FROM
    INVENTORY_TRANS
  WHERE
    CUST_ORDER_ID IS NOT NULL
    AND TYPE='O'
    AND CLASS='I'
  GROUP BY
    PART_ID) LS
ON
  COL.PART_ID=LS.PART_ID
WHERE
  CO.STATUS IN ('U','F','R')
  AND COL.LINE_STATUS='A'
  AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY;

Unfortunately, the OP is actually using a SQLBase database backend that does not support inline views.  I remember the feeling before I discovered that Oracle Database supported inline views… in that case I would do something like the following:

  1. Create a statically defined view.
  2. Join to that statically defined view just as if the view were a table.
CREATE VIEW CUST_ORDER_PART_LS AS
SELECT
  PART_ID,
  MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS
WHERE
  CUST_ORDER_ID IS NOT NULL
  AND TYPE='O'
  AND CLASS='I'
GROUP BY
  PART_ID;

SELECT
  COL.CUST_ORDER_ID,
  COL.PART_ID,
  LS.LAST_TRANSACTION_DATE
FROM
  CUSTOMER_ORDER CO,
  CUST_ORDER_LINE COL,
  CUST_ORDER_PART_LS LS
WHERE
  CO.ID=COL.CUST_ORDER_ID
  AND CO.STATUS IN ('U','F','R')
  AND COL.LINE_STATUS='A'
  AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY
  AND COL.PART_ID=LS.PART_ID(+);

I guess that it is a good exercise once in a while to practice simple SQL.

A second example from the ERP mailing list showed a SQL Server solution for a particular problem.  The particular problem that the SQL Server solution set out to solve is as follows:

“Someone posed an interesting question to me – How can you do a mass insert of Operations? Well most of us know how to manage that without too much trouble but this one came with a couple of caveats – The Master’s first operation (lowest numbered) has to have a particular ResourceID and then a different but specific ResourceID must be on a subsequent Operation. It is after the second operation where the insert must be placed.

Sounds like fun – So I figured that it could be done in a single SQL statement – well after a 4-way join of the Operation table to itself I got it to work how I expected…”

Interesting, I think that there are times in the past that I have used multiple self-joins to a table in order to solve similar SQL problems.  However, there is an easier way using analytic functions.  The following SQL statement attempts to indicate: the sequence of the operation within the work order sub ID, the sequence number that follows sequentially, the second sequence number that follows sequentially, and the resource ID of the next operation in sequential order by the operation number:

SELECT
  WORKORDER_BASE_ID,
  WORKORDER_LOT_ID,
  WORKORDER_SPLIT_ID,
  WORKORDER_SUB_ID,
  SEQUENCE_NO,
  RESOURCE_ID,
  ROW_NUMBER() OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                     ORDER BY RESOURCE_ID) AS RN,
  LEAD(RESOURCE_ID,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                            ORDER BY RESOURCE_ID) AS NEXT_RESOURCE_ID,
  LEAD(SEQUENCE_NO,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                            ORDER BY RESOURCE_ID) AS NEXT_SEQ_NO,
  LEAD(SEQUENCE_NO,2) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                            ORDER BY RESOURCE_ID) AS NEXT_NEXT_SEQ_NO
FROM
  OPERATION
WHERE
  WORKORDER_TYPE='M'; 

Next, we only care about those rows when our resource ID of interest (69 in my example) is specified in the first operation on the work order sub ID and the second operation on that work order sub ID specifies the other resource ID that is of interest (FW in my example).  If we are to avoid primary key violations, we should select a new sequence number that is half way between the next-next operation and the next operation:

SELECT
  WORKORDER_BASE_ID,
  WORKORDER_LOT_ID,
  WORKORDER_SPLIT_ID,
  WORKORDER_SUB_ID,
  SEQUENCE_NO,
  NEXT_SEQ_NO,
  NEXT_NEXT_SEQ_NO,
  NEXT_SEQ_NO + ROUND((COALESCE(NEXT_NEXT_SEQ_NO,NEXT_SEQ_NO+10) - NEXT_SEQ_NO)/2) NEW_SEQUENCE_NO,
  'NEW RESOURCE' RESOURCE_ID
FROM
  (SELECT
    WORKORDER_BASE_ID,
    WORKORDER_LOT_ID,
    WORKORDER_SPLIT_ID,
    WORKORDER_SUB_ID,
    SEQUENCE_NO,
    RESOURCE_ID,
    ROW_NUMBER() OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                       ORDER BY RESOURCE_ID) AS RN,
    LEAD(RESOURCE_ID,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                              ORDER BY RESOURCE_ID) AS NEXT_RESOURCE_ID,
    LEAD(SEQUENCE_NO,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                              ORDER BY RESOURCE_ID) AS NEXT_SEQ_NO,
    LEAD(SEQUENCE_NO,2) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                              ORDER BY RESOURCE_ID) AS NEXT_NEXT_SEQ_NO
  FROM
    OPERATION
  WHERE
    WORKORDER_TYPE='M')
WHERE
  RN=1
  AND RESOURCE_ID='69'
  AND NEXT_RESOURCE_ID='FW';

WORKORDER_ WOR WOR WOR SEQUENCE_NO NEXT_SEQ_NO NEXT_NEXT_SEQ_NO NEW_SEQUENCE_NO RESOURCE_ID
---------- --- --- --- ----------- ----------- ---------------- --------------- ------------
231610     0   0   0            10         777              888             833 NEW RESOURCE
237680     0   0   0            10         777              888             833 NEW RESOURCE
32018917X  0   0   0            10         777              888             833 NEW RESOURCE
3201B8920  0   0   0            10         777              888             833 NEW RESOURCE
3201C8765  0   0   0            10         777              888             833 NEW RESOURCE
3201G8639  0   0   0            10         777              888             833 NEW RESOURCE
3201G9003  0   0   0            10         777              888             833 NEW RESOURCE
3201J8772  0   0   0            10         777              888             833 NEW RESOURCE
3201J8850  0   0   0            10         777              888             833 NEW RESOURCE
3201K8669  0   0   0            10         777              888             833 NEW RESOURCE
3201M8281  0   0   0            10         777              888             833 NEW RESOURCE
3201M8489  0   0   0            10         777              888             833 NEW RESOURCE
3201N8516  0   0   0            10         777              888             833 NEW RESOURCE
3201N8776  0   0   0            10         777              888             833 NEW RESOURCE
3201Q8545  0   0   0            10         777              888             833 NEW RESOURCE
3201T8964  0   0   0            10         777              888             833 NEW RESOURCE
3201T8964X 0   0   0            10          20               30              25 NEW RESOURCE
3201V8524  0   0   0            10         777              888             833 NEW RESOURCE
3201V8966  0   0   0            10         777              888             833 NEW RESOURCE
3201V8966X 0   0   0            10          20               30              25 NEW RESOURCE
3201W8967  0   0   0            10         777              888             833 NEW RESOURCE
3201W8967X 0   0   0            10          20               30              25 NEW RESOURCE
3201Y8423  0   0   0            10         777              888             833 NEW RESOURCE
3201Z8996  0   0   0            10         777              888             833 NEW RESOURCE

24 rows selected. 

With the above, we are well on our way to perfoming a mass insert into this table.  If I recall correctly, the above must first be inserted into another table (a temp table, preferrably) and then the rows may be inserted into the OPERATION table (the table that is the row source for the analytic functions used in the SQL statement).





Finding Rows with Common Attributes – Roman to Find a Solution in Something New

1 07 2011

July 1, 1011

A recent thread in the comp.databases.oracle.server Usenet group (actually two threads) asked an interesting question.  Assume that you had a detail table that contained several attributes for each of the unique key values.  How would one go about finding all of the unique key values that share the same set of attributes?  The sample set provided by the OP looks like this:

COL1   COL2
----  -----
I         a
I         b
I         c
II        a
II        b
III       a
III       b
III       c 

For the above, assume that the OP was interested in the attributes of “I”: a,b,c.  “II” lacks a “c” attribute, while “III” has the required “a”, “b”, and “c” attributes.  So, the OP would like to return C1 value ”III” but not “II”.  I wonder if there is a simple solution for the OP?

First, let’s create our test data.  COL1 appears to contain Roman numbers – if we go beyond the number 3, those could be tricky to generate (unless of course you find the RN format parameter for the TO_CHAR function).  Let’s first create a temporary work table that contains the Roman numbers from 1 to 100 and a random number between 1 and 10:

CREATE TABLE T1_TEMP AS
SELECT
  TRIM(TO_CHAR(ROWNUM,'RN')) C1,
  TRUNC(DBMS_RANDOM.VALUE(1,10)+1) C2,
  ROWNUM C3
FROM
  DUAL
CONNECT BY
  LEVEL<=100; 

Let’s see what is in the T1_TEMP table:

COLUMN C1 FORMAT A10

SELECT
  *
FROM
  T1_TEMP
ORDER BY
  C3;

C1                 C2         C3
---------- ---------- ----------
I                  10          1
II                  4          2
III                 7          3
IV                  9          4
V                   8          5
VI                 10          6
VII                 9          7
VIII                4          8
IX                  4          9
X                  10         10
...
XCV                 5         95
XCVI                4         96
XCVII               8         97
XCVIII              7         98
XCIX               10         99
C                   4        100

100 rows selected. 

The row with the value “I” in column C1 has the number 10 in column C2, but that number might be a bit different in your temporary work table.  Column C2 will determine the number of attributes that are added for each of the values found in column C1 when we create the table T1 (note that we could have defined column C2 with the function CHR(96 + COUNTER) to place lowercase letters in that column, rather than numbers, to help reproduce the OP’s dataset):

CREATE TABLE T1 AS
SELECT
  T1_TEMP.C1,
  V1.COUNTER C2
FROM
  T1_TEMP,
  (SELECT
     ROWNUM COUNTER
   FROM
     DUAL
   CONNECT BY
     LEVEL<=10) V1
WHERE
  T1_TEMP.C2>=V1.COUNTER; 

Let’s see what is in table T1:

SELECT
  *
FROM
  T1
ORDER BY
  C1,
  C2;

C1                 C2
---------- ----------
C                   1
C                   2
C                   3
C                   4
I                   1
I                   2
I                   3
I                   4
I                   5
I                   6
I                   7
I                   8
I                   9
I                  10
...
XXXVII              1
XXXVII              2
XXXVII              3
XXXVIII             1
XXXVIII             2
XXXVIII             3
XXXVIII             4
XXXVIII             5
XXXVIII             6

634 rows selected.  

From the above output, you can see that we now have the number of rows in table T1 for each distinct value of C1 as was specified in table T1_TEMP.  An interesting side-note, the Roman number 100 (C) is less than the Roman number 1 (I) – I guess that explains why computers do not natively use Roman numbers for calculations.  :-)

For the next step, we need to collapse the different C2 values for each of the unique C1 values into a single row.  Oracle Database 11.2.0.1 introduced the LISTAGG function that makes easy work of this task, as shown in this earlier blog article.

COLUMN C2_LISTING FORMAT A22

SELECT
  C1,
  LISTAGG(TO_CHAR(C2), ',') WITHIN GROUP (ORDER BY C2) C2_LISTING
FROM
  T1
GROUP BY
  C1
ORDER BY
  C1;

C1         C2_LISTING
---------- --------------------
C          1,2,3,4
I          1,2,3,4,5,6,7,8,9,10
II         1,2,3,4
III        1,2,3,4,5,6,7
IV         1,2,3,4,5,6,7,8,9
IX         1,2,3,4
...
XXXV       1,2,3,4,5,6,7,8,9
XXXVI      1,2,3,4,5
XXXVII     1,2,3
XXXVIII    1,2,3,4,5,6

100 rows selected. 

The question remains, how can I find all of the unique C1 values that have all of the same attributes as the C1 value “I” – in this case 1,2,3,4,5,6,7,8,9,10?  One method slides the above query into a WITH block and then the WITH block is referenced twice in the main query:

WITH MY_VIEW AS
(SELECT
  C1,
  LISTAGG(TO_CHAR(C2), ',') WITHIN GROUP (ORDER BY C2) C2_LISTING
FROM
  T1
GROUP BY
  C1)
SELECT
  V2.C1,
  V2.C2_LISTING
FROM
  MY_VIEW V1,
  MY_VIEW V2
WHERE
  V1.C1='I'
  AND V1.C1<>V2.C1
  AND V1.C2_LISTING=V2.C2_LISTING
ORDER BY
  V2.C1;

C1         C2_LISTING
---------- --------------------
LVII       1,2,3,4,5,6,7,8,9,10
LXXI       1,2,3,4,5,6,7,8,9,10
LXXIII     1,2,3,4,5,6,7,8,9,10
VI         1,2,3,4,5,6,7,8,9,10
X          1,2,3,4,5,6,7,8,9,10
XCIX       1,2,3,4,5,6,7,8,9,10
XV         1,2,3,4,5,6,7,8,9,10
XXIX       1,2,3,4,5,6,7,8,9,10
XXXI       1,2,3,4,5,6,7,8,9,10

9 rows selected. 

How else might you solve the problem posted by the OP in the Usenet thread?





Row Values to Comma Separated Lists, an Overly Complicated Use Case Example

26 05 2011

May 26, 2011

In a previous article I showed a simple method to convert values found in multiple table rows into a comma delimited list.  The method works very well, until the situation in which the approach is to be used becomes a bit more complex.

Assume for a moment that the following table structure exists where the approach must be implemented:

In the above diagram, the PURCHASE_ORDER table is the parent table, the PURC_ORDER_LINE table is the child table (the line-level detail), and the PURC_LINE_DEL table is the grand-child table (the sub-line level detail).  A fairly simple setup, and I suspect not a too terribly uncommon setup.

Now the complexity.  Assume that an ERP package using the above tables generates a report using the tables’ data, and as the ERP package is flexible, it allows additional information to be added to the reports – essentially, it is possible to execute a query that retrieves a single row for every detail-level row in the report.  Great, except that in this particular report, the detail-level row level is at the PURC_ORDER_LINE table.  If there were 18 rows, for instance, in the PURC_LINE_DEL table (the schedule that indicates when the parts should be delivered) for the first order line in the PURC_ORDER_LINE table for a particular order ID that is found in the PURCHASE_ORDER table, then we have a bit of a problem.  Fortunately, this ERP package’s report was constructed such that the DESIRED_RECV_DATE and USER_ORDER_QTY columns are represented in two report variables with CRLF (carriage return – line feed) delimited lists, making it very easy to list the delivery dates that the parts are due, and the quantities that are due for each of those dates.  Fantastic, so where is the complexity?  What if the purchase order must be revised after the supplier starts sending the parts on the agreed upon delivery dates, such that the first 9 of the 18 delivery schedules for line 1 of a purchase order have been satisfied, 1 of the 18 delivery schedule lines has been partially satisfied, and the remaining 8 delivery schedule lines have dates that are in the future?  This might be helpful information to provide to the supplier when presenting the updated purchase order.  Unfortunately, the ERP package’s report does not deliver the USER_RECEIVED_QTY column as a CRLF delimited list, as it does for the DESIRED_RECV_DATE and USER_ORDER_QTY columns. 

A complex problem, or not?  Well, the ERP package does permit a SQL statement returning a single row to be executed per detail row in the report (at the PURC_ORDER_LINE table level), so we have a possible solution.  However, if the SQL statement is too complex (having inline views is one such complication), the complex portion of the SQL statement must be isolated in a statically defined view, and then that view should be referenced by the report.  A complex problem, or not?  Let’s try building a solution starting in SQL*Plus:

SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  USER_RECEIVED_QTY,
  ROW_NUMBER() OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
ORDER BY
  DEL_SCHED_LINE_NO;

PURC_ORDER_ID   PURC_ORDER_LINE_NO USER_RECEIVED_QTY         RN
--------------- ------------------ ----------------- ----------
146825                           1                 1          1
146825                           1                 1          2
146825                           1                 2          3
146825                           1                 2          4
146825                           1                 2          5
146825                           1                 3          6
146825                           1                 3          7
146825                           1                 1          8
146825                           1                 3          9
146825                           1                 2         10
146825                           1                 0         11
146825                           1                 0         12
146825                           1                 0         13
146825                           1                 0         14
146825                           1                 0         15
146825                           1                 0         16
146825                           1                 0         17
146825                           1                 0         18 

Simple so far, that line identified by RN 10 requests an order quantity of 3, but the supplier only delivered 2 of the 3 parts for that date.  Now let’s generate a comma separated list using the SYS_CONNECT_BY_PATH function:

SELECT
  SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),',') LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  USER_RECEIVED_QTY,
  ROW_NUMBER() OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
ORDER BY
  DEL_SCHED_LINE_NO) V
CONNECT BY PRIOR
  RN = RN-1
START WITH
  RN = 1;

LISTING
------------------------------------

,1
,1,1
,1,1,2
,1,1,2,2
,1,1,2,2,2
,1,1,2,2,2,3
,1,1,2,2,2,3,3
,1,1,2,2,2,3,3,1
,1,1,2,2,2,3,3,1,3
,1,1,2,2,2,3,3,1,3,2
,1,1,2,2,2,3,3,1,3,2,0
,1,1,2,2,2,3,3,1,3,2,0,0
,1,1,2,2,2,3,3,1,3,2,0,0,0
,1,1,2,2,2,3,3,1,3,2,0,0,0,0
,1,1,2,2,2,3,3,1,3,2,0,0,0,0,0
,1,1,2,2,2,3,3,1,3,2,0,0,0,0,0,0
,1,1,2,2,2,3,3,1,3,2,0,0,0,0,0,0,0
,1,1,2,2,2,3,3,1,3,2,0,0,0,0,0,0,0,0 

We only want the last row returned from the above result, and we need to change those commas to a CRLF character combination (after eliminating the initial comma):

SELECT
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  USER_RECEIVED_QTY,
  ROW_NUMBER() OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
ORDER BY
  DEL_SCHED_LINE_NO) V
CONNECT BY PRIOR
  RN = RN-1
START WITH
  RN = 1;

LISTING
-------

1
1
2
2
2
3
3
1
3
2
0
0
0
0
0
0
0
0

Looks like mission accomplished, all of the previously received quantities are returned in a single query row… except now we have a complicated query that must be isolated in a statically defined view (silly rules…).

Before we head down the path of isolating the complex portion of the query, I thought that it might be entertaining to examine a couple of execution plan to see what happens when we isolate the complicated portion of the query in a statically defined view.  First, we will take a look at the last of the above SQL statements:

SET LINESIZE 140
SET TRIMSPOOL ON
SET PAGESIZE 1000
SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  USER_RECEIVED_QTY,
  ROW_NUMBER() OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
ORDER BY
  DEL_SCHED_LINE_NO) V
CONNECT BY PRIOR
  RN = RN-1
START WITH
  RN = 1;

Plan hash value: 929864805

----------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |               |     1 |    26 |     4  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE                          |               |     1 |    26 |            |          |
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|               |       |       |            |          |
|   3 |    VIEW                                  |               |     4 |   104 |     3   (0)| 00:00:01 |
|   4 |     WINDOW NOSORT                        |               |     4 |    76 |     3   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID         | PURC_LINE_DEL |     4 |    76 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN                   | SYS_C0021790  |     4 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("RN"-1=PRIOR "RN")
       filter("RN"=1)
   6 - access("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1) 

The above looks simple, and the query really does execute quickly.  For the above complex portion of the query to work in a statically defined view, we need to relocate the WHERE clause outside of the inline view, so that when the report runs, it is able to pass in the appropriate values for the PURC_ORDER_ID and PURC_ORDER_LINE_NO columns.  Here is the execution plan (generated on Oracle Database 11.2.0.2) when the WHERE clause is relocated:

SELECT
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  USER_RECEIVED_QTY,
  ROW_NUMBER() OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
ORDER BY
  DEL_SCHED_LINE_NO) V
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
CONNECT BY PRIOR
  RN = RN-1
START WITH
  RN = 1;

Plan hash value: 2107147172

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |     1 |    48 |       |  1193  (50)| 00:00:05 |
|   1 |  SORT AGGREGATE                           |               |     1 |    48 |       |            |          |
|*  2 |   FILTER                                  |               |       |       |       |            |          |
|*  3 |    CONNECT BY NO FILTERING WITH START-WITH|               |       |       |       |            |          |
|   4 |     VIEW                                  |               | 53851 |  2524K|       |   625   (3)| 00:00:03 |
|   5 |      SORT ORDER BY                        |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   6 |       WINDOW SORT                         |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   7 |        TABLE ACCESS FULL                  | PURC_LINE_DEL | 53851 |   999K|       |    57   (8)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1)
   3 - access("RN"-1=PRIOR "RN")
       filter("RN"=1) 

The above looks to be reasonably sane, although the predicted amount of time jumped from about 1 second to about 5 seconds.  The Predicate Information section of the execution plan indicates that the filter for the specified PURC_ORDER_ID and PURC_ORDER_LINE_NO was not pushed into the inline view as I had hoped.  The actual execution time for this simple query is much longer than 5 seconds, spending 100% of its time burning CPU time (I killed the query after waiting several minutes).  No problem, I will just use a hint to force the optimizer to push those predicates into the inline view:

SELECT /*+ PUSH_PRED(V) */
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  USER_RECEIVED_QTY,
  ROW_NUMBER() OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
ORDER BY
  DEL_SCHED_LINE_NO) V
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
CONNECT BY PRIOR
  RN = RN-1
START WITH
  RN = 1;

Plan hash value: 2107147172

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |     1 |    48 |       |  1193  (50)| 00:00:05 |
|   1 |  SORT AGGREGATE                           |               |     1 |    48 |       |            |          |
|*  2 |   FILTER                                  |               |       |       |       |            |          |
|*  3 |    CONNECT BY NO FILTERING WITH START-WITH|               |       |       |       |            |          |
|   4 |     VIEW                                  |               | 53851 |  2524K|       |   625   (3)| 00:00:03 |
|   5 |      SORT ORDER BY                        |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   6 |       WINDOW SORT                         |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   7 |        TABLE ACCESS FULL                  | PURC_LINE_DEL | 53851 |   999K|       |    57   (8)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1)
   3 - access("RN"-1=PRIOR "RN")
       filter("RN"=1) 

Well, the above is disappointing.  I could dig through a 10053 trace to try to determine why the hint was “ignored“, or I could try another query that produces an equivalent result.  I choose the latter.  Let’s try using the COUNT analytic function to replace the ROW_NUMBER analytic function:

SELECT
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  DEL_SCHED_LINE_NO,
  USER_RECEIVED_QTY,
  COUNT(*) OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
ORDER BY
  DEL_SCHED_LINE_NO) V
CONNECT BY PRIOR
  RN=RN-1
START WITH
  RN=1;

Plan hash value: 617003335

----------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |               |     1 |    26 |     4  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE                          |               |     1 |    26 |            |          |
|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|               |       |       |            |          |
|   3 |    VIEW                                  |               |     4 |   104 |     3   (0)| 00:00:01 |
|   4 |     WINDOW BUFFER                        |               |     4 |    76 |     3   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID         | PURC_LINE_DEL |     4 |    76 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN                   | SYS_C0021790  |     4 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("RN"-1=PRIOR "RN")
       filter("RN"=1)
   6 - access("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1) 

The execution plan appears to be reasonable – the predicates on the PURC_ORDER_ID and PURC_ORDER_LINE_NO columns are applied early, so the query executes quickly.  Now to move the WHERE clause:

SELECT
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  DEL_SCHED_LINE_NO,
  USER_RECEIVED_QTY,
  COUNT(*) OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
ORDER BY
  DEL_SCHED_LINE_NO) V
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
CONNECT BY PRIOR
  RN=RN-1
START WITH
  RN=1;

Plan hash value: 2107147172

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |     1 |    48 |       |  1193  (50)| 00:00:05 |
|   1 |  SORT AGGREGATE                           |               |     1 |    48 |       |            |          |
|*  2 |   FILTER                                  |               |       |       |       |            |          |
|*  3 |    CONNECT BY NO FILTERING WITH START-WITH|               |       |       |       |            |          |
|   4 |     VIEW                                  |               | 53851 |  2524K|       |   625   (3)| 00:00:03 |
|   5 |      SORT ORDER BY                        |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   6 |       WINDOW SORT                         |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   7 |        TABLE ACCESS FULL                  | PURC_LINE_DEL | 53851 |   999K|       |    57   (8)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1)
   3 - access("RN"-1=PRIOR "RN")
       filter("RN"=1) 

As had happened before, the predicates for the PURC_ORDER_ID and PURC_ORDER_LINE_NO columns were not pushed into the inline view, so the query executes extremely slow.  Let’s try the PUSH_PRED hint again:

SELECT /*+ PUSH_PRED(V) */
  MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(TO_CHAR(USER_RECEIVED_QTY),','),2),',',CHR(13)||CHR(10))) LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  DEL_SCHED_LINE_NO,
  USER_RECEIVED_QTY,
  COUNT(*) OVER (PARTITION BY PURC_ORDER_ID,PURC_ORDER_LINE_NO ORDER BY DEL_SCHED_LINE_NO) RN
FROM
  PURC_LINE_DEL
ORDER BY
  DEL_SCHED_LINE_NO) V
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
CONNECT BY PRIOR
  RN=RN-1
START WITH
  RN=1;

Plan hash value: 2107147172

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |     1 |    48 |       |  1193  (50)| 00:00:05 |
|   1 |  SORT AGGREGATE                           |               |     1 |    48 |       |            |          |
|*  2 |   FILTER                                  |               |       |       |       |            |          |
|*  3 |    CONNECT BY NO FILTERING WITH START-WITH|               |       |       |       |            |          |
|   4 |     VIEW                                  |               | 53851 |  2524K|       |   625   (3)| 00:00:03 |
|   5 |      SORT ORDER BY                        |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   6 |       WINDOW SORT                         |               | 53851 |   999K|  1704K|   625   (3)| 00:00:03 |
|   7 |        TABLE ACCESS FULL                  | PURC_LINE_DEL | 53851 |   999K|       |    57   (8)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1)
   3 - access("RN"-1=PRIOR "RN")
       filter("RN"=1) 

The execution plan is unchanged.  Out of luck…

I remembered reading about a new analytic function introduced in Oracle Database 11.2.0.1, what was that new function… LISTAGG?  I am not a fan of that function name, Oracle should have named the function __CREATE_A_DARN_COMMA_SEPARATED_LIST_QUICKLY (a double underscore function, making it a challenge to find the function).  The LISTAGG function is described in the Oracle documentation library for 11.2.  Let’s give the LISTAGG function a try (the SQL statement is also now formatting the USER_RECEIVED_QTY so that the column values consistently show 2 decimal places):

SELECT
  LISTING
FROM
(SELECT
  LISTAGG(TO_CHAR(USER_RECEIVED_QTY,'999,990.00'), CHR(13)||CHR(10)) WITHIN GROUP (ORDER BY DEL_SCHED_LINE_NO) LISTING
FROM
  PURC_LINE_DEL
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1
GROUP BY
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO);

Plan hash value: 3148992471

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |    19 |     3   (0)| 00:00:01 |
|   1 |  SORT GROUP BY               |               |     1 |    19 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL |     4 |    76 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0021790  |     4 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1) 

The calculated cost for the query dropped by 1 and the CPU component of that cost dropped to 0%, so that must be a good thing (the calculated cost can be misleading, just as we saw earlier that the estimated time can be misleading).  Without AUTOTRACE TRACEONLY EXPLAIN enabled, the query executes very quickly and produces output that looks similar to this:

LISTING
-----------

       1.00
       1.00
       2.00
       2.00
       2.00
       3.00
       3.00
       1.00
       3.00
       2.00
       0.00
       0.00
       0.00
       0.00
       0.00
       0.00
       0.00
       0.00
 

OK, let’s re-enable AUTOTRACE TRACEONLY EXPLAIN and try moving the WHERE clause so that we are able to package the complex portion of this query into a statically defined view:

SELECT
  LISTING
FROM
(SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  LISTAGG(TO_CHAR(USER_RECEIVED_QTY,'999,990.00'), CHR(13)||CHR(10)) WITHIN GROUP (ORDER BY DEL_SCHED_LINE_NO) LISTING
FROM
  PURC_LINE_DEL
GROUP BY
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO) V
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1;

Plan hash value: 3148992471

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |    19 |     3   (0)| 00:00:01 |
|   1 |  SORT GROUP BY               |               |     1 |    19 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL |     4 |    76 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0021790  |     4 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1) 

In the above, notice that the  predicates for the PURC_ORDER_ID and PURC_ORDER_LINE_NO columns were pushed into the inline view, or more correctly, the inline view was completely eliminated by the query optimizer (notice that there is no line in the execution plan showing VIEW as we saw in the previous examples above).

OK, so Oracle’s query optimizer worked as I had hoped with the inline view, but what about a statically defined view – would it also execute the SQL statement efficiently?  Let’s create the view:

CREATE VIEW TEST_LA AS
SELECT
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO,
  LISTAGG(TO_CHAR(USER_RECEIVED_QTY,'999,990.00'), CHR(13)||CHR(10)) WITHIN GROUP (ORDER BY DEL_SCHED_LINE_NO) LISTING
FROM
  PURC_LINE_DEL
GROUP BY
  PURC_ORDER_ID,
  PURC_ORDER_LINE_NO; 

Now let’s try the query with AUTOTRACE TRACEONLY EXPLAIN enabled:

SELECT
  LISTING
FROM
  TEST_LA
WHERE
  PURC_ORDER_ID='146825'
  AND PURC_ORDER_LINE_NO=1;

Plan hash value: 3148992471

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |    19 |     3   (0)| 00:00:01 |
|   1 |  SORT GROUP BY               |               |     1 |    19 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL |     4 |    76 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0021790  |     4 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PURC_ORDER_ID"='146825' AND "PURC_ORDER_LINE_NO"=1) 

The execution plan above shows the same cost (3) that we saw earlier, and once again the view was eliminated by Oracle’s query optimizer (as indicated by the absence of a VIEW line in the execution plan), allowing the query to execute quickly since the predicates would be applied early in the query execution.

So, Oracle Database 11.2.0.1 and above provide an efficient way to retrieve the information that we need in a comma (or CRLF) delimited list for the report.  When the SQL statement is added to the report (defined as an “extended query” in the ERP package), the SQL statement looks like the following (the reporting package does not permit line breaks in the SQL statement):

SELECT (LISTAGG(TO_CHAR(USER_RECEIVED_QTY,'999,990.00'), CHR(13)||CHR(10)) WITHIN GROUP (ORDER BY DEL_SCHED_LINE_NO)) FROM PURC_LINE_DEL WHERE PURC_ORDER_ID= : PO_ID AND PURC_ORDER_LINE_NO= :LN_LINE_NO GROUP BY PURC_ORDER_ID , PURC_ORDER_LINE_NO

Oddly enough, by enclosing the LISTAGG analytic funtion in the SQL statement with ( ), the reporting package did not require me to use a statically defined view for the complex portion of the query.  But even if the statically defined view was required, as shown above, the query would still have executed efficiently.





SELECT Statement is Fast, INSERT INTO Using the SELECT Statement is Brutally Slow 2

18 12 2010

December 18, 2010 (Modified December 19, 2010)

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

The previous blog article in this series resulted in several interesting comments with very useful advice.  But was the performance issue just an isolated case, one that only happens in one database, with one Oracle Database release version?  We need a test case to determine how frequently this problem may occur.  First, we will create two tables, table T1 will emulate table EDI830 from the previous blog article, and table T1_LINES will emulate the EDI830_LINES table.  A PADDING column will be used to make the average row length roughly equivalent to that of the original tables:

CREATE TABLE T1(
  CUSTPO VARCHAR2(12) NOT NULL,
  RELID VARCHAR2(12) NOT NULL,
  PADDING VARCHAR2(40),
  CONSTRAINT T1_PK PRIMARY KEY (CUSTPO,RELID));

CREATE TABLE T1_LINES(
  CUSTPO VARCHAR2(12) NOT NULL,
  RELID VARCHAR2(12) NOT NULL,
  FCSTQTY NUMBER NOT NULL,
  FC VARCHAR2(1),
  FCDUEDATE DATE NOT NULL,
  CONSTRAINT T1_LINES_PK PRIMARY KEY (CUSTPO,RELID,FCDUEDATE,FCSTQTY)); 

Next, the table T1 is populated in a repeatable way that as best as possible simulates how the original table was actually populated in production:

INSERT INTO T1
SELECT
  RPAD(CHR(MOD(ROWNUM-1,26)+65)||TO_CHAR(ROWNUM),12,'A') CUSTPO,
  LPAD('1',12,'0') RELID,
  LPAD('A',40,'A') PADDING
FROM
  DUAL
CONNECT BY
  LEVEL<=700;

INSERT INTO
  T1
SELECT
  CUSTPO,
  LPAD(TO_CHAR(V2.RN+1),12,'0') RELID,
  LPAD('A',40,'A') PADDING
FROM
  (SELECT /*+ NO_MERGE */
    CUSTPO,
    PADDING,
    MOD(ROWNUM,10)*30+19 RN
  FROM
    (SELECT /*+ NO_MERGE */
      CUSTPO,
      PADDING
    FROM
      T1
    ORDER BY
      CUSTPO)) V1,
  (SELECT /*+ NO_MERGE */
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=319) V2
WHERE
  V2.RN<=V1.RN; 

If we select from table T1, we are able to see how the row counts are distributed in an unequal, yet repeatable way, with 700 distinct customer purchase order numbers: 

SELECT
  CUSTPO,
  MIN(RELID),
  MAX(RELID)
FROM
  T1
GROUP BY
  CUSTPO
ORDER BY
  CUSTPO;

CUSTPO       MIN(RELID)   MAX(RELID)
------------ ------------ ------------
A105AAAAAAAA 000000000001 000000000050
A131AAAAAAAA 000000000001 000000000080
A157AAAAAAAA 000000000001 000000000110
A183AAAAAAAA 000000000001 000000000140
A1AAAAAAAAAA 000000000001 000000000170
A209AAAAAAAA 000000000001 000000000200
A235AAAAAAAA 000000000001 000000000230
A261AAAAAAAA 000000000001 000000000260
...

Z624AAAAAAAA 000000000001 000000000230
Z650AAAAAAAA 000000000001 000000000260
Z676AAAAAAAA 000000000001 000000000290
Z78AAAAAAAAA 000000000001 000000000020

700 rows selected. 

Just as a quick experiment, let’s take a look at the output of the SQL statement that will be used to populate the line detail table (T1_LINES) for the customer POs:

SELECT
  CUSTPO,
  RELID,
  10 FCSTQTY,
  'A' FC,
  TO_DATE('01-JAN-2011','DD-MON-YYYY') + (V2.RN*3 - V1.REL*2) FCDUEDATE
FROM
  (SELECT /*+ NO_MERGE */
    CUSTPO,
    RELID,
    ROW_NUMBER() OVER (PARTITION BY RELID ORDER BY CUSTPO) RN,
    ROW_NUMBER() OVER (PARTITION BY CUSTPO ORDER BY RELID DESC) REL
  FROM
    T1
  WHERE
    CUSTPO IN ('Z78AAAAAAAAA','Y155AAAAAAAA')) V1,
  (SELECT /*+ NO_MERGE */
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=100) V2
WHERE
  V2.RN < (MOD(V1.RN,100)+10)
ORDER BY
  CUSTPO,
  RELID,
  5;

CUSTPO       RELID           FCSTQTY F FCDUEDATE
------------ ------------ ---------- - ---------
Y155AAAAAAAA 000000000001         10 A 26-SEP-10
Y155AAAAAAAA 000000000001         10 A 29-SEP-10
Y155AAAAAAAA 000000000001         10 A 02-OCT-10
Y155AAAAAAAA 000000000001         10 A 05-OCT-10
Y155AAAAAAAA 000000000001         10 A 08-OCT-10
Y155AAAAAAAA 000000000001         10 A 11-OCT-10
Y155AAAAAAAA 000000000001         10 A 14-OCT-10
Y155AAAAAAAA 000000000001         10 A 17-OCT-10
Y155AAAAAAAA 000000000001         10 A 20-OCT-10
Y155AAAAAAAA 000000000001         10 A 23-OCT-10
Y155AAAAAAAA 000000000002         10 A 28-SEP-10
Y155AAAAAAAA 000000000002         10 A 01-OCT-10
...

Z78AAAAAAAAA 000000000020         10 A 17-JAN-11
Z78AAAAAAAAA 000000000020         10 A 20-JAN-11
Z78AAAAAAAAA 000000000020         10 A 23-JAN-11
Z78AAAAAAAAA 000000000020         10 A 26-JAN-11
Z78AAAAAAAAA 000000000020         10 A 29-JAN-11
Z78AAAAAAAAA 000000000020         10 A 01-FEB-11

720 rows selected.

In the above, you can see how the dates will automatically shift from one release of a customer PO to the next, as well as the order in which the rows will be inserted.  The actual dates and number of rows for these two customer POs will actually be a bit different when the rows are inserted into the table, so use the above as just an illustration.

Next, let’s insert 6,079,050 rows into the line detail table in a repeatable way:

INSERT INTO
  T1_LINES
SELECT
  CUSTPO,
  RELID,
  10 FCSTQTY,
  'A' FC,
  TO_DATE('01-JAN-2011','DD-MON-YYYY') + (V2.RN*3 - V1.REL*2) FCDUEDATE
FROM
  (SELECT /*+ NO_MERGE */
    CUSTPO,
    RELID,
    ROW_NUMBER() OVER (PARTITION BY RELID ORDER BY CUSTPO) RN,
    ROW_NUMBER() OVER (PARTITION BY CUSTPO ORDER BY RELID DESC) REL
  FROM
    T1) V1,
  (SELECT /*+ NO_MERGE */
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=100) V2
WHERE
  V2.RN < (MOD(V1.RN,100)+10)
ORDER BY
  RELID,
  CUSTPO,
  5; 

COMMIT;

Now let’s create table T2 (simulates table CUSTOMER_ORDER) and table T2_LINES (simulates table CUST_ORDER_LINE).  Table T2_LINES will be populated with 224,700 rows:

CREATE TABLE T2(
  ID VARCHAR2(15),
  CUSTOMER_ID VARCHAR2(15),
  CUSTOMER_PO_REF VARCHAR2(40),
  PADDING VARCHAR2(150),
  PRIMARY KEY(ID));

CREATE TABLE T2_LINES(
  CUST_ORDER_ID VARCHAR2(15),
  LINE_NO NUMBER,
  PART_ID VARCHAR2(30),
  DESIRED_SHIP_DATE DATE,
  PADDING VARCHAR2(100),
  PRIMARY KEY(CUST_ORDER_ID,LINE_NO)); 

INSERT INTO
  T2
SELECT DISTINCT
  CUSTPO ID,
  'AAAAAAAAAA' CUSTOMER_ID,
  CUSTPO CUSTOMER_PO_REF,
  RPAD('A',150,'A') PADDING
FROM
  T1
ORDER BY
  DBMS_RANDOM.VALUE;

INSERT INTO
  T2_LINES
SELECT
  CUST_ORDER_ID,
  ROW_NUMBER() OVER (PARTITION BY CUST_ORDER_ID ORDER BY DESIRED_SHIP_DATE) LINE_NO,
  RPAD(TO_CHAR((MOD(ROWNUM,3000)+1) * 33)||'PART',10,'A') PART_ID,
  DESIRED_SHIP_DATE,
  RPAD('A',100,'A') PADDING
FROM 
  (SELECT DISTINCT
    CUSTPO CUST_ORDER_ID,
    FCDUEDATE DESIRED_SHIP_DATE   
  FROM
    T1_LINES
  WHERE
    FCDUEDATE<= TO_DATE('15-JAN-2011','DD-MON-YYYY')) V1;

COMMIT;

Finally, we will gather statistics for the tables and the primary key indexes:

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1_LINES',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2_LINES',CASCADE=>TRUE) 

We are now able to start transforming the original query sections, these sections will be listed one at a time, as they were in the earlier article:

    (SELECT
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E 

 The inline view that uses the above inline view: 

  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN TO_DATE('01-JAN-2011','DD-MON-YYYY')-365 AND TO_DATE('01-JAN-2011','DD-MON-YYYY')+1200
    AND EL.FCSTQTY>0) E 

The inline view for the customer orders:

  (SELECT
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO 

Finally, the full SQL statement with all of the inline views joined:

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE; 

Next, we need a simple table to hold the contents of the INSERT INTO… SELECT statement:

CREATE TABLE T1_DEST (
  PART_ID VARCHAR2(30),
  REQUIRED_DATE DATE,
  QTY NUMBER,
  PLAN_LEVEL NUMBER); 

Now the script that will actually perform the test with different values for OPTIMIZER_FEATURES_ENABLE, OPTIMIZER_INDEX_COST_ADJ, and OPTIMIZER_INDEX_CACHING (Edit Dec 19, 2010: the original version of the script set the TRACEFILE_IDENTIFIER value identically for the SELECT and the INSERT INTO versions of the SQL statement when the OPTIMIZER_FEATURES_ENABLED parameter was greater than 10.1.0.4, and that error resulted in the SELECT and the INSERT INTO execution plans appearing in the same trace file, rather than in separate trace files – the script below has been corrected):


SET TIMING ON
SET ARRAYSIZE 1000
SET AUTOTRACE TRACEONLY STATISTICS

SPOOL Capture_Results.txt

ALTER SESSION SET OPTIMIZER_INDEX_CACHING=0;
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='10.1.0.4';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_10.1.0.4';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_10.1.0.4';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='10.2.0.2';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_10.2.0.2';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_10.2.0.2';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='11.1.0.7';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_11.1.0.7';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_11.1.0.7';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='11.2.0.1';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_11.2.0.1';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_11.2.0.1';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+  */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

SPOOL Capture_Results80.txt

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_INDEX_CACHING=100;
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=80;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='10.1.0.4';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_10.1.0.4_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_10.1.0.4_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='10.2.0.2';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_10.2.0.2_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_10.2.0.2_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='11.1.0.7';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_11.1.0.7_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_11.1.0.7_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

TRUNCATE TABLE T1_DEST;

ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='11.2.0.1';
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'SELECT_11.2.0.1_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'INSERT_11.2.0.1_80';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

INSERT INTO T1_DEST
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT /*+ _80 */
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      T1 E
    GROUP BY
      E.CUSTPO) E,
    T1_LINES EL
  WHERE
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT /*+  */
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    T2 CO,
    T2_LINES COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;

ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT OFF';

SPOOL OFF

For this blog article, I decided to use the Enterprise Edition of Oracle Database 11.2.0.1, with the following manually set system (CPU) statistics (as printed in the 10053 trace files):

SYSTEM STATISTICS INFORMATION
-----------------------------
  Using WORKLOAD Stats
  CPUSPEED: 2664 millions instructions/sec
  SREADTIM: 8.000000 milliseconds
  MREADTIM: 10.000000 millisecons
  MBRC: 16 blocks
  MAXTHR: 19181568 bytes/sec
  SLAVETHR: -1 bytes/sec 

The tablespace used for the test case uses ASSM with auto-allocated extent sizes, and the following instance-wide parameters specified: 

SGA_TARGET =          12000M
PGA_AGGREGATE_TARGET = 2000M
OPTIMIZER_MODE =    ALL_ROWS

Just for fun, for now I will show you only the first two execution plans from the 10053 trace files:

SELECT_10.1.0.4 (OPTIMIZER_FEATURES_ENABLE=’10.1.0.4′, OPTIMIZER_INDEX_CACHING=0, OPTIMIZER_INDEX_COST_ADJ=100), completed in 0.23 seconds when selecting 10,478 rows:

--------------------------------------------------+-----------------------------------+
| Id  | Operation                    | Name       | Rows  | Bytes | Cost  | Time      |
--------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT             |            |       |       |  1425 |           |
| 1   |  SORT GROUP BY               |            |   487 |   46K |  1425 |  00:00:12 |
| 2   |   FILTER                     |            |       |       |       |           |
| 3   |    HASH JOIN RIGHT OUTER     |            |   19K | 1884K |  1423 |  00:00:12 |
| 4   |     VIEW                     |            |   700 |   27K |   368 |  00:00:03 |
| 5   |      SORT GROUP BY           |            |   700 |   40K |   368 |  00:00:03 |
| 6   |       HASH JOIN              |            |  146K | 8475K |   362 |  00:00:03 |
| 7   |        TABLE ACCESS FULL     | T2         |   700 |   18K |     3 |  00:00:01 |
| 8   |        TABLE ACCESS FULL     | T2_LINES   |  146K | 4676K |   358 |  00:00:03 |
| 9   |     NESTED LOOPS             |            |   19K | 1126K |  1055 |  00:00:09 |
| 10  |      VIEW                    |            |   700 |   14K |    65 |  00:00:01 |
| 11  |       SORT GROUP BY          |            |   700 |   18K |    65 |  00:00:01 |
| 12  |        FILTER                |            |       |       |       |           |
| 13  |         INDEX FAST FULL SCAN | T1_PK      |  106K | 2755K |    60 |  00:00:01 |
| 14  |      INDEX RANGE SCAN        | T1_LINES_PK|    28 |  1036 |     3 |  00:00:01 |
--------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - filter(("EL"."FCDUEDATE">COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365) AND COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365)<SYSDATE@!+1200))
3 - access("E"."CUSTPO"="CO"."CUSTOMER_PO_REF")
6 - access("CO"."ID"="COL"."CUST_ORDER_ID")
7 - filter("CUSTOMER_PO_REF" IS NOT NULL)
8 - filter("COL"."DESIRED_SHIP_DATE">=SYSDATE@!-365)
12 - filter(SYSDATE@!-365<=SYSDATE@!+1200)
14 - access("E"."CUSTPO"="EL"."CUSTPO" AND "E"."RELID"="EL"."RELID" AND "EL"."FCDUEDATE">=SYSDATE@!-365 AND "EL"."FCSTQTY">0 AND "EL"."FCDUEDATE"<=SYSDATE@!+1200)
14 - filter("EL"."FCSTQTY">0) 

INSERT_10.1.0.4 (OPTIMIZER_FEATURES_ENABLE=’10.1.0.4′, OPTIMIZER_INDEX_CACHING=0, OPTIMIZER_INDEX_COST_ADJ=100), completed in… 30 minutes, 18.01 seconds when inserting 10,478 rows:

-------------------------------------------------+-----------------------------------+
| Id  | Operation                      | Name    | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------+-----------------------------------+
| 0   | INSERT STATEMENT               |         |       |       | 4967K |           |
| 1   |  LOAD TABLE CONVENTIONAL       |         |       |       |       |           |
| 2   |   SORT GROUP BY                |         |   842 |   60K | 4967K |  11:18:11 |
| 3   |    FILTER                      |         |       |       |       |           |
| 4   |     HASH JOIN OUTER            |         |   842 |   60K | 4967K |  11:18:11 |
| 5   |      VIEW                      |         |   842 |   28K | 4967K |  11:18:08 |
| 6   |       FILTER                   |         |       |       |       |           |
| 7   |        SORT GROUP BY           |         |   842 |   53K | 4967K |  11:18:08 |
| 8   |         FILTER                 |         |       |       |       |           |
| 9   |          HASH JOIN             |         |  852M |   54G |  7035 |  00:00:57 |
| 10  |           INDEX FAST FULL SCAN | T1_PK   |  106K | 2755K |    60 |  00:00:01 |
| 11  |           TABLE ACCESS FULL    | T1_LINES| 5631K |  214M |  3388 |  00:00:28 |
| 12  |      VIEW                      |         |   700 |   27K |   368 |  00:00:03 |
| 13  |       SORT GROUP BY            |         |   700 |   40K |   368 |  00:00:03 |
| 14  |        HASH JOIN               |         |  146K | 8475K |   362 |  00:00:03 |
| 15  |         TABLE ACCESS FULL      | T2      |   700 |   18K |     3 |  00:00:01 |
| 16  |         TABLE ACCESS FULL      | T2_LINES|  146K | 4676K |   358 |  00:00:03 |
-------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
3 - filter("E"."FCDUEDATE">COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365))
4 - access("E"."CUSTPO"="CO"."CUSTOMER_PO_REF")
6 - filter("EL"."RELID"=MAX("RELID"))
8 - filter(SYSDATE@!-365<=SYSDATE@!+1200)
9 - access("E"."CUSTPO"="EL"."CUSTPO")
11 - filter(("EL"."FCDUEDATE">=SYSDATE@!-365 AND "EL"."FCDUEDATE"<=SYSDATE@!+1200 AND "EL"."FCSTQTY">0))
14 - access("CO"."ID"="COL"."CUST_ORDER_ID")
15 - filter("CUSTOMER_PO_REF" IS NOT NULL)
16 - filter("COL"."DESIRED_SHIP_DATE">=SYSDATE@!-365) 

——————-

Just one more example of the value of test cases.  By the way, make certain that the air conditioning is working well in the room where your test server is located – your test server could be swamped for a couple of hours while the table creation scripts run and the actual test script is executed.  So, what is ahead in part three of this blog article series?

(P.S. Did anyone’s test result for the first INSERT INTO statement complete in less than 30 minutes?)





Graphical Work Center Utilization – Creating the Demo Data and Active Server Page

1 09 2010

September 1, 2010

Today’s blog article provides a graphical view of production work areas on a factory floor, providing feedback to indicate when the production area is in use.  The blog article includes a classic ASP web page which uses VBScript to write the web page on the fly to the browser on the client computer.  The web page written to the client browser automatically refreshes itself every 30 seconds, automatically advancing the view time by 15 minutes, displaying the production work areas that were in use during the new view time (the time displayed at the top of the page).

The ASP web page is the simple part of today’s blog article (although enabling classic ASP support may be a little challenging), while creating the demo data is actually the challenging portion of the article.  First, we need a table that will define the production work areas and the locations of those areas on the factory floor:

CREATE TABLE RESOURCE_LOCATION_DEMO (
  RESOURCE_ID VARCHAR2(15),
  DESCRIPTION VARCHAR2(30),
  LOCATION_LEFT NUMBER(12,4),
  LOCATION_TOP NUMBER(12,4),
  LOCATION_WIDTH NUMBER(12,4),
  LOCATION_HEIGHT NUMBER(12,4),
  PRIMARY KEY (RESOURCE_ID));

To keep things interesting, I do not want to just place the first production work area next to the second, the second next to the third, etc.  Instead, I want to randomly locate the production work areas on the factory floor, making certain that no two work areas overlap.  We can accomplish this by creating a list of numbers and ordering the numbers in a random sequence, like this:

SELECT
  ROWNUM RN
FROM
  DUAL
CONNECT BY
  LEVEL<=200
ORDER BY
  DBMS_RANDOM.VALUE;

  RN
----
 191
 165
 122
  12
  48
  27
 104
...
 198
 168
 150

200 rows selected.

Now, to locate the production work areas, imagine that we permitted 10 work areas horizontally (along the X axis) across the factory floor.  We can use the above number sequence along with the MOD function to determine the horizontal location of the work areas, and the FLOOR function to determine the vertical location of the work areas (note that each time we run this SQL statement will we receive different results):

SELECT
  'MA'||TO_CHAR(ROWNUM) RESOURCE_ID,
  'Shop Floor Machine '||TO_CHAR(ROWNUM) DESCRIPTION,
  MOD(RN,10) BOX_LEFT,
  FLOOR(RN/10) BOX_TOP
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=200
  ORDER BY
    DBMS_RANDOM.VALUE)
WHERE
  ROWNUM<=50;

RESOURCE_ID DESCRIPTION                 BOX_LEFT    BOX_TOP
----------- ------------------------- ---------- ----------
MA1         Shop Floor Machine 1               4         14
MA2         Shop Floor Machine 2               9          6
MA3         Shop Floor Machine 3               5          2
MA4         Shop Floor Machine 4               5          9
MA5         Shop Floor Machine 5               7         18
MA6         Shop Floor Machine 6               9          4
MA7         Shop Floor Machine 7               0          8
MA8         Shop Floor Machine 8               6          6
MA9         Shop Floor Machine 9               5          5
MA10        Shop Floor Machine 10              7         15
...
MA49        Shop Floor Machine 49              2         11
MA50        Shop Floor Machine 50              8          7

It would be too boring to assume that each of the production work areas is exactly the same width and height, so we will add a little more randomization.  Additionally, I want each production area to be up to 1.5 units wide and up to 1.0 units tall, both offset 0.25 units from the top-left (we are dealing with screen coordinates here, where positive Y is the same as mathematical -Y).  While there are up to 200 locations on the factory floor for work areas, we will only define 50 work areas (controlled by the ROWNUM<=50 predicate at the end of the SQL statement):

SELECT
  'MA'||TO_CHAR(ROWNUM) RESOURCE_ID,
  'Shop Floor Machine '||TO_CHAR(ROWNUM) DESCRIPTION,
  (MOD(RN,10))*1.5 + 0.25 LOCATION_LEFT,
  (FLOOR(RN/10))*1.0 + 0.25 LOCATION_TOP,
  ROUND(1.5*DBMS_RANDOM.VALUE,4) LOCATION_WIDTH,
  ROUND(1.0*DBMS_RANDOM.VALUE,4) LOCATION_HEIGHT
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=200
  ORDER BY
    DBMS_RANDOM.VALUE)
WHERE
  ROWNUM<=50;

RESOURCE_ID DESCRIPTION               LOCATION_LEFT LOCATION_TOP LOCATION_WIDTH LOCATION_HEIGHT
----------- ------------------------- ------------- ------------ -------------- ---------------
MA1         Shop Floor Machine 1               3.25        18.25         1.2386           .7948
MA2         Shop Floor Machine 2               4.75        11.25          .6078           .9578
MA3         Shop Floor Machine 3               1.75        12.25          .5318            .457
MA4         Shop Floor Machine 4               3.25        13.25         1.2908           .9813
MA5         Shop Floor Machine 5                .25        16.25          .3245           .4644
MA6         Shop Floor Machine 6              12.25        15.25           .239           .3822
MA7         Shop Floor Machine 7               1.75        18.25          .0159           .8868
MA8         Shop Floor Machine 8               1.75        16.25          .3948           .8511
MA9         Shop Floor Machine 9              12.25         6.25          .4856           .3356
MA10        Shop Floor Machine 10             13.75        11.25         1.2619           .6124
...
MA49        Shop Floor Machine 49              7.75        16.25          .6664           .6938
MA50        Shop Floor Machine 50              9.25        15.25         1.3449           .6606

Now that we have tested the results, let’s insert a new set of similar random values into the RESOURCE_LOCATION_DEMO table, and display some of the inserted rows:

INSERT INTO
  RESOURCE_LOCATION_DEMO 
SELECT
  'MA'||TO_CHAR(ROWNUM) RESOURCE_ID,
  'Shop Floor Machine '||TO_CHAR(ROWNUM) DESCRIPTION,
  (MOD(RN,10))*1.5 + 0.25 LOCATION_LEFT,
  (FLOOR(RN/10))*1.0 + 0.25 LOCATION_TOP,
  ROUND(1.5*DBMS_RANDOM.VALUE,4) LOCATION_WIDTH,
  ROUND(1.0*DBMS_RANDOM.VALUE,4) LOCATION_HEIGHT
FROM
  (SELECT
    ROWNUM RN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=200
  ORDER BY
    DBMS_RANDOM.VALUE)
WHERE
  ROWNUM<=50;

COMMIT;

SELECT
  *
FROM
  RESOURCE_LOCATION_DEMO;

RESOURCE_ID DESCRIPTION               LOCATION_LEFT LOCATION_TOP LOCATION_WIDTH LOCATION_HEIGHT
----------- ------------------------- ------------- ------------ -------------- ---------------
MA1         Shop Floor Machine 1              10.75        13.25           .104           .2165
MA2         Shop Floor Machine 2               7.75        18.25         1.2291            .478
MA3         Shop Floor Machine 3               9.25        16.25          .3431           .4364
MA4         Shop Floor Machine 4               1.75        15.25          .3665           .7278
MA5         Shop Floor Machine 5               4.75        15.25          .6842           .4507
MA6         Shop Floor Machine 6               4.75        18.25          .1384           .6434
MA7         Shop Floor Machine 7               4.75         7.25          .7448           .2178
MA8         Shop Floor Machine 8               7.75          .25          .3756            .499
MA9         Shop Floor Machine 9               1.75        18.25         1.0155           .0769
MA10        Shop Floor Machine 10              7.75        13.25         1.1892           .7518
...
MA49        Shop Floor Machine 49              6.25         3.25           .278           .6513
MA50        Shop Floor Machine 50               .25        15.25          .5216           .9607

To translate the above storage units (maybe in inch scale) into screen units we will multiply the storage units by 96 (96 dots per inch) and then multiply by the zoom percent (75% = 0.75).

SELECT
  RESOURCE_ID,
  ROUND(LOCATION_LEFT*96 *0.75) LOCATION_LEFT,
  ROUND(LOCATION_TOP*96 *0.75) LOCATION_TOP,
  ROUND(LOCATION_WIDTH*96 *0.75) LOCATION_WIDTH,
  ROUND(LOCATION_HEIGHT*96 *0.75) LOCATION_HEIGHT
FROM
  RESOURCE_LOCATION_DEMO;

RESOURCE_ID LOCATION_LEFT LOCATION_TOP LOCATION_WIDTH LOCATION_HEIGHT
----------- ------------- ------------ -------------- ---------------
MA1                   774          954              7              16
MA2                   558         1314             88              34
MA3                   666         1170             25              31
MA4                   126         1098             26              52
MA5                   342         1098             49              32
MA6                   342         1314             10              46
MA7                   342          522             54              16
MA8                   558           18             27              36
MA9                   126         1314             73               6
MA10                  558          954             86              54
...

Next, we need a table to maintain the time periods in which each of the production work areas was in use, and by whom the work areas were used:

CREATE TABLE LABOR_TICKET_DEMO (
  TRANSACTION_ID NUMBER,
  RESOURCE_ID VARCHAR2(15),
  EMPLOYEE_ID VARCHAR2(15),
  CLOCK_IN DATE,
  CLOCK_OUT DATE,
  PRIMARY KEY (TRANSACTION_ID));

Let’s see if we are able to generate some random data for the table:

ALTER SESSION SET NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI';

SELECT
  ROWNUM TRANSACTION_ID,
  'MA'||TO_CHAR(ROUND(DBMS_RANDOM.VALUE(1,50))) RESOURCE_ID,
  'EMP'||TO_CHAR(ROUND(DBMS_RANDOM.VALUE(1,300))) EMPLOYEE_ID,
  TRUNC(SYSDATE)+DBMS_RANDOM.VALUE CLOCK_IN
FROM
  DUAL
CONNECT BY
  LEVEL<=10;

TRANSACTION_ID RESOURCE_ID EMPLOYEE_ID CLOCK_IN
-------------- ----------- ----------- -----------------
             1 MA29        EMP50       01-SEP-2010 01:56
             2 MA35        EMP181      01-SEP-2010 10:06
             3 MA13        EMP172      01-SEP-2010 17:40
             4 MA21        EMP182      01-SEP-2010 09:00
             5 MA14        EMP80       01-SEP-2010 09:53
             6 MA4         EMP80       01-SEP-2010 19:04
             7 MA7         EMP110      01-SEP-2010 14:34
             8 MA45        EMP19       01-SEP-2010 22:05
             9 MA10        EMP207      01-SEP-2010 21:51
            10 MA46        EMP127      01-SEP-2010 16:49

That worked, but note that we did not generate a CLOCK_OUT time – we want to make certain that the CLOCK_OUT time is after the CLOCK_IN time, but we simply cannot do that with the above SQL statement as written.  We slide the above into an inline view and then set the CLOCK_OUT time to be up to 12 hours after the CLOCK_IN time (DBMS_RANDOM.VALUE by default returns a value between 0 and 1, so if we divide that value by 2, we can add up to 1/2 of a day to the CLOCK_IN date and time):

INSERT INTO
  LABOR_TICKET_DEMO
SELECT
  TRANSACTION_ID,
  RESOURCE_ID,
  EMPLOYEE_ID,
  CLOCK_IN,
  CLOCK_IN + (DBMS_RANDOM.VALUE/2) CLOCK_OUT
FROM
  (SELECT
    ROWNUM TRANSACTION_ID,
    'MA'||TO_CHAR(ROUND(DBMS_RANDOM.VALUE(1,50))) RESOURCE_ID,
    'EMP'||TO_CHAR(ROUND(DBMS_RANDOM.VALUE(1,300))) EMPLOYEE_ID,
    TRUNC(SYSDATE)+DBMS_RANDOM.VALUE CLOCK_IN
  FROM
    DUAL
  CONNECT BY
    LEVEL<=1000);

COMMIT;

Let’s take a look at what made it into the table (your results will be different):

SELECT
  *
FROM
  LABOR_TICKET_DEMO
ORDER BY
  TRANSACTION_ID; 

TRANSACTION_ID RESOURCE_ID     EMPLOYEE_ID     CLOCK_IN          CLOCK_OUT
-------------- --------------- --------------- ----------------- -----------------
             1 MA34            EMP49           01-SEP-2010 20:32 02-SEP-2010 08:18
             2 MA47            EMP230          01-SEP-2010 20:08 02-SEP-2010 03:06
             3 MA20            EMP257          01-SEP-2010 02:17 01-SEP-2010 05:44
             4 MA21            EMP129          01-SEP-2010 09:37 01-SEP-2010 15:41
             5 MA18            EMP214          01-SEP-2010 18:57 01-SEP-2010 20:57
             6 MA46            EMP173          01-SEP-2010 05:51 01-SEP-2010 15:32
             7 MA34            EMP224          01-SEP-2010 20:23 02-SEP-2010 08:17
             8 MA31            EMP8            01-SEP-2010 02:56 01-SEP-2010 14:02
             9 MA37            EMP178          01-SEP-2010 09:28 01-SEP-2010 16:03
            10 MA8             EMP31           01-SEP-2010 20:17 02-SEP-2010 05:51
...
           999 MA43            EMP138          01-SEP-2010 05:07 01-SEP-2010 05:23
          1000 MA2             EMP235          01-SEP-2010 05:29 01-SEP-2010 13:28

We now have 1000 transactions scattered across the 50 work areas (RESOURCE_ID column).  What we next want to determine is which of the work areas was in use at a specific time of the day.  Because we eventually will want only one row per unique RESOURCE_ID value, we will use the ROW_NUMBER analytic function to number each of the rows within each unique RESOURCE_ID value:

SELECT
  RESOURCE_ID,
  EMPLOYEE_ID,
  TRUNC(SYSDATE) + (1.25)/24 CHECK_TIME,
  CLOCK_IN,
  CLOCK_OUT,
  ROW_NUMBER() OVER (PARTITION BY RESOURCE_ID ORDER BY CLOCK_IN) RN
FROM
  LABOR_TICKET_DEMO
WHERE
  CLOCK_IN<=TRUNC(SYSDATE) + (1.25)/24
  AND CLOCK_OUT>TRUNC(SYSDATE) + (1.25)/24
ORDER BY
  RESOURCE_ID,
  CLOCK_IN;

RESOURCE_ID  EMPLOYEE_ID  CHECK_TIME        CLOCK_IN          CLOCK_OUT         RN
------------ ------------ ----------------- ----------------- ----------------- --
MA10         EMP192       01-SEP-2010 01:15 01-SEP-2010 00:21 01-SEP-2010 05:44  1
MA10         EMP233       01-SEP-2010 01:15 01-SEP-2010 00:23 01-SEP-2010 02:42  2
MA16         EMP114       01-SEP-2010 01:15 01-SEP-2010 00:21 01-SEP-2010 04:48  1
MA18         EMP261       01-SEP-2010 01:15 01-SEP-2010 00:18 01-SEP-2010 07:02  1
MA18         EMP133       01-SEP-2010 01:15 01-SEP-2010 00:32 01-SEP-2010 04:35  2
MA2          EMP62        01-SEP-2010 01:15 01-SEP-2010 01:14 01-SEP-2010 12:03  1
MA21         EMP235       01-SEP-2010 01:15 01-SEP-2010 00:05 01-SEP-2010 10:42  1
MA22         EMP4         01-SEP-2010 01:15 01-SEP-2010 00:01 01-SEP-2010 06:27  1
MA22         EMP300       01-SEP-2010 01:15 01-SEP-2010 01:12 01-SEP-2010 11:50  2
MA23         EMP135       01-SEP-2010 01:15 01-SEP-2010 00:35 01-SEP-2010 05:19  1
MA25         EMP35        01-SEP-2010 01:15 01-SEP-2010 00:20 01-SEP-2010 06:58  1
MA28         EMP298       01-SEP-2010 01:15 01-SEP-2010 00:52 01-SEP-2010 06:27  1
MA30         EMP72        01-SEP-2010 01:15 01-SEP-2010 00:56 01-SEP-2010 07:28  1
MA32         EMP84        01-SEP-2010 01:15 01-SEP-2010 01:00 01-SEP-2010 05:25  1
MA34         EMP299       01-SEP-2010 01:15 01-SEP-2010 00:31 01-SEP-2010 12:04  1
MA38         EMP268       01-SEP-2010 01:15 01-SEP-2010 00:31 01-SEP-2010 04:15  1
MA38         EMP278       01-SEP-2010 01:15 01-SEP-2010 00:32 01-SEP-2010 04:50  2
MA38         EMP176       01-SEP-2010 01:15 01-SEP-2010 01:01 01-SEP-2010 04:01  3
MA4          EMP257       01-SEP-2010 01:15 01-SEP-2010 00:10 01-SEP-2010 10:45  1
MA40         EMP231       01-SEP-2010 01:15 01-SEP-2010 00:58 01-SEP-2010 11:01  1
MA43         EMP65        01-SEP-2010 01:15 01-SEP-2010 00:54 01-SEP-2010 09:29  1
MA44         EMP18        01-SEP-2010 01:15 01-SEP-2010 00:07 01-SEP-2010 03:30  1
MA46         EMP36        01-SEP-2010 01:15 01-SEP-2010 00:40 01-SEP-2010 04:57  1
MA48         EMP61        01-SEP-2010 01:15 01-SEP-2010 00:27 01-SEP-2010 10:20  1
MA48         EMP169       01-SEP-2010 01:15 01-SEP-2010 00:44 01-SEP-2010 01:27  2
MA5          EMP147       01-SEP-2010 01:15 01-SEP-2010 00:02 01-SEP-2010 04:48  1
MA6          EMP132       01-SEP-2010 01:15 01-SEP-2010 00:34 01-SEP-2010 09:42  1

27 rows selected.

In some cases we have up to three employees working in a work area at 1:15AM (the time of day is indicated by the 1.25 value in the SQL statement).  Now, lets eliminate the duplicate rows, leaving just the rows where the calculated RN column is equal to 1.  We will join the above SQL statement in an inline view to the RESOURCE_LOCATION_DEMO table and convert the production work area coordinates to screen coordinates, in this case 96 pixels per unit (inches) at a 75% zoom percent (we made this same screen coordinate conversion in a previous SQL statement above):

SELECT
  RL.RESOURCE_ID,
  RL.DESCRIPTION,
  ROUND(RL.LOCATION_LEFT*96 *0.75) LOCATION_LEFT,
  ROUND(RL.LOCATION_TOP*96 *0.75) LOCATION_TOP,
  ROUND(RL.LOCATION_WIDTH*96 *0.75) LOCATION_WIDTH,
  ROUND(RL.LOCATION_HEIGHT*96 *0.75) LOCATION_HEIGHT,
  LT.EMPLOYEE_ID,
  LT.CLOCK_IN,
  LT.CLOCK_OUT
FROM
  RESOURCE_LOCATION_DEMO RL,
  (SELECT
    RESOURCE_ID,
    EMPLOYEE_ID,
    CLOCK_IN,
    CLOCK_OUT,
    ROW_NUMBER() OVER (PARTITION BY RESOURCE_ID ORDER BY CLOCK_IN) RN
  FROM
    LABOR_TICKET_DEMO
  WHERE
    CLOCK_IN<=TRUNC(SYSDATE) + (1.25)/24
    AND CLOCK_OUT>TRUNC(SYSDATE) + (1.25)/24) LT
WHERE
  RL.RESOURCE_ID=LT.RESOURCE_ID(+)
  AND NVL(LT.RN,1)=1
ORDER BY
  RL.RESOURCE_ID;

RESOURCE_ID  DESCRIPTION                    LOCATION_LEFT LOCATION_TOP LOCATION_WIDTH LOCATION_HEIGHT EMPLOYEE_ID  CLOCK_IN          CLOCK_OUT
------------ ------------------------------ ------------- ------------ -------------- --------------- ------------ ----------------- -----------------
MA1          Shop Floor Machine 1                     774          954              7              16
MA10         Shop Floor Machine 10                    558          954             86              54 EMP192       01-SEP-2010 00:21 01-SEP-2010 05:44
MA11         Shop Floor Machine 11                    882         1098             29               1
MA12         Shop Floor Machine 12                    234          378             51              51
MA13         Shop Floor Machine 13                    882         1314             83              62
MA14         Shop Floor Machine 14                    558          378             38              61
MA15         Shop Floor Machine 15                    774          522             63              64
MA16         Shop Floor Machine 16                    126          666            103              55 EMP114       01-SEP-2010 00:21 01-SEP-2010 04:48
MA17         Shop Floor Machine 17                    558          234             94              30
MA18         Shop Floor Machine 18                    342          450             85              21 EMP261       01-SEP-2010 00:18 01-SEP-2010 07:02
MA19         Shop Floor Machine 19                    342          666             94              20
MA2          Shop Floor Machine 2                     558         1314             88              34 EMP62        01-SEP-2010 01:14 01-SEP-2010 12:03
MA20         Shop Floor Machine 20                    666          162             33              33
MA21         Shop Floor Machine 21                    774          306             66              22 EMP235       01-SEP-2010 00:05 01-SEP-2010 10:42
MA22         Shop Floor Machine 22                    990          378             78              71 EMP4         01-SEP-2010 00:01 01-SEP-2010 06:27
MA23         Shop Floor Machine 23                    666          666             50              37 EMP135       01-SEP-2010 00:35 01-SEP-2010 05:19
MA24         Shop Floor Machine 24                    990          810            107              45
...

From the above output, we know the screen coordinates of each production work area (RESOURCE_ID) and the first employee to use the production work area (and the employee was still using it) at 1:15AM.

For the next portion of this blog article, the portion that requires an ASP enabled web server, we need a couple of pictures (these were created using Microsoft Power Point):

Representing a production work center that is in use:

Representing a production work center that is idle:

The factory floor – the background area:

——

Now we need the classic ASP page code – note that the code syntax is very similar to that of the previous VBScript examples.  The script uses Response.Write to write information to the client computer’s web browser, and an embedded Java script to call the post event of the embedded HTML form to update the display as of time every 30 seconds (yes, I should have used bind variables in the SQL statement, but that would have required an extra 120 seconds to code and would have left you with nothing to improve):

<html>

<head>
<meta http-equiv="refresh" content="600">
<title>Graphical Work Center Utilization Animated</title>
</head>

<body>
    <%
    Dim strSQL

    Dim sglOriginLeft
    Dim sglOriginTop

    Dim sglZoom
    Dim strZoom

    Dim i
    Dim intWidth
    Dim intHeight
    Dim strOffset
    Dim sglOffset
    Dim varDateTime

    Dim snpData
    Dim dbDatabase

    Set dbDatabase = Server.CreateObject("ADODB.Connection")

    'Database configuration
    strUsername = "MyUsername"
    strPassword = "MyPassword"
    strDatabase = "MyDB"

    On Error Resume Next

    dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";"
    dbDatabase.Open

    'Should verify that the connection attempt was successful, but I will leave that for someone else to code

    Set snpData = Server.CreateObject("ADODB.Recordset")

    'Retrieve the last value for the time offset and the selected zoom percent
    strOffset = Request.Form("txtOffset")
    strZoom = Request.Form("cboZoom")

    'Convert back to a number
    sglOffset = CSng(strOffset)
    sglZoom = CSng(strZoom) / 100

    'Advance to the next 0.25 hour
    if (sglOffset = 0) or (sglOffset = 24) then
        sglOffset = 0.25
    else
        sglOffset = sglOffset + 0.25
    end if

    'Set the zoom percent, if not already set
    If sglZoom = 0 Then
        sglZoom = 0.5 '50% zoom
    End If

    varDateTime = DateAdd("n", sglOffset*60, Date) 'Create a printable version of the view date and time

    Response.Write "<form name=" & chr(34) & "reportform" & chr(34) & " method=" & chr(34) & "POST" & chr(34) & " action=" & chr(34) & "GraphicalWorkCenterUtilization.asp" & chr(34) & ">" & vbcrlf
    Response.Write varDateTime & "  " & vbCrLf
    Response.Write "&nbsp;&nbsp;&nbsp;Zoom Percent <select size=""1"" id=""cboZoom"" name=""cboZoom"" style=""width:50"">" & vbCrLf
    For i = 10 to 200 Step 10
        If sglZoom = i / 100 Then
            Response.Write "<option selected value=""" & cStr(i) & """>" & cStr(i) & "</option>"
        Else
            Response.Write "<option value=""" & cStr(i) & """>" & cStr(i) & "</option>"
        End If
    Next
    Response.Write "</select><br>"
    Response.Write "  <input type=" & chr(34) & "submit" & chr(34) & " value=" & chr(34) & "Update View" & chr(34) & " name=" & chr(34) & "cmdViewReport" & chr(34) & ">" & vbcrlf
    Response.Write "  <input type=" & chr(34) & "hidden" & chr(34) & " name=" & chr(34) & "txtOffset" & chr(34) & " size=" & chr(34) & "5" & chr(34) & " value=" & chr(34) & cStr(sglOffset) & chr(34) & ">" & vbcrlf
    Response.Write "</form>" & vbcrlf

    'The background image
    intWidth = Round(16 * 96 * sglZoom)
    intHeight = Round(20 * 96 * sglZoom)
    Response.Write "<img src=" & chr(34) & "http://hoopercharles.files.wordpress.com/2010/09/graphicalworkcenterutilizationbackground.jpg" & chr(34) & " width=" & chr(34) & cstr(intWidth) & chr(34) & " height=" & chr(34) & cstr(intheight) & chr(34) & " style=" & chr(34) & "position:absolute;top:50px;left:0px;z-index:-1" & chr(34) & "/>" & vbcrlf

    'The SQL statement developed earlier
    strSQL = "SELECT" & VBCrLf
    strSQL = strSQL & "  RL.RESOURCE_ID," & VBCrLf
    strSQL = strSQL & "  RL.DESCRIPTION," & VBCrLf
    strSQL = strSQL & "  ROUND(RL.LOCATION_LEFT*96 *" & cStr(sglZoom) & ") LOCATION_LEFT," & VBCrLf
    strSQL = strSQL & "  ROUND(RL.LOCATION_TOP*96 *" & cStr(sglZoom) & ") LOCATION_TOP," & VBCrLf
    strSQL = strSQL & "  ROUND(RL.LOCATION_WIDTH*96 *" & cStr(sglZoom) & ") LOCATION_WIDTH," & VBCrLf
    strSQL = strSQL & "  ROUND(RL.LOCATION_HEIGHT*96 *" & cStr(sglZoom) & ") LOCATION_HEIGHT," & VBCrLf
    strSQL = strSQL & "  LT.EMPLOYEE_ID," & VBCrLf
    strSQL = strSQL & "  LT.CLOCK_IN," & VBCrLf
    strSQL = strSQL & "  LT.CLOCK_OUT" & VBCrLf
    strSQL = strSQL & "FROM" & VBCrLf
    strSQL = strSQL & "  RESOURCE_LOCATION_DEMO RL," & VBCrLf
    strSQL = strSQL & "  (SELECT" & VBCrLf
    strSQL = strSQL & "    RESOURCE_ID," & VBCrLf
    strSQL = strSQL & "    EMPLOYEE_ID," & VBCrLf
    strSQL = strSQL & "    CLOCK_IN," & VBCrLf
    strSQL = strSQL & "    CLOCK_OUT," & VBCrLf
    strSQL = strSQL & "    ROW_NUMBER() OVER (PARTITION BY RESOURCE_ID ORDER BY CLOCK_IN) RN" & VBCrLf
    strSQL = strSQL & "  FROM" & VBCrLf
    strSQL = strSQL & "    LABOR_TICKET_DEMO" & VBCrLf
    strSQL = strSQL & "  WHERE" & VBCrLf
    strSQL = strSQL & "    CLOCK_IN<=TRUNC(SYSDATE) + (" & cStr(sglOffset) & ")/24" & VBCrLf
    strSQL = strSQL & "    AND CLOCK_OUT>TRUNC(SYSDATE) + (" & cStr(sglOffset) & ")/24) LT" & VBCrLf
    strSQL = strSQL & "WHERE" & VBCrLf
    strSQL = strSQL & "  RL.RESOURCE_ID=LT.RESOURCE_ID(+)" & VBCrLf
    strSQL = strSQL & "  AND NVL(LT.RN,1)=1" & VBCrLf
    strSQL = strSQL & "ORDER BY" & VBCrLf
    strSQL = strSQL & "  RL.RESOURCE_ID"

    snpData.open strSQL, dbDatabase

    If snpData.State = 1 then
        Response.Write "<B><font color=" & chr(34) & "#0000FF" & chr(34) & "><p style=" & chr(34) & "position:absolute;top:15px;left:500px" & chr(34) & ">" & FormatDateTime(cdate(snpData("START_TIME")),2) & " " & FormatDateTime(cdate(snpData("START_TIME")),4) & " - " & FormatDateTime(cdate(snpData("END_TIME")),4) & "</p></font></b>" & vbcrlf

        Do While Not snpData.EOF
            If Not(IsNull(snpData("employee_id"))) Then
                'A labor ticket was in process during this time period
                Response.Write "<img alt=" & Chr(34) & snpData("resource_id") & "  " & snpData("description") & vbCrlf & snpData("employee_id") & "(" & snpData("clock_in") & " - " & snpData("clock_out") & ")" & Chr(34) & " src=" & chr(34) & "http://hoopercharles.files.wordpress.com/2010/09/graphicalworkcenterutilizationrunning.jpg" & chr(34) & " width=" & chr(34) & cStr(snpData("location_width")) & chr(34) & " height=" & chr(34) & cStr(snpData("location_height")) & chr(34) & " style=" & chr(34) & "position:absolute;top:" & cStr(cLng(snpData("location_top")) + 40) & "px;left:" & cStr(snpData("location_left")) & "px" & chr(34) & "/>" & vbcrlf
                'Write the title down 1 pixel
                Response.Write "<B><font color=" & chr(34) & "#00FFFF" & chr(34) & "><p style=" & chr(34) & "position:absolute;top:" & cStr(Round(41 + CSng(snpData("location_top")))) & "px;left:" & cStr(Round(CSng(snpData("location_left")))) & "px" & chr(34) & ">" & snpData("resource_id") & "</p></font></b>" & vbcrlf
            Else
                'No labor ticket was in process during this time period
                Response.Write "<img alt=" & Chr(34) & snpData("resource_id") & "  " & snpData("description") & Chr(34) & " src=" & chr(34) & "http://hoopercharles.files.wordpress.com/2010/09/graphicalworkcenterutilizationstopped.jpg" & chr(34) & " width=" & chr(34) & cStr(snpData("location_width")) & chr(34) & " height=" & chr(34) & cStr(snpData("location_height")) & chr(34) & " style=" & chr(34) & "position:absolute;top:" & cStr(cLng(snpData("location_top")) + 40) & "px;left:" & cStr(snpData("location_left")) & "px" & chr(34) & "/>" & vbcrlf
                'Write the title down 1 pixel
                Response.Write "<B><font color=" & chr(34) & "#FF0000" & chr(34) & "><p style=" & chr(34) & "position:absolute;top:" & cStr(Round(41 + CSng(snpData("location_top")))) & "px;left:" & cStr(Round(CSng(snpData("location_left")))) & "px" & chr(34) & ">" & snpData("resource_id") & "</p></font></b>" & vbcrlf
            End If
            snpData.movenext
        Loop
    End if

    snpData.Close
    dbDatabase.Close

    Set snpData = Nothing
    Set dbDatabase = Nothing
    %>

    <script language="JavaScript1.2">
    function NextInterval(){
      reportform.submit();
    }

      setTimeout("NextInterval()",30000)
    </script>
</body>

</html>

GraphicalWorkCenterUtilization.asp (save as GraphicalWorkCenterUtilization.asp on a web server that supports classic ASP pages)

Below are samples of the output as the display as of time advanced – the zoom percent was set to 50.  Notice that the work centers go online and offline as the time progresses (click a picture to see a larger version of that picture):

The final example demonstrates how the display changed when the zoom percent was changed from 50% to 130%:

-

As the mouse pointer is moved over the boxes representing the work centers, a pop-up tooltip appears that describes the work center, as well as employee ID, clock in time, and clock out time for the first labor ticket at that work center in the time period.

——-

Hopefully, you have found this example to be interesting.  Your assignment is to now connect proximity switches to the devices in your office and surrounding areas, recording their location in the RESOURCE_LOCATION_DEMO table.  Then log the proximity switch status to the LABOR_TICKET_DEMO table so that you are able to determine when the water cooler, coffee machine, chairs, keyboards, and computer mice are in use.  Use the data to determine which employees are the hardest working, and which employees have determined how to think smarter rather than working harder.  :-)





Create an Auto-Scaling HTML Chart using Only SQL

8 07 2010

July 8, 2010 (Modified July 9, 2010)

I thought that I would try something a little different today – build an auto-scaling HTML bar chart using nothing more than a SQL statement.  I mentioned in this book review that I was impressed with the HTML chart that was included in the book, but I felt that it might be more interesting if the example used absolute positioning, rather than an HTML table.  So, I built an example using dynamic positioning that is not based on what appears in that book.

We will use the sample table from this blog article (side note: this is an interesting article that shows how a VBS script can generate a 10046 trace file, and then transform that trace file back into a VBS script), just with the table renamed to T1.

CREATE TABLE T1 AS
SELECT
  DECODE(TRUNC(DBMS_RANDOM.VALUE(0,5)),
          0,'MIKE',
          1,'ROB',
          2,'SAM',
          3,'JOE',
          4,'ERIC') EMPLOYEE_ID,
  TRUNC(SYSDATE)-ROUND(DBMS_RANDOM.VALUE(0,1000)) SHIFT_DATE,
  DECODE(TRUNC(DBMS_RANDOM.VALUE(0,10)),
          0,'VAC',
          1,'HOL',
          2,'BEREAVE',
          3,'JURY',
          4,'ABS',
          5,'EXCUSE',
          6,'MIL',
          'OTHER') INDIRECT_ID
FROM
  DUAL
CONNECT BY
  LEVEL<=1000;

COMMIT;

Now that we have 1000 rows in the sample table, let’s see how many entries fall into each week in the table (the week starts on a Monday) for those indirect entries that are either VAC, ABS, or EXCUSE:

SELECT
  NEXT_DAY(SHIFT_DATE,'MONDAY')-7 WEEK_OF,
  COUNT(*) IND
FROM
  T1
WHERE
  INDIRECT_ID IN ('VAC','ABS','EXCUSE')
GROUP BY
  NEXT_DAY(SHIFT_DATE,'MONDAY')-7
ORDER BY
  1;

Your results of course will be different from what follows due to the randomization of the data, but this is what was returned from my database:

WEEK_OF   IND
--------- ---
08-OCT-07   1
15-OCT-07   2
22-OCT-07   4
29-OCT-07   3
05-NOV-07   3
03-DEC-07   2
10-DEC-07   3
24-DEC-07   2
...
05-JAN-09   1
12-JAN-09   3
19-JAN-09   7
02-FEB-09   1
...
21-JUN-10   3
28-JUN-10   2
05-JUL-10   2

The above SQL statement should work for the base query, now we need to start manipulating the data so that we are able to calculate the size and location of the bars in the chart.  We will slide the above SQL statement into an inline view:

SELECT
  WEEK_OF,
  IND,
  MAX(IND) OVER () MAX_IND,
  COUNT(WEEK_OF) OVER () COUNT_WEEK_OF,
  ROWNUM RN
FROM
  (SELECT
    NEXT_DAY(SHIFT_DATE,'MONDAY')-7 WEEK_OF,
    COUNT(*) IND
  FROM
    T1
  WHERE
    INDIRECT_ID IN ('VAC','ABS','EXCUSE')
  GROUP BY
    NEXT_DAY(SHIFT_DATE,'MONDAY')-7
  ORDER BY
    1);

In addition to returning the original data from the SQL statement, we are now also returning the maximum data value, the total number of weeks with at least one entry, and a row counter:

WEEK_OF   IND MAX_IND COUNT_WEEK_OF    RN
--------- --- ------- ------------- -----
08-OCT-07   1       7           126     1
15-OCT-07   2       7           126     2
22-OCT-07   4       7           126     3
29-OCT-07   3       7           126     4
05-NOV-07   3       7           126     5
03-DEC-07   2       7           126     6
10-DEC-07   3       7           126     7
...
05-JAN-09   1       7           126    57
12-JAN-09   3       7           126    58
19-JAN-09   7       7           126    59
02-FEB-09   1       7           126    60
...
14-JUN-10   2       7           126   123
21-JUN-10   3       7           126   124
28-JUN-10   2       7           126   125
05-JUL-10   2       7           126   126

Next, we need to calculate the position and size of each of the bars in the chart, so we will again slide the above into an inline view:

SELECT
  WEEK_OF,
  IND,
  MAX_IND,
  COUNT_WEEK_OF,
  RN,
  TRUNC(300 * IND/MAX_IND) BAR_WIDTH,
  TRUNC(800 * 1/COUNT_WEEK_OF) BAR_HEIGHT,
  TRUNC(800 * 1/COUNT_WEEK_OF * (RN-1)) BAR_TOP,
  100 BAR_LEFT
FROM
  (SELECT
    WEEK_OF,
    IND,
    MAX(IND) OVER () MAX_IND,
    COUNT(WEEK_OF) OVER () COUNT_WEEK_OF,
    ROWNUM RN
  FROM
    (SELECT
      NEXT_DAY(SHIFT_DATE,'MONDAY')-7 WEEK_OF,
      COUNT(*) IND
    FROM
      T1
    WHERE
      INDIRECT_ID IN ('VAC','ABS','EXCUSE')
    GROUP BY
      NEXT_DAY(SHIFT_DATE,'MONDAY')-7
    ORDER BY
      1));

You might notice in the above that I specified that the maximum width of the chart will be 300 (pixels) and the maximum height will be 800 (pixels).  Here is the output:

WEEK_OF   IND MAX_IND COUNT_WEEK_OF    RN  BAR_WIDTH BAR_HEIGHT BAR_TOP BAR_LEFT
--------- --- ------- ------------- ----- ---------- ---------- ------- --------
08-OCT-07   1       7           126     1         42          6       0      100
15-OCT-07   2       7           126     2         85          6       6      100
22-OCT-07   4       7           126     3        171          6      12      100
29-OCT-07   3       7           126     4        128          6      19      100
05-NOV-07   3       7           126     5        128          6      25      100
03-DEC-07   2       7           126     6         85          6      31      100
10-DEC-07   3       7           126     7        128          6      38      100
24-DEC-07   2       7           126     8         85          6      44      100
...
05-JAN-09   1       7           126    57         42          6     355      100
12-JAN-09   3       7           126    58        128          6     361      100
19-JAN-09   7       7           126    59        300          6     368      100
02-FEB-09   1       7           126    60         42          6     374      100
...
14-JUN-10   2       7           126   123         85          6     774      100
21-JUN-10   3       7           126   124        128          6     780      100
28-JUN-10   2       7           126   125         85          6     787      100
05-JUL-10   2       7           126   126         85          6     793      100

Now what?  We need to convert the above into HTML using DIV tags to position the bars as calculated.  Prior to the first row we need to write a couple of HTML tags to set the page title, and after the last row we need to write a couple more HTML tags to close the BODY and HTML section of the document.  The transformed SQL statement looks like this:

SET TRIMSPOOL ON
SET LINESIZE 400
SET ECHO OFF
SET FEEDBACK OFF
SET VERIFY OFF
SET PAGESIZE 0
SET SQLPROMPT ''

SPOOL C:\CUSTOM_CHART.HTM

SELECT
  DECODE(RN,1,'<html><head><title>Custom Chart</title></head><body>' || CHR(13) || CHR(10),' ') ||
  '<div style="position:absolute;' ||
    'top:' || TO_CHAR(TRUNC(800 * 1/COUNT_WEEK_OF * (RN-1))) || 'px;' ||
    'left:' || TO_CHAR(5) || 'px;' ||
    'width:' || TO_CHAR(100) || 'px;' ||
    'height:' || TO_CHAR(TRUNC(800 * 1/COUNT_WEEK_OF)) || 'px;' ||
    '"><font size="1px" color="#0000FF">' || TO_CHAR(WEEK_OF,'MM/DD/YY') ||
      REPLACE('     ',' ',CHR(38) || 'nbsp;') || TO_CHAR(IND) || '</font></div>' ||
  '<div style="background:#444466;position:absolute;' ||
    'top:' || TO_CHAR(TRUNC(800 * 1/COUNT_WEEK_OF * (RN-1))) || 'px;' ||
    'left:' || TO_CHAR(100) || 'px;' ||
    'width:' || TO_CHAR(TRUNC(300 * IND/MAX_IND)) || 'px;' ||
    'height:' || TO_CHAR(TRUNC(800 * 1/COUNT_WEEK_OF)) || 'px;' ||
    '"><font size="1px" color="#FFFFFF"></font></div>' ||
  DECODE(RN,COUNT_WEEK_OF, CHR(13) || CHR(10) || '</body></html>',' ') HTML_LINE
FROM
  (SELECT
    WEEK_OF,
    IND,
    MAX(IND) OVER () MAX_IND,
    COUNT(WEEK_OF) OVER () COUNT_WEEK_OF,
    ROWNUM RN
  FROM
    (SELECT
      NEXT_DAY(SHIFT_DATE,'MONDAY')-7 WEEK_OF,
      COUNT(*) IND
    FROM
      T1
    WHERE
      INDIRECT_ID IN ('VAC','ABS','EXCUSE')
    GROUP BY
      NEXT_DAY(SHIFT_DATE,'MONDAY')-7
    ORDER BY
      1));

SPOOL OFF

There is a slight problem with the above, the SQL statement and SPOOL OFF are printed in the resulting HTML file – if someone knows how to avoid that behavior (without placing the above into another script file), I would like to see how it is done (Oracle’s documentation did not help).

This is what the resulting HTML file looks like:

The number of result rows from the query was a bit high (126) so the bars are significantly compressed in height.  Just to see what happens, let’s add the following to the WHERE clause in the inner-most inline view:

AND SHIFT_DATE >= TO_DATE('01-JAN-2010','DD-MON-YYYY')

The resulting chart now looks like this:

Of course it is possible to adjust the colors of the font (#0000FF) and the bars (#444466), which are specified in hex in the format of RRGGBB (red green blue).  It is also possible to adjust the color of the bars to reflect the value represented by the bar, but that is an exercise for the reader.  For those who need to feel creative, it is also possible to display pictures in the bars, but that is also an exercise left for the reader.

—-

Edit: The sample output from the SQL statement displays correctly on Red Hat Enterprise Linux 3 using Firefox 0.8:





CPU Wait? LAG to the Rescue

26 02 2010

February 26, 2010

A question in a recent OTN thread appeared as follows:

I’m in RAC database (10gR2/Redhat4). I need to store the real CPU wait every 1 minute in a table for a month. For that I thing to query the GV$SYS_TIME_MODEL [for the "DB CPU" statistic].

This is a very difficult question to answer.  Why (additional reference)?  Technically, in Oracle when a session is “on the CPU”, that session is not in a wait event, and therefore not officially waiting for anything. The “DB CPU” statistic captures the accumulated time spent on the CPU by foreground processes – the user sessions. This CPU time does not include the time that the background processes (DBWR, LGWR, PMON, etc.) spend consuming CPU time. Additionally, the “DB CPU” statistic does not consider/accumulate CPU time consumed by processes that are not related to the database instance.  It could also be said that the “DB CPU” time does not account for time that the session spends waiting for its turn to execute on the CPUs.

With the above in mind, let’s see if we are able to calculate the amount of CPU time consumed by the sessions and the background processes in one minute intervals.  First, we need a logging table.  The following SQL statement builds the SYS_TIME_MODEL_CPU table using a couple of the column definitions from the GV$SYS_TIME_MODEL view so that I do not need to explicitly state the column data types (notice that the SQL statement is collapsing data from two source rows into a single row):

CREATE TABLE SYS_TIME_MODEL_CPU AS
SELECT
  SYSDATE CHK_ID,
  INST_ID,
  SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
  SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
FROM
  GV$SYS_TIME_MODEL
WHERE
  0=1
GROUP BY
  INST_ID;

If we are able to find a way to schedule the following SQL statement to execute once a minute, we will be able to store the current values of the “DB CPU” and “background cpu time” statistics with the following SQL statement (note that executing this SQL statement will also consume CPU time, the very thing we are trying to measure):

INSERT INTO SYS_TIME_MODEL_CPU
SELECT
  SYSDATE CHK_ID,
  INST_ID,
  SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
  SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
FROM
  GV$SYS_TIME_MODEL
WHERE
  STAT_NAME IN ('DB CPU','background cpu time')
GROUP BY
  INST_ID;

One way to schedule the SQL statement to execute once a minute is to use the DBMS_LOCK.SLEEP function in a loop.  Unfortunely, on some platforms the function may not wait exactly the specified number of seconds (it may wait slightly longer), and may cause the “PL/SQL lock timer” wait event to steal a position in the top 5 wait events list in a Statspack or AWR report.  For testing purposes, the following anonymous PL/SQL script might be used:

DECLARE
  STime DATE := SYSDATE;
BEGIN
  WHILE (SYSDATE - STime) < 32 LOOP
    INSERT INTO SYS_TIME_MODEL_CPU
      SELECT
        SYSDATE CHK_ID,
        INST_ID,
        SUM(DECODE(STAT_NAME,'DB CPU',VALUE,NULL)) DB_CPU,
        SUM(DECODE(STAT_NAME,'background cpu time',VALUE,NULL)) BACKGROUND_CPU
      FROM
        GV$SYS_TIME_MODEL
      WHERE
        STAT_NAME IN ('DB CPU','background cpu time')
      GROUP BY
        INST_ID;

      COMMIT;
      DBMS_LOCK.SLEEP(60);
  End Loop;
End;
/

If we allow the script to run for a couple of minutes (rather than 31 days), we are able to determine how much CPU time was consumed every minute by using the LAG analytic function, as shown below:

SELECT
  TO_CHAR(CHK_ID,'YYYY-MM-DD HH24:MI') CHK_ID,
  INST_ID,
  DB_CPU-LAG(DB_CPU,1) OVER (PARTITION BY INST_ID ORDER BY CHK_ID) DB_CPU,
  BACKGROUND_CPU-LAG(BACKGROUND_CPU,1) OVER (PARTITION BY INST_ID ORDER BY CHK_ID) BACKGROUND_CPU
FROM
  SYS_TIME_MODEL_CPU
ORDER BY
  CHK_ID;

CHK_ID              INST_ID     DB_CPU BACKGROUND_CPU
---------------- ---------- ---------- --------------
2010-02-24 07:18          1
2010-02-24 07:19          1   59990544          66070
2010-02-24 07:20          1   59951475          66724
2010-02-24 07:21          1   59985268          71768
2010-02-24 07:22          1   60000569          63694
2010-02-24 07:23          1   60002938          71639
2010-02-24 07:24          1   59978651          63770
2010-02-24 07:25          1   61487141          62785
2010-02-24 07:26          1      24194          76990

To determine the number of seconds of CPU time consumed, the values shown in the DB_CPU and BACKGROUND_CPU columns should be divided by 1,000,000.

Why not just use AWR data to obtain this information?  Just because AWR is built-in does not mean that it is free to use the features of AWR (as I attempted to argue in this OTN thread) – this is a fact that is often glossed over by various books, blog articles, “news” articles, and even the Oracle Database documentation when it states that AWR reports are the replacement for Statspack reports.





Excel – Charting the Results of Oracle Analytic Functions

6 02 2010

February 6, 2010

This is a somewhat complicated example that builds a couple of sample tables, uses a SQL statement with the Oracle analytic function LEAD submitted through ADO in an Excel macro, and then presents the information on an Excel worksheet.  When the user clicks one of three buttons on the Excel worksheet, an Excel macro executes that then build charts using disconnected row sources – a disconnected ADO recordset is used to sort the data categories before pushing that data into the charts that are built on the fly.

To start, we need to build the sample tables.  The first two tables follow, a part list table and a vendor list table with random data:

CREATE TABLE PART_LIST (
  PART_ID VARCHAR2(30),
  PRODUCT_CODE VARCHAR2(30),
  COMMODITY_CODE VARCHAR2(30),
  PURCHASED CHAR(1),
  PRIMARY KEY (PART_ID));

INSERT INTO
  PART_LIST
SELECT
  DBMS_RANDOM.STRING('Z',10),
  DBMS_RANDOM.STRING('Z',1),
  DBMS_RANDOM.STRING('Z',1),
  DECODE(ROUND(DBMS_RANDOM.VALUE(1,2)),1,'Y','N')
FROM
  DUAL
CONNECT BY
  LEVEL<=50000;

COMMIT;

CREATE TABLE VENDOR_LIST (
  VENDOR_ID VARCHAR2(30),
  PRIMARY KEY (VENDOR_ID));

INSERT INTO
  VENDOR_LIST
SELECT
  DBMS_RANDOM.STRING('Z',10)
FROM
  DUAL
CONNECT BY
  LEVEL<=100; 

COMMIT;

Next, we need to build a purchase transaction history table, allowing a single part to be purchased from 10 randomly selected vendors of the 100 vendors.  This is actually a Cartesian join, but we need to force it to handled as a nested loop join so that we will have a different set of 10 vendors for each PART_ID:

CREATE TABLE PURCHASE_HISTORY (
  TRANSACTION_ID NUMBER,
  VENDOR_ID VARCHAR2(30),
  PART_ID VARCHAR2(30),
  UNIT_PRICE NUMBER(12,2),
  PURCHASE_DATE DATE,
  PRIMARY KEY (TRANSACTION_ID));

INSERT INTO
  PURCHASE_HISTORY
SELECT /*+ ORDERED USE_NL(PL VL) */
  ROWNUM,
  VL.VENDOR_ID,
  PL.PART_ID,
  VL.UNIT_PRICE,
  VL.PURCHASE_DATE
FROM
  PART_LIST PL,
  (SELECT
     'A' MIN_PART,
     'ZZZZZZZZZZZ' MAX_PART,
     VENDOR_ID,
     UNIT_PRICE,
     PURCHASE_DATE,
     ROWNUM RN
  FROM
    (SELECT
      VENDOR_ID,
      ROUND(DBMS_RANDOM.VALUE(0,10000),2) UNIT_PRICE,
      TRUNC(SYSDATE) - ROUND(DBMS_RANDOM.VALUE(0,5000)) PURCHASE_DATE
    FROM
      VENDOR_LIST
    ORDER BY
      DBMS_RANDOM.VALUE)) VL
WHERE
  PL.PURCHASED='Y'
  AND VL.RN<=10
  AND PL.PART_ID BETWEEN VL.MIN_PART AND VL.MAX_PART;

COMMIT;

Before we start working in Excel, we need to put together a SQL statement so that we are able to determine by how much the price of a part fluctuates over time.  We will use the LEAD analytic function to allow us to compare the current row values with the next row values, and only output the row when either the VENDOR_ID changes or the UNIT_PRICE changes.  While the sample data potentially includes dates up to 5,000 days ago, we only want to consider dates up to 720 days ago for this example:

SELECT /*+ ORDERED */
  PH.PART_ID,
  PH.VENDOR_ID,
  PH.UNIT_PRICE,
  PH.LAST_VENDOR_ID,
  PH.LAST_UNIT_PRICE,
  PL.PRODUCT_CODE,
  PL.COMMODITY_CODE
FROM
  (SELECT
    PH.PART_ID,
    PH.VENDOR_ID,
    PH.UNIT_PRICE,
    PH.PURCHASE_DATE,
    LEAD(PH.VENDOR_ID,1,NULL) OVER (PARTITION BY PART_ID ORDER BY PURCHASE_DATE DESC) LAST_VENDOR_ID,
    LEAD(PH.UNIT_PRICE,1,NULL) OVER (PARTITION BY PART_ID ORDER BY PURCHASE_DATE DESC) LAST_UNIT_PRICE
  FROM
    PURCHASE_HISTORY PH
  WHERE
    PH.PURCHASE_DATE>=TRUNC(SYSDATE-720)) PH,
  PART_LIST PL
WHERE
  PH.PART_ID=PL.PART_ID
  AND (PH.VENDOR_ID<>NVL(PH.LAST_VENDOR_ID,'-')
    OR PH.UNIT_PRICE<>NVL(PH.LAST_UNIT_PRICE,-1))
ORDER BY
  PH.PART_ID,
  PH.PURCHASE_DATE DESC;

The output of the above SQL statement might look something like this:

PART_ID    VENDOR_ID  UNIT_PRICE LAST_VENDO LAST_UNIT_PRICE P C
---------- ---------- ---------- ---------- --------------- - -
AAAFWXDGOR HHJAWQCYIV    1773.67 RPKWXSTFDS         5841.37 I T
AAAFWXDGOR RPKWXSTFDS    5841.37                            I T
AABDVNQJBS BBOSDBKYBR    4034.07                            D J
AABNDOOTTV HQBZXICKQM    2932.36                            C G
AABPRKFTLG NKYJQJXGJN     242.18 HHJAWQCYIV         1997.01 F I
AABPRKFTLG HHJAWQCYIV    1997.01                            F I
AACHFXHCDC SZWNZCRUWZ    3562.43                            P G
AACNAAOZWE JEYKZFIKJU    4290.12                            L N
AAEAYOLWMN DNDYVXUZVZ    4431.63                            K T
AAFLKRJTCO QPXIDOEDTI    8613.52                            Q G
AAGDNYXQGW BZFMNYJVBP     911.06 RPKWXSTFDS         2813.39 B L
AAGDNYXQGW RPKWXSTFDS    2813.39                            B L
AAGMKTQITK RAGVQSBHKW    9221.90 BCIRRDLHAN         8541.34 S W
AAGMKTQITK BCIRRDLHAN    8541.34 CWQNPITMBE         5611.73 S W
AAGMKTQITK CWQNPITMBE    5611.73                            S W
AAINVDSSWC CQXRSIWOIL    2690.31 BBOSDBKYBR         1707.15 K R
AAINVDSSWC BBOSDBKYBR    1707.15 QFPGRYTYUM         9158.98 K R
AAINVDSSWC QFPGRYTYUM    9158.98                            K R
AALCTODILL NKYJQJXGJN    2116.94                            K M
AAMAUJIWLF LPMSAUJGHR    6294.19 CNHZFDEWIH         4666.58 L P
AAMAUJIWLF CNHZFDEWIH    4666.58 SZWNZCRUWZ         2096.59 L P
AAMAUJIWLF SZWNZCRUWZ    2096.59                            L P
AAMYBVKFQC GLVKOCSHSF     265.63 PNGVEEYGKA         5869.67 X Z
AAMYBVKFQC PNGVEEYGKA    5869.67                            X Z
AANVGRNFEX NFHOKCKLDN    3961.42                            Q O
...

Now we need to switch over to Excel.  Create four ActiveX command buttons named cmdInitialize, cmdComparePC, cmdCompareCC, cmdCompareVendorID.  Name the worksheet OracleAnalyticTest, as shown below:

Right-click the OracleAnalyticTest worksheet and select View Code.  See this blog article to determine how to enable macros in Excel 2007 (if you have not already turned on this feature) and add a reference to the Microsoft ActiveX Data Objects 2.8 (or 6.0) Library.  We will also need to add a reference to the Microsoft ActiveX Data Objects Recordset 2.8 (or 6.0) Library.  Next, we add the code to the cmdInitialize button:

Option Explicit 'Forces all variables to be declared

Dim dbDatabase As New ADODB.Connection
Dim strDatabase As String
Dim strUserName As String
Dim strPassword As String

Private Function ConnectDatabase() As Integer
    Dim intResult As Integer

    On Error Resume Next

    If dbDatabase.State <> 1 Then
        'Connection to the database if closed, specify the database name, a username, and password
        strDatabase = "MyDB"
        strUserName = "MyUser"
        strPassword = "MyPassword"

        'Connect to the database
        'Oracle connection string
        dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUserName & ";Password=" & strPassword & ";ChunkSize=1000;FetchSize=100;"

        dbDatabase.ConnectionTimeout = 40
        dbDatabase.CursorLocation = adUseClient
        dbDatabase.Open

        If (dbDatabase.State <> 1) Or (Err <> 0) Then
            intResult = MsgBox("Could not connect to the database.  Check your user name and password." & vbCrLf & Error(Err), 16, "Excel Demo")

            ConnectDatabase = False
        Else
            ConnectDatabase = True
        End If
    Else
        ConnectDatabase = True
    End If
End Function

Private Sub cmdInitialize_Click()
    Dim i As Integer
    Dim intResult As Integer
    Dim lngRow As Long
    Dim strSQL As String
    Dim snpData As ADODB.Recordset

    On Error Resume Next

    Sheets("OracleAnalyticTest").ChartObjects.Delete
    Sheets("OracleAnalyticTest").Rows("4:10000").Delete Shift:=xlUp

    intResult = ConnectDatabase

    If intResult = True Then
        Set snpData = New ADODB.Recordset

        strSQL = "SELECT /*+ ORDERED */" & vbCrLf
        strSQL = strSQL & "  PH.PART_ID," & vbCrLf
        strSQL = strSQL & "  PH.VENDOR_ID," & vbCrLf
        strSQL = strSQL & "  PH.UNIT_PRICE," & vbCrLf
        strSQL = strSQL & "  PH.LAST_VENDOR_ID," & vbCrLf
        strSQL = strSQL & "  PH.LAST_UNIT_PRICE," & vbCrLf
        strSQL = strSQL & "  PL.PRODUCT_CODE," & vbCrLf
        strSQL = strSQL & "  PL.COMMODITY_CODE" & vbCrLf
        strSQL = strSQL & "FROM" & vbCrLf
        strSQL = strSQL & "  (SELECT" & vbCrLf
        strSQL = strSQL & "    PH.PART_ID," & vbCrLf
        strSQL = strSQL & "    PH.VENDOR_ID," & vbCrLf
        strSQL = strSQL & "    PH.UNIT_PRICE," & vbCrLf
        strSQL = strSQL & "    PH.PURCHASE_DATE," & vbCrLf
        strSQL = strSQL & "    LEAD(PH.VENDOR_ID,1,NULL) OVER (PARTITION BY PART_ID ORDER BY PURCHASE_DATE DESC) LAST_VENDOR_ID," & vbCrLf
        strSQL = strSQL & "    LEAD(PH.UNIT_PRICE,1,NULL) OVER (PARTITION BY PART_ID ORDER BY PURCHASE_DATE DESC) LAST_UNIT_PRICE" & vbCrLf
        strSQL = strSQL & "  FROM" & vbCrLf
        strSQL = strSQL & "    PURCHASE_HISTORY PH" & vbCrLf
        strSQL = strSQL & "  WHERE" & vbCrLf
        strSQL = strSQL & "    PH.PURCHASE_DATE>=TRUNC(SYSDATE-270)) PH," & vbCrLf
        strSQL = strSQL & "  PART_LIST PL" & vbCrLf
        strSQL = strSQL & "WHERE" & vbCrLf
        strSQL = strSQL & "  PH.PART_ID=PL.PART_ID" & vbCrLf
        strSQL = strSQL & "  AND (PH.VENDOR_ID<>NVL(PH.LAST_VENDOR_ID,'-')" & vbCrLf
        strSQL = strSQL & "    OR PH.UNIT_PRICE<>NVL(PH.LAST_UNIT_PRICE,-1))" & vbCrLf
        strSQL = strSQL & "ORDER BY" & vbCrLf
        strSQL = strSQL & "  PH.PART_ID," & vbCrLf
        strSQL = strSQL & "  PH.PURCHASE_DATE DESC"
        snpData.Open strSQL, dbDatabase

        If snpData.State = 1 Then
            Application.ScreenUpdating = False

            For i = 0 To snpData.Fields.Count - 1
                ActiveSheet.Cells(3, i + 1).Value = snpData.Fields(i).Name
            Next i
            ActiveSheet.Range(ActiveSheet.Cells(3, 1), ActiveSheet.Cells(3, snpData.Fields.Count)).Font.Bold = True

            ActiveSheet.Range("A4").CopyFromRecordset snpData

            'Auto-fit up to 26 columns
            ActiveSheet.Columns("A:" & Chr(64 + snpData.Fields.Count)).AutoFit
            ActiveSheet.Range("A4").Select
            ActiveWindow.FreezePanes = True

            'Remove duplicate rows with the same PART ID
            lngRow = 4
            Do While lngRow < Sheets("OracleAnalyticTest").UsedRange.Rows.Count + 2
                If Sheets("OracleAnalyticTest").Cells(lngRow, 1).FormulaR1C1 = "" Then
                    'Past the end of the rows
                    Exit Do
                End If
                If Sheets("OracleAnalyticTest").Cells(lngRow - 1, 1).FormulaR1C1 = Sheets("OracleAnalyticTest").Cells(lngRow, 1).FormulaR1C1 Then
                    'Found a duplicate row, delete it
                    Sheets("OracleAnalyticTest").Rows(lngRow).Delete Shift:=xlUp
                Else
                    lngRow = lngRow + 1
                End If
            Loop
            snpData.Close

            Application.ScreenUpdating = True
        End If
    End If

    Set snpData = Nothing
End Sub

The cmdInitialize_Click subroutine retrieves the data from the database using the supplied SQL statement and writes that information to the worksheet.  The macro then eliminates subsequent rows if the part ID is identical to the previous part ID (this step would not have been required if we modified the SQL statement to use the ROW_NUMBER analytic function, and eliminate all rows where the ROW_NUMBER value is not 1).  Once you add the above code, you should be able to switch back to the Excel worksheet, turn off Design Mode, and click the Initialize button.

Unfortunately, this example will retrieve too many rows with too little variation in the PRODUCT_CODE and COMMODITY_CODE columns (just 26 distinct values), so it might be a good idea to delete all rows below row 1004.  Now we need to switch back to the Microsoft Visual Basic editor and add the code for the other three buttons.  Note that this code takes advantage of gradient shading in Excel 2007 charts, so some modification might be necessary on Excel 2003 and earlier.

Private Sub cmdCompareCC_Click()
    Dim i As Long
    Dim intCount As Integer
    Dim intChartNumber As Integer
    Dim lngRows As Long
    Dim dblValues() As Double
    Dim strValueNames() As String
    Dim snpDataList As ADOR.Recordset

    On Error Resume Next

    Sheets("OracleAnalyticTest").ChartObjects.Delete
    Sheets("OracleAnalyticTest").Cells(4, 1).Select
    lngRows = Sheets("OracleAnalyticTest").UsedRange.Rows.Count + 2

    'Set up to use ADOR to automatically sort the product codes
    Set snpDataList = New ADOR.Recordset
    snpDataList.Fields.Append "commodity_code", adVarChar, 30
    snpDataList.Open

    'Pick up a distinct list of commodity codes
    For i = 4 To lngRows
        'Only include those commodity codes with price changes
        If (Sheets("OracleAnalyticTest").Cells(i, 5).Value <> 0) And (Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2) <> 0) Then
            If snpDataList.RecordCount > 0 Then
                snpDataList.MoveFirst
            End If
            snpDataList.Find ("commodity_code = '" & Sheets("OracleAnalyticTest").Cells(i, 7) & "'")
            If snpDataList.EOF Then
                'Did not find a matching record
                snpDataList.AddNew
                snpDataList("commodity_code") = Sheets("OracleAnalyticTest").Cells(i, 7).Value
                snpDataList.Update
            End If
        End If
    Next i
    snpDataList.Sort = "commodity_code"

    'Find the matching rows for each product code
    snpDataList.MoveFirst
    Do While Not snpDataList.EOF
        intCount = 0
        ReDim dblValues(250)
        ReDim strValueNames(250)
        For i = 4 To lngRows
            If intCount >= 250 Then
                'Excel charts only permit about 250 data points when created with this method
                Exit For
            End If
            If Sheets("OracleAnalyticTest").Cells(i, 7).Value = snpDataList("commodity_code") Then
                'Found a row with this product code
                If (Sheets("OracleAnalyticTest").Cells(i, 5).Value <> 0) And (Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2) <> 0) Then
                    'Price change was found
                    dblValues(intCount) = Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2)
                    strValueNames(intCount) = Sheets("OracleAnalyticTest").Cells(i, 1).FormulaR1C1
                    intCount = intCount + 1
                End If
            End If
        Next i

        'Set the arrays to the exact number of elements, first element at position 0
        ReDim Preserve dblValues(intCount - 1)
        ReDim Preserve strValueNames(intCount - 1)

        intChartNumber = intChartNumber + 1
        With Sheets("OracleAnalyticTest").ChartObjects.Add(10 * intChartNumber, 60 + 10 * intChartNumber, 400, 300)
            .Chart.SeriesCollection.NewSeries
            .Chart.SeriesCollection(1).Values = dblValues
            .Chart.SeriesCollection(1).XValues = strValueNames
            .Chart.Axes(1).CategoryType = 2
            .Chart.HasLegend = False

            .Chart.HasTitle = True
            .Chart.ChartTitle.Text = "Price Changes by Commodity Code: " & snpDataList("commodity_code")

            .Chart.Axes(xlCategory, xlPrimary).HasTitle = True
            .Chart.Axes(xlCategory, xlPrimary).AxisTitle.Characters.Text = "Part ID"
            .Chart.Axes(xlValue, xlPrimary).HasTitle = True
            .Chart.Axes(xlValue, xlPrimary).AxisTitle.Characters.Text = "Unit Cost Change"

            .Chart.SeriesCollection(1).HasDataLabels = True
            .Chart.SeriesCollection(1).HasLeaderLines = True

            With .Chart.PlotArea.Border
                .ColorIndex = 16
                .Weight = xlThin
                .LineStyle = xlContinuous
            End With

            .Chart.PlotArea.Fill.OneColorGradient Style:=msoGradientHorizontal, Variant:=2, Degree:=0.756847486076142
            .Chart.PlotArea.Fill.ForeColor.SchemeColor = 23
            .Chart.PlotArea.Fill.Visible = True
            With .Chart.PlotArea.Border
                .ColorIndex = 57
                .Weight = xlThin
                .LineStyle = xlContinuous
            End With

            .Chart.SeriesCollection(1).Fill.OneColorGradient Style:=msoGradientVertical, Variant:=4, Degree:=0.2
            .Chart.SeriesCollection(1).Fill.Visible = True
            .Chart.SeriesCollection(1).Fill.ForeColor.SchemeColor = 4

            .Chart.Axes(xlValue).MajorGridlines.Border.ColorIndex = 2
            With .Chart.SeriesCollection(1).DataLabels.Font
                .Name = "Arial"
                .FontStyle = "Regular"
                .Size = 8
                .Color = RGB(255, 255, 255)
            End With
            With .Chart.Axes(xlCategory).TickLabels.Font
                .Name = "Arial"
                .FontStyle = "Regular"
                .Size = 8
                .Color = RGB(255, 255, 255)
            End With
            With .Chart.ChartTitle.Font
                .Name = "Arial"
                .FontStyle = "Bold"
                .Size = 16
                .Color = RGB(0, 0, 255)
            End With
        End With
        snpDataList.MoveNext
    Loop

    Set snpDataList = Nothing
End Sub

Private Sub cmdComparePC_Click()
    Dim i As Long
    Dim intCount As Integer
    Dim intChartNumber As Integer
    Dim lngRows As Long
    Dim dblValues() As Double
    Dim strValueNames() As String
    Dim snpDataList As ADOR.Recordset

    On Error Resume Next

    Sheets("OracleAnalyticTest").ChartObjects.Delete
    Sheets("OracleAnalyticTest").Cells(4, 1).Select
    lngRows = Sheets("OracleAnalyticTest").UsedRange.Rows.Count + 2

    'Set up to use ADOR to automatically sort the product codes
    Set snpDataList = New ADOR.Recordset
    snpDataList.Fields.Append "product_code", adVarChar, 30
    snpDataList.Open

    'Pick up a distinct list of product codes
    For i = 4 To lngRows
        'Only include those product codes with price changes
        If (Sheets("OracleAnalyticTest").Cells(i, 5).Value <> 0) And (Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2) <> 0) Then
            If snpDataList.RecordCount > 0 Then
                snpDataList.MoveFirst
            End If
            snpDataList.Find ("product_code = '" & Sheets("OracleAnalyticTest").Cells(i, 6) & "'")
            If snpDataList.EOF Then
                'Did not find a matching record
                snpDataList.AddNew
                snpDataList("product_code") = Sheets("OracleAnalyticTest").Cells(i, 6).Value
                snpDataList.Update
            End If
        End If
    Next i
    snpDataList.Sort = "product_code"

    'Find the matching rows for each product code
    snpDataList.MoveFirst
    Do While Not snpDataList.EOF
        intCount = 0
        ReDim dblValues(250)
        ReDim strValueNames(250)
        For i = 4 To lngRows
            If intCount >= 250 Then
                'Excel charts only permit about 250 data points when created with this method
                Exit For
            End If
            If Sheets("OracleAnalyticTest").Cells(i, 6).Value = snpDataList("product_code") Then
                'Found a row with this product code
                If (Sheets("OracleAnalyticTest").Cells(i, 5).Value <> 0) And (Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2) <> 0) Then
                    'Price change was found
                    dblValues(intCount) = Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2)
                    strValueNames(intCount) = Sheets("OracleAnalyticTest").Cells(i, 1).FormulaR1C1
                    intCount = intCount + 1
                End If
            End If
        Next i

        'Set the arrays to the exact number of elements, first element at position 0
        ReDim Preserve dblValues(intCount - 1)
        ReDim Preserve strValueNames(intCount - 1)

        intChartNumber = intChartNumber + 1

        With Sheets("OracleAnalyticTest").ChartObjects.Add(10 * intChartNumber, 60 + 10 * intChartNumber, 400, 300)
            .Chart.SeriesCollection.NewSeries
            .Chart.SeriesCollection(1).Values = dblValues
            .Chart.SeriesCollection(1).XValues = strValueNames
            .Chart.Axes(1).CategoryType = 2
            .Chart.HasLegend = False

            .Chart.HasTitle = True
            .Chart.ChartTitle.Text = "Price Changes by Product Code: " & snpDataList("product_code")

            .Chart.Axes(xlCategory, xlPrimary).HasTitle = True
            .Chart.Axes(xlCategory, xlPrimary).AxisTitle.Characters.Text = "Part ID"
            .Chart.Axes(xlValue, xlPrimary).HasTitle = True
            .Chart.Axes(xlValue, xlPrimary).AxisTitle.Characters.Text = "Unit Cost Change"

            .Chart.SeriesCollection(1).HasDataLabels = True
            .Chart.SeriesCollection(1).HasLeaderLines = True

            With .Chart.PlotArea.Border
                .ColorIndex = 16
                .Weight = xlThin
                .LineStyle = xlContinuous
            End With

            .Chart.PlotArea.Fill.OneColorGradient Style:=msoGradientHorizontal, Variant:=2, Degree:=0.756847486076142
            .Chart.PlotArea.Fill.ForeColor.SchemeColor = 23
            .Chart.PlotArea.Fill.Visible = True
            With .Chart.PlotArea.Border
                .ColorIndex = 57
                .Weight = xlThin
                .LineStyle = xlContinuous
            End With

            .Chart.SeriesCollection(1).Fill.OneColorGradient Style:=msoGradientVertical, Variant:=4, Degree:=0.2
            .Chart.SeriesCollection(1).Fill.Visible = True
            .Chart.SeriesCollection(1).Fill.ForeColor.SchemeColor = 5

            .Chart.Axes(xlValue).MajorGridlines.Border.ColorIndex = 2
            With .Chart.SeriesCollection(1).DataLabels.Font
                .Name = "Arial"
                .FontStyle = "Regular"
                .Size = 8
                .Color = RGB(255, 255, 255)
            End With
            With .Chart.Axes(xlCategory).TickLabels.Font
                .Name = "Arial"
                .FontStyle = "Regular"
                .Size = 8
                .Color = RGB(255, 255, 255)
            End With
            With .Chart.ChartTitle.Font
                .Name = "Arial"
                .FontStyle = "Bold"
                .Size = 16
                .Color = RGB(0, 0, 255)
            End With
        End With

        snpDataList.MoveNext
    Loop

    Set snpDataList = Nothing
End Sub

Private Sub cmdCompareVendorID_Click()
    Dim i As Long
    Dim intCount As Integer
    Dim intChartNumber As Integer
    Dim lngRows As Long
    Dim dblValues() As Double
    Dim strValueNames() As String
    Dim snpDataList As ADOR.Recordset

    On Error Resume Next

    Sheets("OracleAnalyticTest").ChartObjects.Delete
    Sheets("OracleAnalyticTest").Cells(4, 1).Select
    lngRows = Sheets("OracleAnalyticTest").UsedRange.Rows.Count + 2

    'Set up to use ADOR to automatically sort the product codes
    Set snpDataList = New ADOR.Recordset
    snpDataList.Fields.Append "vendor_id", adVarChar, 30
    snpDataList.Open

    'Pick up a distinct list of vendor IDs
    For i = 4 To lngRows
        'Only include those vendor IDs with price changes
        If (Sheets("OracleAnalyticTest").Cells(i, 5).Value <> 0) And (Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2) <> 0) Then
            If snpDataList.RecordCount > 0 Then
                snpDataList.MoveFirst
            End If
            snpDataList.Find ("vendor_id = '" & Sheets("OracleAnalyticTest").Cells(i, 2) & "'")
            If snpDataList.EOF Then
                'Did not find a matching record
                snpDataList.AddNew
                snpDataList("vendor_id") = Sheets("OracleAnalyticTest").Cells(i, 2).Value
                snpDataList.Update
            End If
        End If
    Next i
    snpDataList.Sort = "vendor_id"

    'Find the matching rows for each product code
    snpDataList.MoveFirst
    Do While Not snpDataList.EOF
        intCount = 0
        ReDim dblValues(250)
        ReDim strValueNames(250)
        For i = 4 To lngRows
            If intCount >= 250 Then
                'Excel charts only permit about 250 data points when created with this method
                Exit For
            End If
            If Sheets("OracleAnalyticTest").Cells(i, 2).Value = snpDataList("vendor_id") Then
                'Found a row with this product code
                If (Sheets("OracleAnalyticTest").Cells(i, 5).Value <> 0) And (Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2) <> 0) Then
                    'Price change was found
                    dblValues(intCount) = Round(Sheets("OracleAnalyticTest").Cells(i, 3).Value - Sheets("OracleAnalyticTest").Cells(i, 5).Value, 2)
                    strValueNames(intCount) = Sheets("OracleAnalyticTest").Cells(i, 1).FormulaR1C1
                    intCount = intCount + 1
                End If
            End If
        Next i

        'Set the arrays to the exact number of elements, first element at position 0
        ReDim Preserve dblValues(intCount - 1)
        ReDim Preserve strValueNames(intCount - 1)

        intChartNumber = intChartNumber + 1

        With Sheets("OracleAnalyticTest").ChartObjects.Add(10 * intChartNumber, 60 + 10 * intChartNumber, 400, 300)
            .Chart.SeriesCollection.NewSeries
            .Chart.SeriesCollection(1).Values = dblValues
            .Chart.SeriesCollection(1).XValues = strValueNames
            .Chart.Axes(1).CategoryType = 2
            .Chart.HasLegend = False

            .Chart.HasTitle = True
            .Chart.ChartTitle.Text = "Price Changes by Vendor: " & snpDataList("vendor_id")

            .Chart.Axes(xlCategory, xlPrimary).HasTitle = True
            .Chart.Axes(xlCategory, xlPrimary).AxisTitle.Characters.Text = "Part ID"
            .Chart.Axes(xlValue, xlPrimary).HasTitle = True
            .Chart.Axes(xlValue, xlPrimary).AxisTitle.Characters.Text = "Unit Cost Change"

            .Chart.SeriesCollection(1).HasDataLabels = True
            .Chart.SeriesCollection(1).HasLeaderLines = True

            With .Chart.PlotArea.Border
                .ColorIndex = 16
                .Weight = xlThin
                .LineStyle = xlContinuous
            End With

            .Chart.PlotArea.Fill.OneColorGradient Style:=msoGradientHorizontal, Variant:=2, Degree:=0.756847486076142
            .Chart.PlotArea.Fill.ForeColor.SchemeColor = 23
            .Chart.PlotArea.Fill.Visible = True
            With .Chart.PlotArea.Border
                .ColorIndex = 57
                .Weight = xlThin
                .LineStyle = xlContinuous
            End With

            .Chart.SeriesCollection(1).Fill.OneColorGradient Style:=msoGradientVertical, Variant:=4, Degree:=0.2
            .Chart.SeriesCollection(1).Fill.Visible = True
            .Chart.SeriesCollection(1).Fill.ForeColor.SchemeColor = 45

            .Chart.Axes(xlValue).MajorGridlines.Border.ColorIndex = 2
            With .Chart.SeriesCollection(1).DataLabels.Font
                .Name = "Arial"
                .FontStyle = "Regular"
                .Size = 8
                .Color = RGB(255, 255, 255)
            End With
            With .Chart.Axes(xlCategory).TickLabels.Font
                .Name = "Arial"
                .FontStyle = "Regular"
                .Size = 8
                .Color = RGB(255, 255, 255)
            End With
            With .Chart.ChartTitle.Font
                .Name = "Arial"
                .FontStyle = "Bold"
                .Size = 16
                .Color = RGB(0, 0, 255)
            End With
        End With
        snpDataList.MoveNext
    Loop

    Set snpDataList = Nothing
End Sub

If we switch back to the Excel worksheet, the remaining three buttons should now work.  Clicking each button will cause Excel to examine the data in the worksheet to locate all of the unique values for PRODUCT_CODE, COMMODITY_CODE, or VENDOR_ID, and then sort the list in alphabetical order, and build a chart for each of the part IDs that fall into those categories.  The results for my test run of each button looks like the following three pictures.

You can, of course, adapt the code to work with other SQL statements and modify the chart generating code to alter the chart type, colors, and fonts.





SQL – RANK, MAX Analytical Functions, DECODE, SIGN

26 12 2009

December 26, 2009

A couple years ago the following question appeared on the comp.databases.oracle.misc Usenet group:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/22b36e3ca18490db

Table Structure:

Table1
symbol  orders  ttime

Requirement: Want to arrange all records, symbolwise, based on orders (asc order).
Among that, if a particular symbol have records in the range TTIME BETWEEN 9300 AND 1530, then I want to extract MIN(TTIME) within that range else MIN(TTIME) of whatever available records.

I want to achieve this using a single query.

Example:

Table1
symbol  orders  ttime
A       2       9300
A       2       9450
A       2       1030
A       2       9451
A       2       1530
A       2       1600
A       2       1700
B       3       9300
B       4       1600
C       3       1600

I want to get all records with maximum orders (in desc order) for each symbol.

Output:
Symbol  Orders  ttime
A       2       9300
A       2       9450
A       2       9451
A       2       1030
A       2       1530
A       2       1600
A       1       9300
A       2       1700
B       4       9300
B       4       1600
C       3       1600

Out of this subset I want to get all records with ttime falling between 9450 to 1530 to appear first in asc. if there is no record within this range then I want to go for normal asc order on ttime.

Ouput:
Symbol  Orders  ttime
A       2       9450
A       2       1030
A       2       1530
A       2       1600
A       2       9300
B       4       9450
B       4       1030
B       4       1600
C       3       1600

Finally I want to extract only first record

Final output:
A       2       9450
B       4       9300
C       3       1600

Are we dealing with time here?  When is 9300 less than 1530?  Why is 1 included in the above?

The set up:

CREATE TABLE T1 (
  SYMBOL CHAR(1),
  ORDERS NUMBER(10),
  TTIME NUMBER(10));

INSERT INTO T1 VALUES('A',2,9300);
INSERT INTO T1 VALUES('A',2,9450);
INSERT INTO T1 VALUES('A',2,10300);
INSERT INTO T1 VALUES('A',2,9451);
INSERT INTO T1 VALUES('A',2,15300);
INSERT INTO T1 VALUES('A',2,16000);
INSERT INTO T1 VALUES('A',2,17000);
INSERT INTO T1 VALUES('B',3,9300);
INSERT INTO T1 VALUES('B',4,16000);
INSERT INTO T1 VALUES('C',3,16000);

First, let’s find the maximum value for ORDERS for each SYMBOL:

SELECT
  SYMBOL,
  MAX(ORDERS) OVER (PARTITION BY SYMBOL) ORDERS,
  TTIME TTIME
FROM
  T1;

SYMBOL ORDERS  TTIME
A           2   9300
A           2   9450
A           2  10300
A           2   9451
A           2  15300
A           2  17000
A           2  16000
B           4  16000
B           4   9300
C           3  16000

You stated that if TTIME is between 9450 and 1530 (should it be 15300?), that it should take priority over other values of TTIME.  The DECODE below determines if TTIME is between 9450 and 15300, if so it assigns a value of 10 to EXTRA_SORT, otherwise it assigns a value of 1 to EXTRA_SORT:

SELECT
  SYMBOL,
  MAX(ORDERS) OVER (PARTITION BY SYMBOL) ORDERS,
  TTIME TTIME,
  DECODE(SIGN(TTIME-9449.999),1,DECODE(SIGN(TTIME-15300.0001),-1,10,1),1) EXTRA_SORT
FROM
  T1;

SYMBOL ORDERS  TTIME  EXTRA_SORT
A           2   9300           1
A           2   9450          10
A           2  10300          10
A           2   9451          10
A           2  15300          10
A           2  17000           1
A           2  16000           1
B           4  16000           1
B           4   9300           1
C           3  16000           1

If we slide the above into an inline view, we can then rank the rows when sorted first on EXTRA_SORT and then on TTIME:

SELECT
  SYMBOL,
  ORDERS,
  TTIME,
  RANK() OVER (PARTITION BY SYMBOL ORDER BY EXTRA_SORT DESC,TTIME) POSITION
FROM
  (SELECT
    SYMBOL,
    MAX(ORDERS) OVER (PARTITION BY SYMBOL) ORDERS,
    TTIME TTIME,
    DECODE(SIGN(TTIME-9449.999),1,DECODE(SIGN(TTIME-15300.0001),-1,10,1),1) EXTRA_SORT
  FROM
    T1);

SYMBOL ORDERS  TTIME  POSITION
A           2   9450         1
A           2   9451         2
A           2  10300         3
A           2  15300         4
A           2   9300         5
A           2  16000         6
A           2  17000         7
B           4   9300         1
B           4  16000         2
C           3  16000         1

We can again slide the above into an inline view and extract only those with a POSITION value of 1:

SELECT
  SYMBOL,
  ORDERS,
  TTIME
FROM
  (SELECT
    SYMBOL,
    ORDERS,
    TTIME,
    RANK() OVER (PARTITION BY SYMBOL ORDER BY EXTRA_SORT DESC,TTIME)
POSITION
  FROM
    (SELECT
      SYMBOL,
      MAX(ORDERS) OVER (PARTITION BY SYMBOL) ORDERS,
      TTIME TTIME,
      DECODE(SIGN(TTIME-9449.999),1,DECODE(SIGN(TTIME-15300.0001),-1,10,1),1) EXTRA_SORT
    FROM
      T1)
  )
WHERE
  POSITION=1;

SYMBOL ORDERS  TTIME
A           2   9450
B           4   9300
C           3  16000




SQL – DENSE_RANK, PERCENT_RANK, and COUNT Analytical Functions

26 12 2009

December 26, 2009

A couple years ago the following question appeared on the comp.databases.oracle.misc Usenet group:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/9af4117466316d9a

I have to following problem:

I get X rows from a statement, these are sorted by a certain column, let’s say a numerical value.
Now I want to calculate the average of this numerical value, but the 10% with the lowest and the 10% with the highest value shall not be included in this calculation. So for example, if I get 20 rows, I need the average of the value in rows 3 to 18.

Currently I solved this with a very complicated statement, but I don’t know the built-in Oracle mathematical functions so I hope that there could be a way to do this with a better performance.

Let’s set up a short experiment:

CREATE TABLE T1 (C1 NUMBER(4));

INSERT INTO T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=20;

We now have a table with 20 rows with numbers between 1 and 20.

Assuming that you are running a version of Oracle that supports analytical functions, the following returns the twenty rows with the relative ranking of each row, if the rows are sorted by C1 in descending order:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  COUNT(C1) OVER (PARTITION BY 1) R
FROM
  T1;

        C1         DR          R
---------- ---------- ----------
        20          1         20
        19          2         20
        18          3         20
        17          4         20
        16          5         20
        15          6         20
        14          7         20
        13          8         20
        12          9         20
        11         10         20
        10         11         20
         9         12         20
         8         13         20
         7         14         20
         6         15         20
         5         16         20
         4         17         20
         3         18         20
         2         19         20
         1         20         20

A slight modification of the above, dividing the value of DENSE_RANK by the value of COUNT, and also including a PERCENT_RANK for comparison:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
  PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
  T1;

        C1         DR        DRP         PR
---------- ---------- ---------- ----------
        20          1        .05          0
        19          2         .1 .052631579
        18          3        .15 .105263158
        17          4         .2 .157894737
        16          5        .25 .210526316
        15          6         .3 .263157895
        14          7        .35 .315789474
        13          8         .4 .368421053
        12          9        .45 .421052632
        11         10         .5 .473684211
        10         11        .55 .526315789
         9         12         .6 .578947368
         8         13        .65 .631578947
         7         14         .7 .684210526
         6         15        .75 .736842105
         5         16         .8 .789473684
         4         17        .85 .842105263
         3         18         .9 .894736842
         2         19        .95 .947368421
         1         20          1          1

The final cleanup is performed when the above is slid into an inline view, by using a WHERE clause:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
    PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
  FROM
    T1) T
WHERE
  T.DRP>0.1
  AND T.DRP<=0.9;

         S
———-
       168

A version that uses the PERCENT_RANK value:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
    PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
  FROM
    T1) T
WHERE
  T.PR BETWEEN 0.1 AND 0.9;

         S
———-
       168





SQL – COUNT Analytical Function, GROUP BY, HAVING

26 12 2009

December 26, 2009

A couple years ago the following question appeared on the comp.databases.oracle.misc Usenet group: http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/93bf8d1e75033d4c

I have a book table and in that table it has the book tile, publisher, and type of book it is. example mystery, scifi, etc…

I am trying to write a query that brings back a list of every pair of books that have the same publisher and same book type.  I have been able to get the following code to work:

select publisher_code, type
from book
group by publisher_code, type
having count(*) > 1;

which returns the following results:

PU TYP
-- ---
JP MYS
LB FIC
PE FIC
PL FIC
ST SFI
VB FIC

I can not figure out how to get the book title and book code for the books that this result list represents, everything I have tried throws out an error.

My initial response follows:

I see two possible methods:

  1.  Slide the SQL statement that you have written into an inline view, join the inline view to your book table, and then use the publisher_code, type columns to drive back into your book table.  The join syntax may look like one of the following: (publisher_code, type) IN (SELECT…)   or   b.publisher_code=ib.publisher_code and b.type=ib.type 
  2.  Use analytical functions (COUNT() OVER…) to determine the number of matches for the same publisher_code, type columns.  Then slide this SQL statement into an inline view to retrieve only those records with the aliased COUNT() OVER greater than 1.  This has the benefit of retrieving the matching rows in a single pass.

The original poster then attempted to create a query to meet the requirements, but the query generated an error:

SQL> select title
  2  from book
  3  where publisher_code, type in
  4  (select publisher_code, type
  5  from book
  6  group by publisher_code, type
  7  having count(*) > 1);
where publisher_code, type in
                    *
ERROR at line 3:
ORA-00920: invalid relational operator

My reponse continues:

Very close to what you need.  However, Oracle expects the column names to be wrapped in () … like this:  where (publisher_code, type) in

The above uses a subquery, which may perform slow on some Oracle releases compared to the use of an inline view.  Assume that I have a table named PART, which has columns ID, DESCRIPITION, PRODUCT_CODE, and COMMODITY_CODE, with ID as the primary key.  I want to find ID, DESCRIPTION, and COMMODITY_CODE for all parts with the same DESCRIPTION and PRODUCT_CODE, where there are at least 3 matching parts in the group:

The starting point, which looks similar to your initial query:

SELECT
  DESCRIPTION,
  PRODUCT_CODE,
  COUNT(*) NUM_MATCHES
FROM
  PART
GROUP BY
  DESCRIPTION,
  PRODUCT_CODE
HAVING
  COUNT(*)>=3;

When the original query is slid into an inline view and joined to the original table, it looks like this:

SELECT
  P.ID,
  P.DESCRIPTION,
  P.COMMODITY_CODE
FROM
  (SELECT
    DESCRIPTION,
    PRODUCT_CODE,
    COUNT(*) NUM_MATCHES
  FROM
    PART
  GROUP BY
    DESCRIPTION,
    PRODUCT_CODE
  HAVING
    COUNT(*)>=3) IP,
  PART P
WHERE
  IP.DESCRIPTION=P.DESCRIPTION
  AND IP.PRODUCT_CODE=P.PRODUCT_CODE;

Here is the DBMS_XPLAN:

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------
|*  1 |  HASH JOIN            |      |      1 |   1768 |  11525 |00:00:00.21 |    2748 |  1048K|  1048K| 1293K (0)|
|   2 |   VIEW                |      |      1 |   1768 |   1156 |00:00:00.11 |    1319 |       |       |          |
|*  3 |    FILTER             |      |      1 |        |   1156 |00:00:00.11 |    1319 |       |       |          |
|   4 |     HASH GROUP BY     |      |      1 |   1768 |  23276 |00:00:00.08 |    1319 |       |       |          |
|   5 |      TABLE ACCESS FULL| PART |      1 |  35344 |  35344 |00:00:00.04 |    1319 |       |       |          |
|   6 |   TABLE ACCESS FULL   | PART |      1 |  35344 |  35344 |00:00:00.04 |    1429 |       |       |          |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("IP"."DESCRIPTION"="P"."DESCRIPTION" AND "IP"."PRODUCT_CODE"="P"."PRODUCT_CODE")
   3 - filter(COUNT(*)>=3)

The query format using the subquery looks like this:

SELECT
  P.ID,
  P.DESCRIPTION,
  P.COMMODITY_CODE
FROM
  PART P
WHERE
  (DESCRIPTION,PRODUCT_CODE) IN
  (SELECT
    DESCRIPTION,
    PRODUCT_CODE
  FROM
    PART
  GROUP BY
    DESCRIPTION,
    PRODUCT_CODE
  HAVING
    COUNT(*)>=3);

The DBMS_XPLAN, note that Oracle 10.2.0.2 transformed the query above:

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
|*  1 |  HASH JOIN RIGHT SEMI |          |      1 |      1 |  11525 |00:00:00.21 |    2748 |  1048K|  1048K| 1214K (0)|
|   2 |   VIEW                | VW_NSO_1 |      1 |   1768 |   1156 |00:00:00.12 |    1319 |       |       |          |
|*  3 |    FILTER             |          |      1 |        |   1156 |00:00:00.12 |    1319 |       |       |          |
|   4 |     HASH GROUP BY     |          |      1 |   1768 |  23276 |00:00:00.09 |    1319 |       |       |          |
|   5 |      TABLE ACCESS FULL| PART     |      1 |  35344 |  35344 |00:00:00.04 |    1319 |       |       |          |
|   6 |   TABLE ACCESS FULL   | PART     |      1 |  35344 |  35344 |00:00:00.01 |    1429 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("DESCRIPTION"="$nso_col_1" AND "PRODUCT_CODE"="$nso_col_2")
   3 - filter(COUNT(*)>=3)

Without allowing the automatic transformations in Oracle 10.2.0.2, the query takes _much_ longer than 0.21 seconds to complete.

The method using analytical functions starts like this:

SELECT
  P.ID,
  P.DESCRIPTION,
  P.COMMODITY_CODE,
  COUNT(*) OVER (PARTITION BY DESCRIPTION, PRODUCT_CODE) NUM_MATCHES
FROM
  PART P;

Then, sliding the above into an inline view:

SELECT
  ID,
  DESCRIPTION,
  COMMODITY_CODE
FROM
  (SELECT
    P.ID,
    P.DESCRIPTION,
    P.COMMODITY_CODE,
    COUNT(*) OVER (PARTITION BY DESCRIPTION, PRODUCT_CODE) NUM_MATCHES
  FROM
    PART P)
WHERE
  NUM_MATCHES>=3;

The DBMS_XPLAN for the above looks like this:

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|*  1 |  VIEW               |      |      1 |  35344 |  11525 |00:00:00.31 |    1319 |       |       |          |
|   2 |   WINDOW SORT       |      |      1 |  35344 |  35344 |00:00:00.27 |    1319 |  2533K|   726K| 2251K (0)|
|   3 |    TABLE ACCESS FULL| PART |      1 |  35344 |  35344 |00:00:00.04 |    1319 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("NUM_MATCHES">=3)

Note that there is only one TABLE ACCESS FULL of the PART table in the above.  The execution time required 0.31 seconds to complete, which is greater than the first two approaches, but that is because the database server is concurrently still trying to resolve the query method using the subquery with no permitted transformations (5+ minutes later).

Subquery method with no transformations permitted:

---------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|*  1 |  FILTER              |      |      1 |        |  11525 |00:46:21.46 |      38M|
|   2 |   TABLE ACCESS FULL  | PART |      1 |  35344 |  35344 |00:00:00.25 |    1429 |
|*  3 |   FILTER             |      |  29474 |        |   6143 |00:46:06.52 |      38M|
|   4 |    HASH GROUP BY     |      |  29474 |      1 |    613M|00:33:24.30 |      38M|
|   5 |     TABLE ACCESS FULL| PART |  29474 |  35344 |   1041M|00:00:02.54 |      38M|
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)
   3 - filter(("DESCRIPTION"=:B1 AND "PRODUCT_CODE"=:B2 AND COUNT(*)>=3))

Maxim Demenko provided another possible solution for the problem experienced by the original poster.





Update Rows in Another Table with the Help of Analytic Functions

17 12 2009

December 17, 2009

In a recent comp.databases.oracle.misc Usenet thread:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/42c0d1550278250b

The following question was asked:

I need help with a query which involves the 2 tables defined below.  What I need to do is choose the record with the max “Eff Date” from “Table A” for a particular “Emp No.” and update the “Desc” from that record in the field “Desc” of “Table B” for the same “Emp No.”. I am able to choose the max “Eff Date” record for each employee from Table A but somehow not able to updated the same “Desc” in “Table B”.

Request you to please help the query. Any help would be appreciated.
Thanks!

Table A
Emp No. Group   Eff Date        Desc
1234    CI      01/01/1989      X
1234    CI      01/02/2000      X
1234    CI      01/02/2006      A
2345    AF      01/01/1990      X
2345    AF      01/02/2005      A

 

Table B
Emp No. Group   Desc
1234    CI      X
2345    AF      A
3456    CI      A

I provided the following suggestion:

Watch the query and results closely as one possible solution is built (there are other methods):

CREATE TABLE T1 (
  EMP_NO NUMBER,
  GROUPING VARCHAR2(5),
  EFF_DATE DATE,
  DESCR VARCHAR2(5));

CREATE TABLE T2 (
  EMP_NO NUMBER,
  GROUPING VARCHAR2(5),
  DESCR VARCHAR2(5));

INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/01/1989','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2000','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2006','MM/DD/YYYY'),'A');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/01/1990','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/02/2005','MM/DD/YYYY'),'A');

INSERT INTO T2 VALUES (1234,'CI','XNN');
INSERT INTO T2 VALUES (2345,'AF','ANN');
INSERT INTO T2 VALUES (3456,'CI','ANN');

COMMIT;

 

SELECT
  EMP_NO,
  GROUPING,
  DESCR
FROM
  T2;

EMP_NO GROUP DESCR
------ ----- -----
  1234 CI    XNN
  2345 AF    ANN
  3456 CI    ANN

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  DESCR
FROM
  T1;

EMP_NO GROUP EFF_DATE  DESCR
------ ----- --------- -----
  1234 CI    01-JAN-89 X
  1234 CI    02-JAN-00 X
  1234 CI    02-JAN-06 A
  2345 AF    01-JAN-90 X
  2345 AF    02-JAN-05 A

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE
DESC) RN,
  DESCR
FROM
  T1;

EMP_NO GROUP EFF_DATE          RN DESCR
------ ----- --------- ---------- -----
  1234 CI    02-JAN-06          1 A
  1234 CI    02-JAN-00          2 X
  1234 CI    01-JAN-89          3 X
  2345 AF    02-JAN-05          1 A
  2345 AF    01-JAN-90          2 X

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  DESCR
FROM
  (SELECT
    EMP_NO,
    GROUPING,
    EFF_DATE,
    ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
    DESCR
  FROM
    T1)
WHERE
  RN=1;

EMP_NO GROUP EFF_DATE  DESCR
------ ----- --------- -----
  1234 CI    02-JAN-06 A
  2345 AF    02-JAN-05 A

 

UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);

2 rows updated.

 

SELECT
  EMP_NO,
  GROUPING,
  DESCR
FROM
  T2;

EMP_NO GROUP DESCR
------ ----- -----
  1234 CI    A
  2345 AF    A
  3456 CI    ANN

 

Note that in the above, I assumed that the combination of EMP_NO and GROUPING had to be the same.

~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko provided a very different approach to solving the problem that is both compact and impressive:

SQL> merge into t2 t2
   2  using (
   3    select emp_no,grouping,
   4      max(descr) keep(dense_rank last order by eff_date) descr
   5      from t1 group by emp_no,grouping) t1
   6  on (t1.emp_no=t2.emp_no
   7  and t1.grouping=t2.grouping)
   8  when matched then update set t2.descr=t1.descr
   9  ;

2 rows merged.

 

Execution Plan
----------------------------------------------------------
Plan hash value: 3235844370

-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | MERGE STATEMENT        |      |     2 |    16 |     8  (25)| 00:00:01 |
|   1 |  MERGE                 | T2   |       |       |            |          |
|   2 |   VIEW                 |      |       |       |            |          |
|*  3 |    HASH JOIN           |      |     2 |   108 |     8  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL  | T2   |     3 |    99 |     3   (0)| 00:00:01 |
|   5 |     VIEW               |      |     5 |   105 |     4  (25)| 00:00:01 |
|   6 |      SORT GROUP BY     |      |     5 |   150 |     4  (25)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| T1   |     5 |   150 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------­---

Predicate Information (identified by operation id):
---------------------------------------------------
    3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND
               "T1"."GROUPING"="T2"."GROUPING")

~~~~~~~~~~~~~~~~~~~~~~~

Since Maxim provided an execution plan, let’s compare the efficiency of the two methods with a larger test case that uses the same table definitions:

TRUNCATE TABLE T1;
TRUNCATE TABLE T2;

INSERT INTO T1
SELECT
  DECODE(MOD(ROWNUM,10),
         0,0000,
         1,1111,
         2,2222,
         3,3333,
         4,4444,
         5,5555,
         6,6666,
         7,7777,
         8,8888,
         9,9999),
  DECODE(MOD(ROWNUM,6),
         0,'AA',
         1,'BB',
         2,'CC',
         3,'DD',
         4,'EE',
         5,'FF'),
  TRUNC(SYSDATE+SIN(ROWNUM/180*3.141592)*1000),
  UPPER(DBMS_RANDOM.STRING('A',1))
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

INSERT INTO T2
SELECT
  DECODE(MOD(ROWNUM,10),
         0,0000,
         1,1111,
         2,2222,
         3,3333,
         4,4444,
         5,5555,
         6,6,
         7,7,
         8,8,
         9,9),
  DECODE(MOD(ROWNUM,11),
         0,'AA',
         1,'BB',
         2,'CC',
         3,'DD',
         4,'EE',
         5,'FF',
         6,'GG',
         7,'HH',
         8,'II',
         9,'JJ',
         10,'KK'),
  UPPER(DBMS_RANDOM.STRING('A',3))
FROM
  DUAL
CONNECT BY
  LEVEL<=100;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)

 

SET PAGESIZE 1000
SET LINESIZE 150
ALTER SESSION SET STATISTICS_LEVEL='ALL';
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SPOOL C:\CHECKTIMING.TXT

UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SELECT
  *
FROM
  T2
WHERE
  LENGTH(DESCR)=1
ORDER BY
  EMP_NO;

ROLLBACK;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

merge into t2 t2
 using (
   select emp_no,grouping,
     max(descr) keep(dense_rank last order by eff_date) descr
     from t1 group by emp_no,grouping) t1
 on (t1.emp_no=t2.emp_no
 and t1.grouping=t2.grouping)
 when matched then update set t2.descr=t1.descr
 ;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SELECT
  *
FROM
  T2
WHERE
  LENGTH(DESCR)=1
ORDER BY
  EMP_NO;

ROLLBACK;
SPOOL OFF

What is the output of the above?

SQL_ID  8w16pv37zxuh5, child number 0
-------------------------------------
UPDATE   T2 SET   DESCR=(     SELECT       DESCR     FROM       (SELECT
        EMP_NO,         GROUPING,         ROW_NUMBER() OVER (PARTITION
BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,         DESCR
FROM         T1) T1     WHERE       RN=1       AND T1.EMP_NO=T2.EMP_NO
     AND T1.GROUPING=T2.GROUPING) WHERE   (T2.EMP_NO,T2.GROUPING) IN (
   SELECT       EMP_NO,       GROUPING     FROM       (SELECT
EMP_NO,         GROUPING,         ROW_NUMBER() OVER (PARTITION BY
EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,         DESCR       FROM
      T1)     WHERE       RN=1) 

Plan hash value: 2277482977                                                                                                                          

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT            |          |      1 |        |      0 |00:00:06.12 |   59000 |   3282 |      5 |       |       |          |
|   1 |  UPDATE                     | T2       |      1 |        |      0 |00:00:06.12 |   59000 |   3282 |      5 |       |       |          |
|*  2 |   HASH JOIN SEMI            |          |      1 |      1 |     17 |00:00:03.06 |    3289 |   3280 |      5 |   816K|   816K| 1167K (0)|
|   3 |    TABLE ACCESS FULL        | T2       |      1 |    100 |    100 |00:00:00.03 |       7 |      6 |      0 |       |       |          |
|   4 |    VIEW                     | VW_NSO_1 |      1 |   1000K|     30 |00:00:03.03 |    3282 |   3274 |      5 |       |       |          |
|*  5 |     VIEW                    |          |      1 |   1000K|     30 |00:00:03.03 |    3282 |   3274 |      5 |       |       |          |
|*  6 |      WINDOW SORT PUSHED RANK|          |      1 |   1000K|     67 |00:00:03.03 |    3282 |   3274 |      5 | 36864 | 36864 |   25M (1)|
|   7 |       TABLE ACCESS FULL     | T1       |      1 |   1000K|   1000K|00:00:00.01 |    3275 |   3269 |      0 |       |       |          |
|*  8 |   VIEW                      |          |     17 |  16667 |     17 |00:00:03.06 |   55675 |      0 |      0 |       |       |          |
|*  9 |    WINDOW SORT PUSHED RANK  |          |     17 |  16667 |    566K|00:00:02.53 |   55675 |      0 |      0 |  1541K|   615K| 1369K (0)|
|* 10 |     TABLE ACCESS FULL       | T1       |     17 |  16667 |    566K|00:00:01.13 |   55675 |      0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):                                                                                                  
---------------------------------------------------                                                                                                  
   2 - access("T2"."EMP_NO"="EMP_NO" AND "T2"."GROUPING"="GROUPING")
   5 - filter("RN"=1)
   6 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
   8 - filter("RN"=1)
   9 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
  10 - filter(("EMP_NO"=:B1 AND "GROUPING"=:B2))

    EMP_NO GROUP DESCR
---------- ----- -----
         0 CC    W   
         0 EE    F   
      1111 BB    F   
      1111 DD    U   
      1111 FF    Y   
      2222 AA    W   
      2222 CC    T   
      2222 EE    K   
      3333 BB    Z   
      3333 FF    I   
      3333 DD    W   
      4444 AA    Z   
      4444 EE    G   
      4444 CC    S   
      5555 DD    L   
      5555 BB    Y   
      5555 FF    X   

 

SQL_ID  93cj2ck69n4kg, child number 0
-------------------------------------
merge into t2 t2  using (    select emp_no,grouping,      max(descr)
keep(dense_rank last order by eff_date) descr      from t1 group by
emp_no,grouping) t1  on (t1.emp_no=t2.emp_no  and
t1.grouping=t2.grouping)  when matched then update set t2.descr=t1.descr

Plan hash value: 4231777338                                                                                                                          

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT        |      |      1 |        |      1 |00:00:01.53 |    3301 |   3277 |       |       |          |
|   1 |  MERGE                 | T2   |      1 |        |      1 |00:00:01.53 |    3301 |   3277 |       |       |          |
|   2 |   VIEW                 |      |      1 |        |     17 |00:00:01.50 |    3282 |   3275 |       |       |          |
|*  3 |    HASH JOIN           |      |      1 |     43 |     17 |00:00:01.50 |    3282 |   3275 |   921K|   921K| 1181K (0)|
|   4 |     VIEW               |      |      1 |     43 |     30 |00:00:01.47 |    3275 |   3269 |       |       |          |
|   5 |      SORT GROUP BY     |      |      1 |     43 |     30 |00:00:01.47 |    3275 |   3269 | 73728 | 73728 |          |
|   6 |       TABLE ACCESS FULL| T1   |      1 |   1000K|   1000K|00:00:00.01 |    3275 |   3269 |       |       |          |
|   7 |     TABLE ACCESS FULL  | T2   |      1 |    100 |    100 |00:00:00.03 |       7 |      6 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):                                                                                                  
---------------------------------------------------                                                                                                  
   3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND "T1"."GROUPING"="T2"."GROUPING") 

    EMP_NO GROUP DESCR
---------- ----- -----
         0 CC    Z   
         0 EE    Z   
      1111 BB    Z   
      1111 DD    X   
      1111 FF    Z   
      2222 AA    Z   
      2222 CC    Z   
      2222 EE    Z   
      3333 BB    Z   
      3333 FF    Z   
      3333 DD    Z   
      4444 AA    Z   
      4444 EE    Z   
      4444 CC    Z   
      5555 DD    Z   
      5555 BB    Z   
      5555 FF    Z   

From the point of view of performance, Maxim’s solution is a clear winner.  It is interesting to note that the value of the DESCR column in table T2 differs for the two approaches.





SQL – Bad Execution Plan caused by ROWNUM, ROW_NUMBER is Possible Fix

9 12 2009

December 9, 2009

A recent discussion in the OTN forums demonstrated a cardinality calculation problem caused by using ROWNUM as a predicate in the WHERE clause (this specific issue was corrected in 11.2.0.1):
http://forums.oracle.com/forums/thread.jspa?threadID=934895&start=0

The person who started the OTN thread obviously knew how to ask a question, provide relevant information, and analyze the problem (essentially a model of a good request for assistance):

I’m having a couple of issues with a query, and I can’t figure out the best way to reach a solution.

Platform Information
Windows Server 2003 R2
Oracle 10.2.0.4

Optimizer Settings

SQL > show parameter optimizer

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     2
optimizer_features_enable            string      10.2.0.4
optimizer_index_caching              integer     90
optimizer_index_cost_adj             integer     30
optimizer_mode                       string      ALL_ROWS
optimizer_secure_view_merging        boolean     TRUE

The query below, is a simple “Top N” query, where the top result is returned. Here it is, with bind variables in the same location as the application code:

SELECT PRODUCT_DESC
FROM
(
 SELECT PRODUCT_DESC
 , COUNT(*) AS CNT
 FROM USER_VISITS 
 JOIN PRODUCT ON PRODUCT.PRODUCT_OID = USER_VISITS.PRODUCT_OID
 WHERE PRODUCT.PRODUCT_DESC != 'Home' 
 AND VISIT_DATE
  BETWEEN
   ADD_MONTHS    
   (
    TRUNC    
    (
     TO_DATE   
     (
      :vCurrentYear
     , 'YYYY'
     )
    , 'YEAR'
    )
   , 3*(:vCurrentQuarter-1)
   )
  AND
   ADD_MONTHS    
   (
    TRUNC    
    (
     TO_DATE   
     (
      :vCurrentYear
     , 'YYYY'
     )
    , 'YEAR'
    )
   , 3*:vCurrentQuarter
   ) - INTERVAL '1' DAY   
 GROUP BY PRODUCT_DESC
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1;

The explain plan I receive when running the query above.

| Id  | Operation                         | Name                          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                               |      1 |        |      1 |00:00:34.92 |   66343 |       |       |          |
|   2 |   VIEW                            |                               |      1 |      1 |      1 |00:00:34.92 |   66343 |       |       |          |
|*  3 |    FILTER                         |                               |      1 |        |      1 |00:00:34.92 |   66343 |       |       |          |
|   4 |     SORT ORDER BY                 |                               |      1 |      1 |      1 |00:00:34.92 |   66343 |  2048 |  2048 | 2048  (0)|
|   5 |      SORT GROUP BY NOSORT         |                               |      1 |      1 |     27 |00:00:34.92 |   66343 |       |       |          |
|   6 |       NESTED LOOPS                |                               |      1 |      2 |  12711 |00:00:34.90 |   66343 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| PRODUCT                       |      1 |     74 |     77 |00:00:00.01 |      44 |       |       |          |
|*  8 |         INDEX FULL SCAN           | PRODUCT_PRODDESCHAND_UNQ      |      1 |      1 |     77 |00:00:00.01 |       1 |       |       |          |
|*  9 |        INDEX FULL SCAN            | USER_VISITS#PK                |     77 |      2 |  12711 |00:00:34.88 |   66299 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=1)
   3 - filter(ADD_MONTHS(TRUNC(TO_DATE(TO_CHAR(:VCURRENTYEAR),'YYYY'),'fmyear'),3*(:VCURRENTQUARTER-1))<=ADD_MONTHS(TRUNC(TO_DATE(TO_CHAR(:VCURR
              ENTYEAR),'YYYY'),'fmyear'),3*:VCURRENTQUARTER)-INTERVAL'+01 00:00:00' DAY(2) TO SECOND(0))
   8 - filter("PRODUCT"."PRODUCT_DESC"<>'Home')
   9 - access("USER_VISITS"."VISIT_DATE">=ADD_MONTHS(TRUNC(TO_DATE(TO_CHAR(:VCURRENTYEAR),'YYYY'),'fmyear'),3*(:VCURRENTQUARTER-1)) AND
              "USER_VISITS"."PRODUCT_OID"="PRODUCT"."PRODUCT_OID" AND "USER_VISITS"."VISIT_DATE"<=ADD_MONTHS(TRUNC(TO_DATE(TO_CHAR(:VCURRENTYEAR),'YYYY')
              ,'fmyear'),3*:VCURRENTQUARTER)-INTERVAL'+01 00:00:00' DAY(2) TO SECOND(0))
       filter(("USER_VISITS"."VISIT_DATE">=ADD_MONTHS(TRUNC(TO_DATE(TO_CHAR(:VCURRENTYEAR),'YYYY'),'fmyear'),3*(:VCURRENTQUARTER-1)) AND
              "USER_VISITS"."VISIT_DATE"<=ADD_MONTHS(TRUNC(TO_DATE(TO_CHAR(:VCURRENTYEAR),'YYYY'),'fmyear'),3*:VCURRENTQUARTER)-INTERVAL'+01 00:00:00' DAY(2)
              TO SECOND(0) AND "USER_VISITS"."PRODUCT_OID"="PRODUCT"."PRODUCT_OID"))

TKPROF Row Source Generation

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.01          0          0          0           0
Fetch        2     35.10      35.13          0      66343          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     35.10      35.14          0      66343          0           1

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 62 

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  COUNT STOPKEY (cr=66343 pr=0 pw=0 time=35132008 us)
      1   VIEW  (cr=66343 pr=0 pw=0 time=35131996 us)
      1    FILTER  (cr=66343 pr=0 pw=0 time=35131991 us)
      1     SORT ORDER BY (cr=66343 pr=0 pw=0 time=35131936 us)
     27      SORT GROUP BY NOSORT (cr=66343 pr=0 pw=0 time=14476309 us)
  12711       NESTED LOOPS  (cr=66343 pr=0 pw=0 time=22921810 us)
     77        TABLE ACCESS BY INDEX ROWID PRODUCT (cr=44 pr=0 pw=0 time=3674 us)
     77         INDEX FULL SCAN PRODUCT_PRODDESCHAND_UNQ (cr=1 pr=0 pw=0 time=827 us)(object id 52355)
  12711        INDEX FULL SCAN USER_VISITS#PK (cr=66299 pr=0 pw=0 time=44083746 us)(object id 52949)

However when I run the query with an ALL_ROWS hint I receive this explain plan (reasoning for this can be found here Jonathan’s Lewis’ response: http://www.freelists.org/post/oracle-l/ORDER-BY-and-first-rows-10-madness,4):

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                |     1 |    39 |   223  (25)| 00:00:03 |
|*  1 |  COUNT STOPKEY             |                |       |       |            |          |
|   2 |   VIEW                     |                |     1 |    39 |   223  (25)| 00:00:03 |
|*  3 |    FILTER                  |                |       |       |            |          |
|   4 |     SORT ORDER BY          |                |     1 |    49 |   223  (25)| 00:00:03 |
|   5 |      HASH GROUP BY         |                |     1 |    49 |   223  (25)| 00:00:03 |
|*  6 |       HASH JOIN            |                |   490 | 24010 |   222  (24)| 00:00:03 |
|*  7 |        TABLE ACCESS FULL   | PRODUCT        |    77 |  2849 |     2   (0)| 00:00:01 |
|*  8 |        INDEX FAST FULL SCAN| USER_VISITS#PK |   490 |  5880 |   219  (24)| 00:00:03 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=1)
   3 - filter(ADD_MONTHS(TRUNC(TO_DATE(:VCURRENTYEAR,'YYYY'),'fmyear'),3*(TO_NUMBER(:
              VCURRENTQUARTER)-1))<=ADD_MONTHS(TRUNC(TO_DATE(:VCURRENTYEAR,'YYYY'),'fmyear'),3*TO_N
              UMBER(:VCURRENTQUARTER))-INTERVAL'+01 00:00:00' DAY(2) TO SECOND(0))
   6 - access("USER_VISITS"."PRODUCT_OID"="PRODUCT"."PRODUCT_OID")
   7 - filter("PRODUCT"."PRODUCT_DESC"<>'Home')
   8 - filter("USER_VISITS"."VISIT_DATE">=ADD_MONTHS(TRUNC(TO_DATE(:VCURRENTYEAR,'YYY
              Y'),'fmyear'),3*(TO_NUMBER(:VCURRENTQUARTER)-1)) AND
              "USER_VISITS"."VISIT_DATE"<=ADD_MONTHS(TRUNC(TO_DATE(:VCURRENTYEAR,'YYYY'),'fmyear'),
              3*TO_NUMBER(:VCURRENTQUARTER))-INTERVAL'+01 00:00:00' DAY(2) TO SECOND(0))

And the TKPROF Row Source Generation:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        3      0.51       0.51          0        907          0          27
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.51       0.51          0        907          0          27

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 62 

Rows     Row Source Operation
-------  ---------------------------------------------------
     27  FILTER  (cr=907 pr=0 pw=0 time=513472 us)
     27   SORT ORDER BY (cr=907 pr=0 pw=0 time=513414 us)
     27    HASH GROUP BY (cr=907 pr=0 pw=0 time=512919 us)
  12711     HASH JOIN  (cr=907 pr=0 pw=0 time=641130 us)
     77      TABLE ACCESS FULL PRODUCT (cr=5 pr=0 pw=0 time=249 us)
  22844      INDEX FAST FULL SCAN USER_VISITS#PK (cr=902 pr=0 pw=0 time=300356 us)(object id 52949)

The query with the ALL_ROWS hint returns data instantly, while the other one takes about 70 times as long.

Interestingly enough BOTH queries generate plans with estimates that are WAY off. The first plan is estimating 2 rows, while the second plan is estimating 490 rows. However the real number of rows is correctly reported in the Row Source Generation as 12711 (after the join operation).

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
USER_VISITS                        196044       1049

INDEX_NAME                         BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR LAST_ANALYZED
------------------------------ ---------- ----------- ------------- ----------------- -------------------
USER_VISITS#PK                          2         860        196002          57761 07/24/2009 13:17:59

COLUMN_NAME                    NUM_DISTINCT LOW_VALUE            HIGH_VALUE                                 DENSITY     NUM_NULLS HISTOGRAM
------------------------------ ------------ -------------------- -------------------- -------------------------------- ---------- ---------------
VISIT_DATE                           195900 786809010E0910       786D0609111328                      .0000051046452272          0 NONE

I don’t know how the first one is estimating 2 rows, but I can compute the second’s cardinality estimates by assuming a 5% selectivity for the TO_DATE() functions:

SQL > SELECT ROUND(0.05*0.05*196044) FROM DUAL;

ROUND(0.05*0.05*196044)
-----------------------
                    490

However, removing the bind variables (and clearing the shared pool), does not change the cardinality estimates at all. I would like to avoid hinting this plan if possible and that is why I’m looking for advice. I also have a followup question.

Per one of Jonathan Lewis’ articles, adding a ROWNUM predicate in the WHERE clause causes the optimizer to switch to a FIRST ROWS n optimizer mode, as if a /*+ FIRST_ROWS(n) */ hint were provided for the query. To work around this issue, changing the query to use a ROW_NUMBER analytic function rather than the ROWNUM pseudo column could prevent the optimizer from switching to first rows optimization.

Another question you might ask is if bind peeking enabled? It is enabled by default on Oracle 10g (and 9i). Are there automatically generated histograms on any of the columns in the WHERE clause? Do any of the tables involved have out of date statistics (or cases where the statistics as of 10 PM do not match the statistics as of the current time)? Is it possible to examine 10053 trace files for the SQL statement?

Testing with either a CARDINALITY or OPT_ESTIMATE hint to see if the execution plan changes dramatically to improve performance is another possible area of investigation. The question then becomes whether this will be sufficient to over-rule the first rows optimizer so that it does not use an index access which will avoid a sort.

There may also be a problem with the accuracy of the EXPLAIN PLAN in SQL*Plus due to the bind variables in the SQL statement. If I recall correctly, explain plan does not peek at bind variables, as happens when the query is actually executed (but that may have been a 9i problem, where 10g uses DBMS_XPLAN for explain plans), and that SQL*Plus passes in numeric bind variables as if they are VARCHAR2. (I have an example in 11.1.0.7 which shows that this might not be the case).

The original poster provided a great test case to demonstrate what is happening:

DDL/DML

CREATE TABLE TEST_AGGR
(
 ID NUMBER
,  VAL DATE
,  PAD VARCHAR2(500)
,  ID2 NUMBER
);

CREATE TABLE TEST_JOIN
(
        ID2 NUMBER
,       PAD VARCHAR2(5)
,       PROD VARCHAR2(15)
);

CREATE SEQUENCE TEST_SEQUENCE;

INSERT INTO TEST_AGGR
SELECT TEST_SEQUENCE.NEXTVAL
, SYSDATE + DBMS_RANDOM.VALUE*100
, RPAD('X',499,'0')
, MOD(ROWNUM,78)
FROM DUAL
CONNECT BY ROWNUM <= 195000;

INSERT INTO TEST_JOIN
SELECT  ID2
,  
 (CASE
  WHEN ROWNUM IN (3,5) THEN 'Y'
  ELSE RPAD('X',4,'0')
 END)
, (CASE
  WHEN ROWNUM = 1 THEN 'test_prod'
  WHEN ROWNUM BETWEEN 2 AND 3 THEN 'DUPLICATE A1'
  WHEN ROWNUM BETWEEN 4 AND 5 THEN 'A DUPLICATE A2'
  ELSE UPPER(DBMS_RANDOM.STRING('A',15)) END)
FROM  (SELECT DISTINCT ID2 FROM TEST_AGGR);

CREATE INDEX TEST_AGGR_IDX ON TEST_AGGR(ID,VAL,ID2);
CREATE UNIQUE INDEX TEST_JOIN_UIDX ON TEST_JOIN(PROD,PAD);

 Statistics Gathering

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'TEST_AGGR',cascade=>true);
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'TEST_JOIN',cascade=>true);

Bind Variable Setup

var a VARCHAR2(25);
var b VARCHAR2(25);

exec :a := '08/01/2009';
exec :b := '08/27/2009';

Non-hinted Query

SELECT /*repeatable_case_nohint*/ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1

 Non-hinted XPLAN Output

SQL_ID  9smb486cd31b2, child number 0
-------------------------------------
SELECT /*repeatable_case_nohint*/ PROD FROM (  SELECT TEST_JOIN.PROD  , COUNT(*) CNT  FROM TEST_AGGR  JOIN TEST_JOIN ON
TEST_JOIN.ID2 = TEST_AGGR.ID2  WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')  AND PROD <> 'TEST_PROD'
GROUP BY TEST_JOIN.PROD  ORDER BY CNT DESC ) WHERE ROWNUM <= 1

Plan hash value: 301623847

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                |      1 |        |      1 |00:00:07.33 |   57334 |       |       |          |
|   2 |   VIEW                            |                |      1 |      2 |      1 |00:00:07.33 |   57334 |       |       |          |
|*  3 |    FILTER                         |                |      1 |        |      1 |00:00:07.33 |   57334 |       |       |          |
|   4 |     SORT ORDER BY                 |                |      1 |      2 |      1 |00:00:07.33 |   57334 | 73728 | 73728 |          |
|   5 |      SORT GROUP BY NOSORT         |                |      1 |      2 |     76 |00:00:07.33 |   57334 |       |       |          |
|   6 |       NESTED LOOPS                |                |      1 |      2 |  50637 |00:00:07.24 |   57334 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| TEST_JOIN      |      1 |     77 |     78 |00:00:00.01 |    2 |          |       |          |
|*  8 |         INDEX FULL SCAN           | TEST_JOIN_UIDX |      1 |      1 |     78 |00:00:00.01 |    1 |          |       |          |
|*  9 |        INDEX FULL SCAN            | TEST_AGGR_IDX  |     78 |      2 |  50637 |00:00:07.15 |   57332 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=1)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   8 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   9 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))

Hinted Query

SELECT /*+ ALL_ROWS repeatable_case_hint*/ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1

Hinted XPLAN Output

SQL_ID  283wx8s0d04kn, child number 0
-------------------------------------
SELECT /*+ ALL_ROWS repeatable_case_hint*/ PROD FROM (  SELECT TEST_JOIN.PROD  , COUNT(*) CNT  FROM TEST_AGGR  JOIN
TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2  WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')  AND
PROD <> 'TEST_PROD'  GROUP BY TEST_JOIN.PROD  ORDER BY CNT DESC ) WHERE ROWNUM <= 1

Plan hash value: 828673535

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
---------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY             |               |      1 |        |      1 |00:00:00.34 |     772 |       |       |          |
|   2 |   VIEW                     |               |      1 |     75 |      1 |00:00:00.34 |     772 |       |       |          |
|*  3 |    FILTER                  |               |      1 |        |      1 |00:00:00.34 |     772 |       |       |          |
|   4 |     SORT ORDER BY          |               |      1 |     75 |      1 |00:00:00.34 |     772 | 73728 | 73728 |          |
|   5 |      HASH GROUP BY         |               |      1 |     75 |     76 |00:00:00.34 |     772 |   808K|   808K|     1/0/0|
|*  6 |       HASH JOIN            |               |      1 |  50069 |  50637 |00:00:00.26 |     772 |   842K|   842K|     1/0/0|
|*  7 |        TABLE ACCESS FULL   | TEST_JOIN     |      1 |     77 |     78 |00:00:00.01 |       3 |       |       |          |
|*  8 |        INDEX FAST FULL SCAN| TEST_AGGR_IDX |      1 |  50737 |  50637 |00:00:00.10 |     769 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=1)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   6 - access("TEST_JOIN"."ID2"="TEST_AGGR"."ID2")
   7 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   8 - filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY')))

Observations

The cardinality estimates in the non-hinted plan are way, way off. However, in the hinted plan the cardinalities are nearly correct. In the non-hinted plan the cardinality estimates are incorrect at steps 8 and 9.

I tried to work out the cardinality estimate for the access of TEST_JOIN_UIDX by first calculating the “Effective Index Selectivity” based on the following data:

TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT    DENSITY  NUM_NULLS
------------------------------ ------------------------------ ------------ ---------- ----------
TEST_JOIN                      PROD                                     76 .013157895          0

Since in step 8 of the plan we are filtering based on a != condition we want ALL the rows except that one. Since the number of rows in this roughly equal the number of distinct values I calculated a rough effective index selectivity estimate of:

(1-0.13157895)*78 = 67.7368419

This is right in line with a logical understanding of the data.

Based on the following data:

TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------------------------ ------------
TEST_AGGR                      VAL                                  194588

I would expect the following cardinality out of the TEST_AGGR_IDX index:

SQL > SELECT (TO_DATE(:b,'MM/DD/YYYY')-TO_DATE(:a,'MM/DD/YYYY'))/(MAX(VAL)-MIN(VAL))+2/194588 AS SELECTIVITY FROM TEST_AGGR;

SELECTIVITY
-----------
 .260014672

SQL > SELECT .260014672*195128 AS CARDINALITY FROM DUAL;

CARDINALITY
-----------
 50736.1429

This almost exactly matches the HINTED plan (error due to rounding).

Based on the estimates in the non-hinted plan I would have expected a 5% * 5% to occur but my calculations don’t match:

SQL > select .05*.05*(1/78)*195128 AS CARDINALITY from dual;

CARDINALITY
-----------
 6.25410256

After playing with the test case for a little while, I recalled seeing this behavior in the past with ROWNUM in the WHERE clause.  What is happening is that the ROWNUM<=1 predicate is being pushed into the inline view. Take a look (at the E-Rows column):

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD <> 'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                |      1 |        |      1 |00:00:02.84 |   57178 |       |       |          |
|   2 |   VIEW                            |                |      1 |      2 |      1 |00:00:02.84 |   57178 |       |       |          |
|*  3 |    FILTER                         |                |      1 |        |      1 |00:00:02.84 |   57178 |       |       |          |
|   4 |     SORT ORDER BY                 |                |      1 |      2 |      1 |00:00:02.84 |   57178 | 73728 | 73728 |          |
|   5 |      SORT GROUP BY NOSORT         |                |      1 |      2 |     76 |00:00:02.80 |   57178 |       |       |          |
|   6 |       NESTED LOOPS                |                |      1 |      2 |  50596 |00:00:03.04 |   57178 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| TEST_JOIN      |      1 |     77 |     78 |00:00:00.01 |       2 |       |       |          |
|*  8 |         INDEX FULL SCAN           | TEST_JOIN_UIDX |      1 |      1 |     78 |00:00:00.01 |       1 |       |       |          |
|*  9 |        INDEX FULL SCAN            | TEST_AGGR_IDX  |     78 |      2 |  50596 |00:00:02.88 |   57176 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=1)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   8 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   9 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 2;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                |      1 |        |      2 |00:00:02.70 |   57178 |       |       |          |
|   2 |   VIEW                            |                |      1 |      3 |      2 |00:00:02.70 |   57178 |       |       |          |
|*  3 |    FILTER                         |                |      1 |        |      2 |00:00:02.70 |   57178 |       |       |          |
|   4 |     SORT ORDER BY                 |                |      1 |      3 |      2 |00:00:02.70 |   57178 |  9216 |  9216 | 8192  (0)|
|   5 |      SORT GROUP BY NOSORT         |                |      1 |      3 |     76 |00:00:02.73 |   57178 |       |       |          |
|   6 |       NESTED LOOPS                |                |      1 |      3 |  50596 |00:00:02.88 |   57178 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| TEST_JOIN      |      1 |     77 |     78 |00:00:00.01 |       2 |       |       |          |
|*  8 |         INDEX FULL SCAN           | TEST_JOIN_UIDX |      1 |      1 |     78 |00:00:00.01 |       1 |       |       |          |
|*  9 |        INDEX FULL SCAN            | TEST_AGGR_IDX  |     78 |      3 |  50596 |00:00:02.73 |   57176 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=2)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   8 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   9 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 3;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                |      1 |        |      3 |00:00:02.71 |   57178 |       |       |          |
|   2 |   VIEW                            |                |      1 |      4 |      3 |00:00:02.71 |   57178 |       |       |          |
|*  3 |    FILTER                         |                |      1 |        |      3 |00:00:02.71 |   57178 |       |       |          |
|   4 |     SORT ORDER BY                 |                |      1 |      4 |      3 |00:00:02.71 |   57178 |  9216 |  9216 | 8192  (0)|
|   5 |      SORT GROUP BY NOSORT         |                |      1 |      4 |     76 |00:00:02.67 |   57178 |       |       |          |
|   6 |       NESTED LOOPS                |                |      1 |      4 |  50596 |00:00:02.93 |   57178 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| TEST_JOIN      |      1 |     77 |     78 |00:00:00.01 |       2 |       |       |          |
|*  8 |         INDEX FULL SCAN           | TEST_JOIN_UIDX |      1 |      1 |     78 |00:00:00.01 |       1 |       |       |          |
|*  9 |        INDEX FULL SCAN            | TEST_AGGR_IDX  |     78 |      4 |  50596 |00:00:02.83 |   57176 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=3)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   8 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   9 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 50;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                |      1 |        |     50 |00:00:02.72 |   57178 |       |       |          |
|   2 |   VIEW                            |                |      1 |     51 |     50 |00:00:02.72 |   57178 |       |       |          |
|*  3 |    FILTER                         |                |      1 |        |     50 |00:00:02.72 |   57178 |       |       |          |
|   4 |     SORT ORDER BY                 |                |      1 |     51 |     50 |00:00:02.72 |   57178 |  9216 |  9216 | 8192  (0)|
|   5 |      SORT GROUP BY NOSORT         |                |      1 |     51 |     76 |00:00:02.65 |   57178 |       |       |          |
|   6 |       NESTED LOOPS                |                |      1 |     51 |  50596 |00:00:02.88 |   57178 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| TEST_JOIN      |      1 |     77 |     78 |00:00:00.01 |       2 |       |       |          |
|*  8 |         INDEX FULL SCAN           | TEST_JOIN_UIDX |      1 |      1 |     78 |00:00:00.01 |       1 |       |       |          |
|*  9 |        INDEX FULL SCAN            | TEST_AGGR_IDX  |     78 |     51 |  50596 |00:00:02.78 |   57176 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=50)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   8 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   9 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 74;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY                    |                |      1 |        |   74 |00:00:02.71 |   57178 |       |       |          |
|   2 |   VIEW                            |                |      1 |     75 |   74 |00:00:02.71 |   57178 |       |       |          |
|*  3 |    FILTER                         |                |      1 |        |   74 |00:00:02.71 |   57178 |       |       |          |
|   4 |     SORT ORDER BY                 |                |      1 |     75 |   74 |00:00:02.71 |   57178 |  9216 |  9216 | 8192  (0)|
|   5 |      SORT GROUP BY NOSORT         |                |      1 |     75 |   76 |00:00:02.66 |   57178 |       |       |          |
|   6 |       NESTED LOOPS                |                |      1 |     75 |50596 |00:00:02.93 |   57178 |       |       |          |
|   7 |        TABLE ACCESS BY INDEX ROWID| TEST_JOIN      |      1 |     77 |   78 |00:00:00.01 |       2 |       |       |          |
|*  8 |         INDEX FULL SCAN           | TEST_JOIN_UIDX |      1 |      1 |   78 |00:00:00.01 |       1 |       |       |          |
|*  9 |        INDEX FULL SCAN            | TEST_AGGR_IDX  |     78 |     75 |50596 |00:00:02.83 |   57176 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=74)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   8 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   9 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 75;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
|*  1 |  COUNT STOPKEY          |               |      1 |        |     75 |00:00:00.06 |     741 |       |       |          |
|   2 |   VIEW                  |               |      1 |     75 |     75 |00:00:00.06 |     741 |       |       |          |
|*  3 |    FILTER               |               |      1 |        |     75 |00:00:00.06 |     741 |       |       |          |
|   4 |     SORT ORDER BY       |               |      1 |     75 |     75 |00:00:00.06 |     741 |  9216 |  9216 | 8192  (0)|
|   5 |      HASH GROUP BY      |               |      1 |     75 |     76 |00:00:00.06 |     741 |   963K|   963K| 2375K (0)|
|*  6 |       HASH JOIN         |               |      1 |  50074 |  50596 |00:00:00.30 |     741 |  1035K|  1035K| 1164K (0)|
|*  7 |        TABLE ACCESS FULL| TEST_JOIN     |      1 |     77 |     78 |00:00:00.01 |       7 |       |       |          |
|*  8 |        INDEX FULL SCAN  | TEST_AGGR_IDX |      1 |  50742 |  50596 |00:00:00.10 |     734 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=75)
   3 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   6 - access("TEST_JOIN"."ID2"="TEST_AGGR"."ID2")
   7 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   8 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY')))

Compare the above output with what follows from Oracle 11.2.0.1, with the following parameters set:

ALTER SESSION SET optimizer_index_caching=0;
ALTER SESSION SET optimizer_index_cost_adj=100;
ALTER SESSION SET TRACEFILE_IDENTIFIER='optimizer_test100';
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |               |      1 |        |      1 |00:00:00.03 |     747 |       |       |          |
|*  1 |  COUNT STOPKEY             |               |      1 |        |      1 |00:00:00.03 |     747 |       |       |          |
|   2 |   VIEW                     |               |      1 |     75 |      1 |00:00:00.03 |     747 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY   |               |      1 |     75 |      1 |00:00:00.03 |     747 | 73728 | 73728 |          |
|   4 |     HASH GROUP BY          |               |      1 |     75 |     76 |00:00:00.03 |     747 |   964K|   964K| 2491K (0)|
|*  5 |      FILTER                |               |      1 |        |  44832 |00:00:00.06 |     747 |       |       |          |
|*  6 |       HASH JOIN            |               |      1 |  44262 |  44832 |00:00:00.05 |     747 |  1035K|  1035K| 1265K (0)|
|*  7 |        TABLE ACCESS FULL   | TEST_JOIN     |      1 |     77 |     78 |00:00:00.01 |       3 |       |       |          |
|*  8 |        INDEX FAST FULL SCAN| TEST_AGGR_IDX |      1 |  44852 |  44832 |00:00:00.02 |     744 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=1)
   3 - filter(ROWNUM<=1)
   5 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   6 - access("TEST_JOIN"."ID2"="TEST_AGGR"."ID2")
   7 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   8 - filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY')))

– 

ALTER SESSION SET optimizer_index_caching=90;
ALTER SESSION SET optimizer_index_cost_adj=30;
ALTER SESSION SET TRACEFILE_IDENTIFIER='optimizer_test30';

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |               |      1 |        |      1 |00:00:00.03 |     747 |       |       |          |
|*  1 |  COUNT STOPKEY             |               |      1 |        |      1 |00:00:00.03 |     747 |       |       |          |
|   2 |   VIEW                     |               |      1 |     75 |      1 |00:00:00.03 |     747 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY   |               |      1 |     75 |      1 |00:00:00.03 |     747 | 73728 | 73728 |          |
|   4 |     HASH GROUP BY          |               |      1 |     75 |     76 |00:00:00.03 |     747 |   964K|   964K| 2501K (0)|
|*  5 |      FILTER                |               |      1 |        |  44832 |00:00:00.06 |     747 |       |       |          |
|*  6 |       HASH JOIN            |               |      1 |  44262 |  44832 |00:00:00.05 |     747 |  1035K|  1035K| 1286K (0)|
|*  7 |        TABLE ACCESS FULL   | TEST_JOIN     |      1 |     77 |     78 |00:00:00.01 |       3 |       |       |          |
|*  8 |        INDEX FAST FULL SCAN| TEST_AGGR_IDX |      1 |  44852 |  44832 |00:00:00.02 |     744 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=1)
   3 - filter(ROWNUM<=1)
   5 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   6 - access("TEST_JOIN"."ID2"="TEST_AGGR"."ID2")
   7 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   8 - filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY')))

 

ALTER SESSION SET optimizer_index_caching=100;
ALTER SESSION SET optimizer_index_cost_adj=1;
ALTER SESSION SET TRACEFILE_IDENTIFIER='optimizer_test1';

 

SELECT /*+ gather_plan_statistics */ PROD
FROM
(
 SELECT TEST_JOIN.PROD
 , COUNT(*) CNT
 FROM TEST_AGGR
 JOIN TEST_JOIN ON TEST_JOIN.ID2 = TEST_AGGR.ID2
 WHERE VAL BETWEEN TO_DATE(:a,'MM/DD/YYYY') AND TO_DATE(:b,'MM/DD/YYYY')
 AND PROD  'TEST_PROD'
 GROUP BY TEST_JOIN.PROD
 ORDER BY CNT DESC
)
WHERE ROWNUM <= 1;

select * from TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |               |      1 |        |      1 |00:00:01.16 |   57034 |       |       |          |
|*  1 |  COUNT STOPKEY          |               |      1 |        |      1 |00:00:01.16 |   57034 |       |       |          |
|   2 |   VIEW                  |               |      1 |     75 |      1 |00:00:01.16 |   57034 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY|               |      1 |     75 |      1 |00:00:01.16 |   57034 | 73728 | 73728 |          |
|   4 |     HASH GROUP BY       |               |      1 |     75 |     76 |00:00:01.16 |   57034 |   964K|   964K| 2506K (0)|
|*  5 |      FILTER             |               |      1 |        |  44832 |00:00:01.17 |   57034 |       |       |          |
|   6 |       NESTED LOOPS      |               |      1 |  44262 |  44832 |00:00:01.16 |   57034 |       |       |          |
|*  7 |        TABLE ACCESS FULL| TEST_JOIN     |      1 |     77 |     78 |00:00:00.01 |       3 |       |       |          |
|*  8 |        INDEX FULL SCAN  | TEST_AGGR_IDX |     78 |    575 |  44832 |00:00:01.14 |   57031 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=1)
   3 - filter(ROWNUM<=1)
   5 - filter(TO_DATE(:A,'MM/DD/YYYY')<=TO_DATE(:B,'MM/DD/YYYY'))
   7 - filter("TEST_JOIN"."PROD"<>'TEST_PROD')
   8 - access("TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND "TEST_JOIN"."ID2"="TEST_AGGR"."ID2" AND
              "TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY'))
       filter(("TEST_AGGR"."VAL"<=TO_DATE(:B,'MM/DD/YYYY') AND "TEST_AGGR"."VAL">=TO_DATE(:A,'MM/DD/YYYY') AND
              "TEST_JOIN"."ID2"="TEST_AGGR"."ID2"))




SQL – ROW_NUMBER, MOD, Even Distribution

9 12 2009

December 9, 2009

A question appeared in the comp.databases.oracle.server Usenet group a couple years ago:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/41ee30267e646e10

Say I have a bunch of bowling players of different skill level as indicated by his avg_score in the table below.

I need to allot them into n teams (say 8), of equivalent strength on the TEAM level so no team ends up with mostly high-scorers and vic-versa.
(let’s say players may not be evenly divided into teams because n numbers are “sick”)

Is there a way to do to this ?

10gR2> create table players (id integer primary key, avg_score number, team_no integer);

10gR2> desc players
Name      Type
--------- -------
ID        INTEGER
AVG_SCORE NUMBER
TEAM_NO   INTEGER

 

10gR2> BEGIN
  2    FOR i IN 1..120
  3    LOOP
  4        INSERT INTO players (id, avg_score) VALUES(i,round(dbms_random.value(75,295)));
  5    END LOOP;
  6  END ;
  7  /

Needs work, but may be enough to get you started:

SELECT
  ID,
  AVG_SCORE,
  ROW_NUMBER() OVER (ORDER BY AVG_SCORE) RANKING,
  COUNT(*) OVER (PARTITION BY 1) ROWS_COUNT
FROM
  PLAYERS;

        ID  AVG_SCORE    RANKING ROWS_COUNT
---------- ---------- ---------- ----------
        74         78          1        120
        91         82          2        120
        95         83          3        120
        77         86          4        120
        61         87          5        120
        23         87          6        120
         1         90          7        120
        67         91          8        120
        62         97          9        120
        33         98         10        120
...
        88        271        111        120
        41        272        112        120
       104        274        113        120
        32        275        114        120
        36        275        115        120
        99        276        116        120
        71        277        117        120
        31        285        118        120
         3        286        119        120
       113        288        120        120

If we were to take the people at rank 1 and rank 120, they would have roughly the same average as the people at rank 2 and rank 119, and they would have roughly the same average as the people at rank 3 and 118, etc.  This does not work exactly as planned as the number of people must be evenly divisible by 2 * the number of groups, and this is not the case with 120 people and 8 groups.

We can have Oracle skip from 1 to 9 to 17 to … by using the MOD function, but we must recognize the mid-point so that we can switch the formula.

By sliding the above into an inline view, we can perform the analysis that is required.  I included three additional columns to help determine whether or not the formula is close:

SELECT
  ID,
  AVG_SCORE,
  DECODE(SIGN(RANKING-(ROWS_COUNT/2)),-1,MOD(RANKING-1,8)+1,((8-1)-MOD(RANKING-(8/2),8))+1) TEAM_NO,
  RANKING,
  SUM(AVG_SCORE) OVER (PARTITION BY DECODE(SIGN(RANKING-(ROWS_COUNT/2)),-1,MOD(RANKING-1,8)+1,((8-1)-MOD(RANKING-(8/2),8))+1)) TEAM_AVG,
  COUNT(*) OVER (PARTITION BY DECODE(SIGN(RANKING-(ROWS_COUNT/2)),-1,MOD(RANKING-1,8)+1,((8-1)-MOD(RANKING-(8/2),8))+1)) NUM_TEAM_MEMBERS
FROM
  (SELECT
    ID,
    AVG_SCORE,
    ROW_NUMBER() OVER (ORDER BY AVG_SCORE) RANKING,
    COUNT(*) OVER (PARTITION BY 1) ROWS_COUNT
  FROM
    PLAYERS)
ORDER BY
  RANKING;

        ID  AVG_SCORE    TEAM_NO    RANKING   TEAM_AVG NUM_TEAM_MEMBERS
---------- ---------- ---------- ---------- ---------- ----------------
        74         78          1          1       2603 15
        91         82          2          2       2602 15
        95         83          3          3       2592 15
        77         86          4          4       2709 15
        61         87          5          5       2701 15
        23         87          6          6       2690 15
         1         90          7          7       2686 15
        67         91          8          8       2689 15
        62         97          1          9       2603 15
        33         98          2         10       2602 15
        79         98          3         11       2592 15
       120        100          4         12       2709 15
         2        101          5         13       2701 15
        39        101          6         14       2690 15
        60        102          7         15       2686 15
       101        104          8         16       2689 15
...
        14        257          8        108       2689 15
        59        259          7        109       2686 15
        29        262          6        110       2690 15
        88        271          5        111       2701 15
        41        272          4        112       2709 15
       104        274          3        113       2592 15
        32        275          2        114       2602 15
        36        275          1        115       2603 15
        99        276          8        116       2689 15
        71        277          7        117       2686 15
        31        285          6        118       2690 15
         3        286          5        119       2701 15
       113        288          4        120       2709 15




SQL – ROW_NUMBER Analytical Function, Many to One Join

9 12 2009

December 9, 2009

A question appeared in the comp.databases.oracle.server Usenet group a couple years ago:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/deeb9d899c8e6376

I have a problem filtering my results which involves a many-to-many join. Below is a brief layout of the problem, but briefly, I have a test that can have multiple results and those results may also belong multiple tests (i have done the same test under a different id and I want to inherit the previous test result). I also have an event table that is required to keep track of all changes.

The problem I have is that I cannot filter my results so where a result belongs to 2 tests I don’t want the details where it belongs to the test in the query. If anyone can suggest a way of returning 2 lines instead of 3 I would be very grateful.

SELECT tr.test_acc, e.event_id, re.resultevent_id, re.result_id,
re2.resultevent_id, re2.event_id, tr2.test_acc
FROM TestRequest tr JOIN Event e ON tr.test_acc=e.test_acc
JOIN ResultEvent re ON e.event_id=re.event_id
JOIN ResultEvent re2 ON re.result_id=re2.result_id
JOIN Event e2 ON e2.event_id=re2.event_id
JOIN TestRequest tr2 ON e2.test_acc=tr2.test_acc
WHERE tr.test_acc=3418;

+----------+----------+----------------+-----------+----------------+----------+----------+
| test_acc | event_id | resultevent_id | result_id | resultevent_id | event_id | test_acc |
+----------+----------+----------------+-----------+----------------+----------+----------+
|     3418 |    42178 |           6345 |      6321 |           6345 |    42178 |     3418 |
|     3418 |    42179 |           6346 |      4126 |           4126 |    28004 |     2248 |
|     3418 |    42179 |           6346 |      4126 |           6346 |    42179 |     3418 |
+----------+----------+----------------+-----------+----------------+----------+----------+
 
3 rows in set (0.00 sec)

 

+-------------+
| TestRequest |
+-------------+
| test_acc    |
+-------------+
     1 |
       |
     * |
+-------------+
| Event       |
+-------------+
| test_acc    |
| event_id    |
+-------------+
     1 |
       |
     * |
+----------------+
| ResultEvent    |
+----------------+
| resultevent_id |
| event_id       |
| result_id      |
+----------------+

I think that I understand what you are trying to do.  I changed the table names slightly (added _) in the mock up:
First, the table creation:

CREATE TABLE TEST_REQUEST (
  TEST_ACC NUMBER(10));

CREATE TABLE EVENT (
  TEST_ACC NUMBER(10),
  EVENT_ID NUMBER(10));

CREATE TABLE RESULT_EVENT (
  RESULTEVENT_ID NUMBER(10),
  EVENT_ID NUMBER(10),
  RESULT_ID NUMBER(10));

As best I can tell, the data that is in the tables:

INSERT INTO TEST_REQUEST VALUES (3418);
INSERT INTO TEST_REQUEST VALUES (2248);

INSERT INTO EVENT VALUES (3418,42178);
INSERT INTO EVENT VALUES (3418,42179);
INSERT INTO EVENT VALUES (2248,28004);

INSERT INTO RESULT_EVENT VALUES (6345,42178,6321);
INSERT INTO RESULT_EVENT VALUES (4126,28004,4126);
INSERT INTO RESULT_EVENT VALUES (6346,42179,4126);

COMMIT;

I reformatted your query so that I could more easily see what is happening:

SELECT
  TR.TEST_ACC,
  E.EVENT_ID,
  RE.RESULTEVENT_ID,
  RE.RESULT_ID,
  RE2.RESULTEVENT_ID,
  RE2.EVENT_ID,
  TR2.TEST_ACC
FROM
  TEST_REQUEST TR,
  EVENT E,
  RESULT_EVENT RE,
  RESULT_EVENT RE2,
  EVENT E2,
  TEST_REQUEST TR2
WHERE
  TR.TEST_ACC=3418
  AND TR.TEST_ACC=E.TEST_ACC
  AND E.EVENT_ID=RE.EVENT_ID
  AND RE.RESULT_ID=RE2.RESULT_ID
  AND E2.EVENT_ID=RE2.EVENT_ID
  AND E2.TEST_ACC=TR2.TEST_ACC;

  TEST_ACC   EVENT_ID RESULTEVENT_ID  RESULT_ID RESULTEVENT_ID   EVENT_ID  TEST_ACC
---------- ---------- -------------- ---------- -------------- ---------- ----------
      3418      42179           6346       4126           4126      28004      2248
      3418      42179           6346       4126           6346      42179      3418
      3418      42178           6345       6321           6345      42178      3418

It appears that if you have 2 rows with the same RESULT_ID, you only want the first EVENT_ID, so you need to have some way of numbering the rows.  The ROW_NUMBER analytical function might be able to help:

SELECT
  TR.TEST_ACC,
  E.EVENT_ID,
  RE.RESULTEVENT_ID,
  RE.RESULT_ID,
  RE2.RESULTEVENT_ID,
  RE2.EVENT_ID,
  TR2.TEST_ACC,
  ROW_NUMBER() OVER (PARTITION BY RE.RESULT_ID ORDER BY RE2.EVENT_ID) RN
FROM
  TEST_REQUEST TR,
  EVENT E,
  RESULT_EVENT RE,
  RESULT_EVENT RE2,
  EVENT E2,
  TEST_REQUEST TR2
WHERE
  TR.TEST_ACC=3418
  AND TR.TEST_ACC=E.TEST_ACC
  AND E.EVENT_ID=RE.EVENT_ID
  AND RE.RESULT_ID=RE2.RESULT_ID
  AND E2.EVENT_ID=RE2.EVENT_ID
  AND E2.TEST_ACC=TR2.TEST_ACC;

  TEST_ACC   EVENT_ID RESULTEVENT_ID  RESULT_ID RESULTEVENT_ID   EVENT_ID TEST_ACC   RN
---------- ---------- -------------- ---------- -------------- ---------- ---------- --
      3418      42179           6346       4126           4126      28004      2248   1
      3418      42179           6346       4126           6346      42179      3418   2
      3418      42178           6345       6321           6345      42178      3418   1

Now, if we can filter out any of the rows that do not have RN=1, we may have a usable solution.  By sliding the above into an inline view (and adding column aliases as necessary in the inline view), we can add a WHERE clause to return only those rows with RN=1:

SELECT
  TEST_ACC,
  EVENT_ID,
  RESULTEVENT_ID,
  RESULT_ID,
  RESULTEVENT_ID2,
  EVENT_ID2,
  TEST_ACC2
FROM
  (SELECT
    TR.TEST_ACC,
    E.EVENT_ID,
    RE.RESULTEVENT_ID,
    RE.RESULT_ID,
    RE2.RESULTEVENT_ID RESULTEVENT_ID2,
    RE2.EVENT_ID EVENT_ID2,
    TR2.TEST_ACC TEST_ACC2,
    ROW_NUMBER() OVER (PARTITION BY RE.RESULT_ID ORDER BY RE2.EVENT_ID) RN
  FROM
    TEST_REQUEST TR,
    EVENT E,
    RESULT_EVENT RE,
    RESULT_EVENT RE2,
    EVENT E2,
    TEST_REQUEST TR2
  WHERE
    TR.TEST_ACC=3418
    AND TR.TEST_ACC=E.TEST_ACC
    AND E.EVENT_ID=RE.EVENT_ID
    AND RE.RESULT_ID=RE2.RESULT_ID
    AND E2.EVENT_ID=RE2.EVENT_ID
    AND E2.TEST_ACC=TR2.TEST_ACC)
WHERE
  RN=1;

  TEST_ACC   EVENT_ID RESULTEVENT_ID  RESULT_ID RESULTEVENT_ID2  EVENT_ID2 TEST_ACC2
---------- ---------- -------------- ---------- --------------- ---------- ----------
      3418      42179           6346       4126            4126      28004      2248
      3418      42178           6345       6321            6345      42178      3418




SQL – ROW_NUMBER – Retrieving Rows from the Midpoint

9 12 2009

December 9, 2009

A question appeared in the comp.databases.oracle.server Usenet group a couple of years ago:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/a8314487f8c414ac

I have a question about rownum which is not mentioned.

Lets say that I want to extract a 3 records out of a result where the middle one is the query.
For example, I have records:

ID      Value 
--      ------
1       Hello 
2       Hi 
3       Wow 
4       Shrek 
5       Shus 
6       What?

And I need a query that by receiving ID, it will extract the record with the id and the previous one to that and the next one. For example for selecting ID=4, it will return records: 3,4,5.

Is there a way to do it in Oracle?

But the ids are not seqeuntial – it can verify from each record, for example:

ID      Value 
225     Hello 
228     Hi 
258     Wow 
240     Shrek 
259     Shus 
230     What?

 I offered the following:

Note: LAG and LEAD will collapse the result into a single row.  If that is not desired, you will need a different approach:
The set up:

CREATE TABLE T1 ( 
  C1 NUMBER(12), 
  C2 VARCHAR2(20)); 
 
INSERT INTO T1 VALUES (1,'Hello'); 
INSERT INTO T1 VALUES (2,'Hi'); 
INSERT INTO T1 VALUES (3,'Wow'); 
INSERT INTO T1 VALUES (4,'Shrek'); 
INSERT INTO T1 VALUES (5,'Shus'); 
INSERT INTO T1 VALUES (6,'What?');

A simple query using the ROW_NUMBER analytical function:

SELECT 
  C1, 
  C2, 
  ROW_NUMBER() OVER (ORDER BY C1) RN 
FROM 
  T1; 
 
        C1 C2                           RN 
---------- -------------------- ---------- 
         1 Hello                         1 
         2 Hi                            2 
         3 Wow                           3 
         4 Shrek                         4 
         5 Shus                          5 
         6 What?                         6

Sliding the above into an inline view to retrieve only those on either side of RN=4:

SELECT 
  T.C1, 
  T.C2 
FROM 
  (SELECT 
    C1, 
    C2, 
    ROW_NUMBER() OVER (ORDER BY C1) RN 
  FROM 
    T1) T 
WHERE 
  T.RN BETWEEN (4 -1) AND (4 +1);
        C1 C2 
---------- ------ 
         3 Wow 
         4 Shrek 
         5 Shus

But, the above is not exactly what you need, unless C1 (your ID column) always starts at 1 and incements by 1.  Essentially listing the inline view twice with a join fixes the problem:

SELECT 
  T.C1, 
  T.C2 
FROM 
  (SELECT 
    C1, 
    C2, 
    ROW_NUMBER() OVER (ORDER BY C1) RN 
  FROM 
    T1) T, 
  (SELECT 
    C1, 
    ROW_NUMBER() OVER (ORDER BY C1) RN 
  FROM 
    T1) T2 
WHERE 
  T2.C1=4 
  AND T.RN BETWEEN (T2.RN -1) AND (T2.RN +1); 
 
        C1 C2 
---------- ----- 
         5 Shus 
         4 Shrek 
         3 Wow

A quick test:

DELETE FROM 
  T1 
WHERE 
  C1=3;

The first method results in:

        C1 C2 
---------- ----- 
         4 Shrek 
         5 Shus 
         6 What?

The second method results in:

        C1 C2 
---------- ----- 
         5 Shus 
         4 Shrek 
         2 Hi

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A couple other people in the thread offered suggestions, including this one by Martin T.:
Maybe Analytic Functions can help:

select * from ( 
SELECT some_table.*,
 lead(ID) over (order by date) as LEAD_ID,
 lag(id) over (order by date) as LAG_ID 
from some_table 
) some_table_plus 
where ID      = :search_id 
    or LEAD_ID = :search_id 
    or LAG_ID  = :search_id 
order by date




SQL – Filling in Gaps in the Source Data 2

8 12 2009

December 8, 2009

The following question recently appeared in the comp.databases.oracle.server group:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/d9c7420e0e36231d

Could anyone show me how to write a analytical query for my requirement.

Query:

SELECT distinct bug_when,
       login_name,
       SUM(incoming_count - outgoing_count)
        OVER (PARTITION BY login_name ORDER BY bug_when) AS OPEN
FROM RM_COMP_INCOM_OUTGO_BUGIDS_GT G,
     bz_components c,
     bz_profiles p
where  G.component_id= c.ID and
c.manager = p.userid
order by 2,1 desc;

Query returns data like this:

12/04/2009 ssmit 52
12/02/2009 ssmit 48
11/30/2009 ssmit 45
11/29/2009 ssmit 42

I want the data to be like this:

12/04/2009 ssmit 52
12/03/2009 ssmit 48 -- fill the gap with previous value
12/02/2009 ssmit 48
12/01/2009 ssmit 45
11/30/2009 ssmit 45
11/30/2009 ssmit 45
11/29/2009 ssmit 42

Table Desc:

CREATE GLOBAL TEMPORARY TABLE RM_COMP_INCOM_OUTGO_BUGIDS_GT
(
  BUG_WHEN        DATE,
  COMPONENT_ID    NUMBER,
  INCOMING_COUNT  NUMBER,
  OUTGOING_COUNT  NUMBER
);

Set up a test table for experimentation (note, I changed the value for test from 42 to 12 for extra clarity and added additional rows):

CREATE TABLE T1
(
  BUG_WHEN        DATE,
  COMPONENT_ID    VARCHAR2(10),
  INCOMING_COUNT  NUMBER,
  OUTGOING_COUNT  NUMBER
);

INSERT INTO T1 VALUES (TO_DATE('12/07/2009','MM/DD/YYYY'),'ssmit',60,60);
INSERT INTO T1 VALUES (TO_DATE('12/04/2009','MM/DD/YYYY'),'ssmit',52,52);
INSERT INTO T1 VALUES (TO_DATE('12/02/2009','MM/DD/YYYY'),'ssmit',48,48);
INSERT INTO T1 VALUES (TO_DATE('11/30/2009','MM/DD/YYYY'),'ssmit',45,45);
INSERT INTO T1 VALUES (TO_DATE('11/29/2009','MM/DD/YYYY'),'ssmit',42,42);
INSERT INTO T1 VALUES (TO_DATE('01/29/2009','MM/DD/YYYY'),'test',12,12);

COMMIT;

SELECT
  *
FROM
  T1;

BUG_WHEN  COMPONENT_ INCOMING_COUNT OUTGOING_COUNT
--------- ---------- -------------- --------------
07-DEC-09 ssmit                  60             60
04-DEC-09 ssmit                  52             52
02-DEC-09 ssmit                  48             48
30-NOV-09 ssmit                  45             45
29-NOV-09 ssmit                  42             42
29-JAN-09 test                   12             12

Let’s start with using the LEAD function to peek at the next row from the T1 table.

SELECT
  COMPONENT_ID,
  BUG_WHEN,
  LEAD(BUG_WHEN,1) OVER (PARTITION BY COMPONENT_ID ORDER BY BUG_WHEN) NEXT_BUG_WHEN,
  INCOMING_COUNT
FROM
  T1;

COMPONENT_ BUG_WHEN  NEXT_BUG_ INCOMING_COUNT
---------- --------- --------- --------------
ssmit      29-NOV-09 30-NOV-09             42
ssmit      30-NOV-09 02-DEC-09             45
ssmit      02-DEC-09 04-DEC-09             48
ssmit      04-DEC-09 07-DEC-09             52
ssmit      07-DEC-09                       60
test       29-JAN-09                       12

Above needs to be modified to permit ranges by subtracting 1 from the next date.  Also we need to fix the NULL values on the last row of each COMPONENT_ID grouping.

SELECT
  COMPONENT_ID,
  BUG_WHEN,
  NVL(LEAD(BUG_WHEN,1) OVER (PARTITION BY COMPONENT_ID ORDER BY BUG_WHEN)-1,BUG_WHEN) THROUGH_BUG_WHEN,
  NVL(LEAD(BUG_WHEN,1) OVER (PARTITION BY COMPONENT_ID ORDER BY BUG_WHEN)-1,BUG_WHEN)-BUG_WHEN RANGE_DAYS,
  INCOMING_COUNT
FROM
  T1;

COMPONENT_ BUG_WHEN  THROUGH_B RANGE_DAYS INCOMING_COUNT
---------- --------- --------- ---------- --------------
ssmit      29-NOV-09 29-NOV-09          0             42
ssmit      30-NOV-09 01-DEC-09          1             45
ssmit      02-DEC-09 03-DEC-09          1             48
ssmit      04-DEC-09 06-DEC-09          2             52
ssmit      07-DEC-09 07-DEC-09          0             60
test       29-JAN-09 29-JAN-09          0             12

Now, all we need to do is find a way to generate the missing dates between the BUG_WHEN date and the THROUGH_BUG_WHEN date.  If we slide the above into an inline view and join to an inline view with a counter generator, we end up with the following:

SELECT
  T1.COMPONENT_ID,
  T1.BUG_WHEN+V1.C BUG_WHEN,
  T1.INCOMING_COUNT
FROM
  (SELECT
    COMPONENT_ID,
    BUG_WHEN,
    NVL(LEAD(BUG_WHEN,1) OVER (PARTITION BY COMPONENT_ID ORDER BY BUG_WHEN)-1,BUG_WHEN) THROUGH_BUG_WHEN,
    NVL(LEAD(BUG_WHEN,1) OVER (PARTITION BY COMPONENT_ID ORDER BY BUG_WHEN)-1,BUG_WHEN)-BUG_WHEN RANGE_DAYS,
    INCOMING_COUNT
  FROM
    T1) T1,
  (SELECT
    ROWNUM-1 C
  FROM
    DUAL
  CONNECT BY
    LEVEL<=100) V1
WHERE
  V1.C<=T1.RANGE_DAYS
ORDER BY
  T1.COMPONENT_ID,
  T1.BUG_WHEN+V1.C DESC;

COMPONENT_ BUG_WHEN  INCOMING_COUNT
---------- --------- --------------
ssmit      07-DEC-09             60
ssmit      06-DEC-09             52
ssmit      05-DEC-09             52
ssmit      04-DEC-09             52
ssmit      03-DEC-09             48
ssmit      02-DEC-09             48
ssmit      01-DEC-09             45
ssmit      30-NOV-09             45
ssmit      29-NOV-09             42
test       29-JAN-09             12




SQL – Outer Joins, Inline Views, and DENSE_RANK

8 12 2009

December 7, 2009

Some time ago the following question appeared in the comp.databases.oracle.server Usenet group:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/a2112d7bcae62df5

 Learning SQL – will appreciate any help.
 Here is the case:

Two tables with 0..n relationship

TableA                  TableB
-----------             --------------
ColA                    ColB
                        ColA_FK
                        Col_C

Data

TableA.ColA
====================
 1
 2

Table B

ColB   ColA_FK    Col_C
 11      2              12345
 12      2              99999

 Resultset:

 --------------------------------------
 ColA   ColB      Col_C
 --------------------------------------
 1       -        -
 2       12       99999

In case no data exists in TableB for ColA=11,  1 from TableA shows up in result without any data from TableB
 However, in case of 2 from ColA,  we want to record from TableB with Col_C = 99999

I can do the outer join – however, how do I limit it so it picks only the row with 9999?

Here is the test case:

create table tableA (colA number not null);

create table tableB (colB number not null, colA_fk number_not null, colC number not null); 

insert into tableA values (1);
insert into tableA values (2); 

commit;

insert into tableB values (11,2,12345);
insert into tableB values (12,2,99999); 

commit;

 

select tableA.colA, tableB.colA_fk, tableB.colC from tableA left join
tableB on tableA.colA = tableB.colA_fk where tableB.colC = 99999;

--------------------------------------
  ColA   ColA_FK      ColC
  --------------------------------------
  2       12       99999

 I would also like to display another row:

 1       -    -

 in there.

Using the suggestion offered by Pat, with a small modification:

SELECT
  TABLEA.COLA,
  TABLEB.COLA_FK,
  TABLEB.COLC
FROM
  TABLEA
LEFT JOIN
  TABLEB
ON
  TABLEA.COLA=TABLEB.COLA_FK
  AND TABLEB.COLC=99999;

      COLA    COLA_FK       COLC
---------- ---------- ----------
         2          2      99999
         1

This is the way I would commonly set up a SQL statement to meet a similar requirements:

SELECT
  TABLEA.COLA,
  TABLEB.COLA_FK,
  TABLEB.COLC
FROM
  TABLEA,
  TABLEB
WHERE
  TABLEA.COLA=TABLEB.COLA_FK(+)
  AND TABLEB.COLC(+)=99999;

      COLA    COLA_FK       COLC
---------- ---------- ----------
         1
         2          2      99999

Will the value of interest always be 99999, or will it be the highest value with a matching COLA_FK?  If you are looking for the highest value, please supply the four digit version of Oracle that you are using (10.2.0.2, 11.1.0.6, 8.1.7.3, etc.).

The original poster replied:

Aha – that is very cool.  I had figured out the

TABLEA.COLA=TABLEB.COLA_FK(+)

but did not know I could do this:

AND TABLEB.COLC(+)=99999;

As a matter of fact you are right – how did you guess – in my cases, the interest of value would be max of whatever is in TABLEB.COLC – if the row with 99999 does not exist, then the sql returns the fow with data in COLC=12345.
 Also, working with ORACLE version 9.2.0.8.

It looks like the DENSE_RANK analytical function, an inline view, and an outer join are required.

First, let’s introduce a little more data to make certain that we cannot query for a specific value of COLC and return the expected
results:

INSERT INTO TABLEA VALUES (3);
INSERT INTO TABLEB VALUES (13,2,111111);
INSERT INTO TABLEB VALUES (13,3,11);

Next, we try an experiment with the DENSE_RANK function to separate the rows by the value of COLA_FK (caused by the PARTITION BY directive) and rank the values sorted from highest to lowest (caused by the DESC directive):

SELECT
  COLB,
  COLA_FK,
  COLC,
  DENSE_RANK() OVER (PARTITION BY COLA_FK ORDER BY COLC DESC) DR
FROM
  TABLEB;

      COLB    COLA_FK       COLC         DR
---------- ---------- ---------- ----------
        13          2     111111          1
        12          2      99999          2
        11          2      12345          3
        13          3         11          1

We are only interested in the rows with DR = 1, so we need a way to eliminate the unnecessary rows.  If we slide the above SQL statement into an inline view, we are able to add a WHERE clause that restricts the results to the rows containing the highest COLC value per COLA_FK value.  We can then alias the inline view (as B), and join it to TABLEA as before:

SELECT
  TABLEA.COLA,
  B.COLA_FK,
  B.COLC
FROM
  TABLEA,
  (SELECT
    COLB,
    COLA_FK,
    COLC,
    DENSE_RANK() OVER (PARTITION BY COLA_FK ORDER BY COLC DESC) DR
  FROM
    TABLEB) B
WHERE
  TABLEA.COLA=B.COLA_FK(+)
  AND B.DR(+)=1;

      COLA    COLA_FK       COLC
---------- ---------- ----------
         2          2     111111
         3          3         11
         1 




SQL – Running Sum, Skipping Weekends

8 12 2009

December 7, 2009

Some time ago the following question appeared on the comp.databases.oracle.misc Usenet group: http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/cad8621182ae7d1b

This sounds like it should be super simple, but I can’t think how to do it using plain SQL.

I am going to display some rows, and I want to number them in the display, like using rownum, except that the number only goes up when the row has some property (I don’t care if it displays or not when it doesn’t go up).

To explain, imagine

select rownum  , the_date , to_char(dates.the_date,'DY') DAY 
from my_table 
order by the_date

shows 
        1       1-Jan-2008      TUE 
        2       2-Jan-2008      WED 
        3       3-Jan-2008      THU 
        4       4-Jan-2008      FRI 
        5       5-Jan-2008      SAT 
        6       6-Jan-2008      SUN 
        7       7-Jan-2008      MON 
        8       8-Jan-2008      TUE

but I don’t want to count the weekend.  what I want to show would be the following instead

        1       1-Jan-2008      TUE 
        2       2-Jan-2008      WED 
        3       3-Jan-2008      THU 
        4       4-Jan-2008      FRI 
                5-Jan-2008      SAT 
                6-Jan-2008      SUN 
        5       7-Jan-2008      MON 
        6       8-Jan-2008      TUE

Looks simple, but how to do this in plain SQL?

I provided the following solution:

Do it with the SUM analytical function and DECODE:

SELECT 
  TO_DATE('01/01/2008','MM/DD/YYYY')+(ROWNUM-1) THE_DATE, 
  TO_CHAR(TO_DATE('01/01/2008','MM/DD/YYYY')+(ROWNUM-1),'DY') DAY 
FROM 
  DUAL 
CONNECT BY 
  LEVEL<=20; 
 
THE_DATE  DAY 
--------- --- 
01-JAN-08 TUE 
02-JAN-08 WED 
03-JAN-08 THU 
04-JAN-08 FRI 
... 
19-JAN-08 SAT 
20-JAN-08 SUN

Step 1:

SELECT 
  SUM(DECODE(DAY,'SAT',0,'SUN',0,1)) OVER (ORDER BY THE_DATE) MY_COUNT, 
  THE_DATE, 
  DAY 
FROM 
  (SELECT 
    TO_DATE('01/01/2008','MM/DD/YYYY')+(ROWNUM-1) THE_DATE, 
    TO_CHAR(TO_DATE('01/01/2008','MM/DD/YYYY')+(ROWNUM-1),'DY') DAY 
  FROM 
    DUAL 
  CONNECT BY 
    LEVEL<=20); 
 
  MY_COUNT THE_DATE  DAY 
---------- --------- --- 
         1 01-JAN-08 TUE 
         2 02-JAN-08 WED 
         3 03-JAN-08 THU 
         4 04-JAN-08 FRI 
         4 05-JAN-08 SAT 
         4 06-JAN-08 SUN 
         5 07-JAN-08 MON 
         6 08-JAN-08 TUE 
         7 09-JAN-08 WED 
         8 10-JAN-08 THU 
         9 11-JAN-08 FRI 
         9 12-JAN-08 SAT 
         9 13-JAN-08 SUN 
        10 14-JAN-08 MON 
        11 15-JAN-08 TUE 
        12 16-JAN-08 WED 
        13 17-JAN-08 THU 
        14 18-JAN-08 FRI 
        14 19-JAN-08 SAT 
        14 20-JAN-08 SUN

Now to remove the numbers that should not print, using a second DECODE:

SELECT 
  DECODE(DAY,'SAT',NULL,'SUN',NULL,SUM(DECODE(DAY,'SAT',0,'SUN',0,1)) OVER (ORDER BY THE_DATE)) MY_COUNT, 
  THE_DATE, 
  DAY 
FROM 
  (SELECT 
    TO_DATE('01/01/2008','MM/DD/YYYY')+(ROWNUM-1) THE_DATE, 
    TO_CHAR(TO_DATE('01/01/2008','MM/DD/YYYY')+(ROWNUM-1),'DY') DAY 
  FROM 
    DUAL 
  CONNECT BY 
    LEVEL<=20); 
 
MY THE_DATE  DAY 
-- --------- --- 
1  01-JAN-08 TUE 
2  02-JAN-08 WED 
3  03-JAN-08 THU 
4  04-JAN-08 FRI 
   05-JAN-08 SAT 
   06-JAN-08 SUN 
5  07-JAN-08 MON 
6  08-JAN-08 TUE 
7  09-JAN-08 WED 
8  10-JAN-08 THU 
9  11-JAN-08 FRI 
   12-JAN-08 SAT 
   13-JAN-08 SUN 
10 14-JAN-08 MON 
11 15-JAN-08 TUE 
12 16-JAN-08 WED 
13 17-JAN-08 THU 
14 18-JAN-08 FRI 
   19-JAN-08 SAT 
   20-JAN-08 SUN

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko provided the following solution:

SQL> alter session set nls_territory='AMERICA'; 
 
Session altered. 
 
SQL> alter session set nls_date_language='AMERICAN'; 
 
Session altered. 
 
SQL> 
SQL> with my_table as ( 
   2    select date '2008-01-01' + rownum -1 the_date 
   3    from dual connect by level <=10 
   4  ) 
   5  select 
   6  case 
   7  when 
   8  to_char(the_date,'d') not in (1,7) 
   9  then row_number() 
  10  over(partition by case when to_char(the_date,'d') not in (1,7) then 1 
  11  end order by the_date) 
  12  end row_num, 
  13  the_date, 
  14  to_char(the_date,'DY') DAY 
  15  from my_table 
  16  order by the_date; 
 
    ROW_NUM THE_DATE           DAY 
---------- ------------------ ------------ 
          1 01-JAN-08          TUE 
          2 02-JAN-08          WED 
          3 03-JAN-08          THU 
          4 04-JAN-08          FRI 
            05-JAN-08          SAT 
            06-JAN-08          SUN 
          5 07-JAN-08          MON 
          6 08-JAN-08          TUE 
          7 09-JAN-08          WED 
          8 10-JAN-08          THU

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  
Ken Denny provided the following solution:

select r_num, the_date, to_char(dates.the_date,'DY') DAY 
  from (select rownum r_num, the_date 
          from (select the_date from my_table 
                  where to_char(dates.the_date,'DY') IN 
                     ('MON','TUE','WED','THU','FRI') 
                  order by the_date) 
        union 
        select null r_num, the_date from my_table 
          where to_char(dates.the_date,'DY') IN ('SAT','SUN')) 
   order by the_date;







Follow

Get every new post delivered to your Inbox.

Join 67 other followers