Simple SQL – Finding the Next Operation

3 11 2011

November 3, 2011

An interesting request came in from an ERP mailing list – how would you write a SQL statement that indicates the next operation in a manufacturing process.  Sounds like an easy requirement.  Let’s take a look at a graphical view of one example (the graphical view is created using a program that I wrote probably 7 years ago):

The cyan colored rectangles in the above picture are the various operations (found in the OPERATION table) in the manufacturing process.  The light green colored rectangles are the material requirements (found in the REQUIREMENT table) that are used by the operation that is immediately above the sequence of light green rectangles.  The white rectangles are the sub-assembly headers for the manufacturing process.  The white rectangle at the far left is considered the main sub-assembly 0 (WORKORDER_SUB_ID = ‘0’), the middle white rectangle in this case is sub-assembly 1 (WORKORDER_SUB_ID = ‘1’), and the right-most white rectangle in this case is sub-assembly 200 (WORKORDER_SUB_ID = ‘200’).  All sub-assemblies except the main sub-assembly 0 are tied to another “parent” operation that consumes the sub-assembly just as if it were another material requirement; therefore a dummy row is added to the material requirements table (REQUIREMENT) with the SUBORD_WO_SUB_ID column set to the sub-assembly number of the connected operation.

In the above picture, on the main sub-assembly 0, operation 888 (at resource ID INSPECT) is the first operation, and operation 999 (at resource ID SHIP) is the second operation.  On sub-assembly 1, operation 10 (at resource ID 92) is the first operation, operation 20 (at resource ID 62) is the second operation, and operation 541 (at resource ID KRW) is the third operation.  On sub-assembly 200, operation 10 (at resource ID WSD70-TACK) is the first operation, operation 20 (at resource ID WELD-CBA) is the second operation, operation 40 (at resource ID BLAST) is the third operation, and operation 50 (at resource ID PAINT) is the fourth operation.

Since I am working with Oracle Database, I can construct a SQL statement using the LEAD analytic function to find the next operation number and the next resource ID:

SELECT
  *
FROM
  (SELECT
    O.WORKORDER_BASE_ID,
    O.WORKORDER_LOT_ID,
    O.WORKORDER_SPLIT_ID,
    O.WORKORDER_SUB_ID,
    O.SEQUENCE_NO,
    O.RESOURCE_ID,
    LEAD(O.SEQUENCE_NO,1) OVER (PARTITION BY O.WORKORDER_BASE_ID,
                                             O.WORKORDER_LOT_ID,
                                             O.WORKORDER_SPLIT_ID,
                                             O.WORKORDER_SUB_ID
                                ORDER BY O.SEQUENCE_NO) AS NEXT_SEQUENCE_NO,
    LEAD(O.RESOURCE_ID,1) OVER (PARTITION BY O.WORKORDER_BASE_ID,
                                             O.WORKORDER_LOT_ID,
                                             O.WORKORDER_SPLIT_ID,
                                             O.WORKORDER_SUB_ID
                                ORDER BY O.SEQUENCE_NO) AS NEXT_RESOURCE_ID
  FROM
    OPERATION O
  WHERE
    O.WORKORDER_TYPE='W'
    AND O.WORKORDER_BASE_ID='13000'
    AND O.WORKORDER_LOT_ID='50'
    AND O.WORKORDER_SPLIT_ID='0') O
ORDER BY
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO;

WORKORDER_BASE_ID    WOR WOR WOR SEQUENCE_NO RESOURCE_ID     NEXT_SEQUENCE_NO NEXT_RESOURCE_I
-------------------- --- --- --- ----------- --------------- ---------------- ---------------
13000                50  0   0           888 INSPECT                      999 SHIP
13000                50  0   0           999 SHIP
13000                50  0   1            10 92                            20 62
13000                50  0   1            20 62                           541 KRW
13000                50  0   1           541 KRW
13000                50  0   200          10 WSD70-TACK FIX                20 WELD-CBA
13000                50  0   200          20 WELD-CBA                      40 BLAST
13000                50  0   200          40 BLAST                         50 PAINT
13000                50  0   200          50 PAINT

Looks like a pretty easy solution… unless we recognize that after the last operation on sub-assembly 200 the next operation is really the parent operation of that sub-assembly (operation 10 on sub-assembly 1).  Likewise, after the last operation on sub-assembly 1, the next operation is really the parent operation of that sub-assembly (operation 888 on sub-assembly 0).  There is no next operation after operation 999 on the main sub-assembly 0.  How can we fix this problem with the NULL next operations in the previous output?  We just need an outer join to the dummy row in the REQUIREMENT table to pick up the parent’s sub-assembly number, and then join that dummy row back to a second reference of the OPERATION table.  The NVL2 and COALESCE functions are used to handle the cases where the original next operation and next resource ID would have been output as NULL values:

SELECT
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO,
  O.RESOURCE_ID,
  NVL2(O.NEXT_SEQUENCE_NO,O.WORKORDER_SUB_ID,O2.WORKORDER_SUB_ID) NEXT_SUB_ID,
  COALESCE(O.NEXT_SEQUENCE_NO,O2.SEQUENCE_NO) NEXT_SEQUENCE_NO,
  COALESCE(O.NEXT_RESOURCE_ID,O2.RESOURCE_ID) NEXT_RESOURCE_ID
FROM
  (SELECT
    O.WORKORDER_TYPE,
    O.WORKORDER_BASE_ID,
    O.WORKORDER_LOT_ID,
    O.WORKORDER_SPLIT_ID,
    O.WORKORDER_SUB_ID,
    O.SEQUENCE_NO,
    O.RESOURCE_ID,
    LEAD(O.SEQUENCE_NO,1) OVER (PARTITION BY O.WORKORDER_BASE_ID,
                                             O.WORKORDER_LOT_ID,
                                             O.WORKORDER_SPLIT_ID,
                                             O.WORKORDER_SUB_ID
                                ORDER BY O.SEQUENCE_NO) AS NEXT_SEQUENCE_NO,
    LEAD(O.RESOURCE_ID,1) OVER (PARTITION BY O.WORKORDER_BASE_ID,
                                             O.WORKORDER_LOT_ID,
                                             O.WORKORDER_SPLIT_ID,
                                             O.WORKORDER_SUB_ID
                                ORDER BY O.SEQUENCE_NO) AS NEXT_RESOURCE_ID
  FROM
    OPERATION O
  WHERE
    O.WORKORDER_TYPE='W'
    AND O.WORKORDER_BASE_ID='13000'
    AND O.WORKORDER_LOT_ID='50'
    AND O.WORKORDER_SPLIT_ID='0') O,
  REQUIREMENT R,
  OPERATION O2
WHERE
  O.WORKORDER_TYPE=R.WORKORDER_TYPE(+)
  AND O.WORKORDER_BASE_ID=R.WORKORDER_BASE_ID(+)
  AND O.WORKORDER_LOT_ID=R.WORKORDER_LOT_ID(+)
  AND O.WORKORDER_SPLIT_ID=R.WORKORDER_SPLIT_ID(+)
  AND O.WORKORDER_SUB_ID=R.SUBORD_WO_SUB_ID(+)
  AND R.WORKORDER_TYPE=O2.WORKORDER_TYPE(+)
  AND R.WORKORDER_BASE_ID=O2.WORKORDER_BASE_ID(+)
  AND R.WORKORDER_LOT_ID=O2.WORKORDER_LOT_ID(+)
  AND R.WORKORDER_SPLIT_ID=O2.WORKORDER_SPLIT_ID(+)
  AND R.WORKORDER_SUB_ID=O2.WORKORDER_SUB_ID(+)
  AND R.OPERATION_SEQ_NO=O2.SEQUENCE_NO(+)
ORDER BY
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO;

WORKORDER_BASE_ID    WOR WOR WOR SEQUENCE_NO RESOURCE_ID     NEX NEXT_SEQUENCE_NO NEXT_RESOURCE_I
-------------------- --- --- --- ----------- --------------- --- ---------------- ---------------
13000                50  0   0           888 INSPECT         0                999 SHIP
13000                50  0   0           999 SHIP
13000                50  0   1            10 92              1                 20 62
13000                50  0   1            20 62              1                541 KRW
13000                50  0   1           541 KRW             0                888 INSPECT
13000                50  0   200          10 WSD70-TACK FIX  200               20 WELD-CBA
13000                50  0   200          20 WELD-CBA        200               40 BLAST
13000                50  0   200          40 BLAST           200               50 PAINT
13000                50  0   200          50 PAINT           1                 10 92
 

In case you are wondering, the execution plan for the above SQL statement looks like this:

Plan hash value: 1263351280

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                 |      1 |        |      9 |00:00:00.01 |      43 |       |       |          |
|   1 |  SORT ORDER BY                   |                 |      1 |      9 |      9 |00:00:00.01 |      43 |  2048 |  2048 | 2048  (0)|
|   2 |   NESTED LOOPS OUTER             |                 |      1 |      9 |      9 |00:00:00.01 |      43 |       |       |          |
|   3 |    NESTED LOOPS OUTER            |                 |      1 |      9 |      9 |00:00:00.01 |      23 |       |       |          |
|   4 |     VIEW                         |                 |      1 |      9 |      9 |00:00:00.01 |       9 |       |       |          |
|   5 |      WINDOW BUFFER               |                 |      1 |      9 |      9 |00:00:00.01 |       9 |  2048 |  2048 | 2048  (0)|
|   6 |       TABLE ACCESS BY INDEX ROWID| OPERATION       |      1 |      9 |      9 |00:00:00.01 |       9 |       |       |          |
|*  7 |        INDEX RANGE SCAN          | SYS_C0021734    |      1 |      9 |      9 |00:00:00.01 |       3 |       |       |          |
|   8 |     TABLE ACCESS BY INDEX ROWID  | REQUIREMENT     |      9 |      1 |      7 |00:00:00.01 |      14 |       |       |          |
|*  9 |      INDEX RANGE SCAN            | X_REQUIREMENT_5 |      9 |      1 |      7 |00:00:00.01 |      13 |       |       |          |
|  10 |    TABLE ACCESS BY INDEX ROWID   | OPERATION       |      9 |      1 |      7 |00:00:00.01 |      20 |       |       |          |
|* 11 |     INDEX UNIQUE SCAN            | SYS_C0021734    |      9 |      1 |      7 |00:00:00.01 |      13 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("O"."WORKORDER_TYPE"='W' AND "O"."WORKORDER_BASE_ID"='13000' AND "O"."WORKORDER_LOT_ID"='50' AND
              "O"."WORKORDER_SPLIT_ID"='0')
   9 - access("O"."WORKORDER_TYPE"="R"."WORKORDER_TYPE" AND "O"."WORKORDER_BASE_ID"="R"."WORKORDER_BASE_ID" AND
              "O"."WORKORDER_LOT_ID"="R"."WORKORDER_LOT_ID" AND "O"."WORKORDER_SPLIT_ID"="R"."WORKORDER_SPLIT_ID" AND
              "O"."WORKORDER_SUB_ID"="R"."SUBORD_WO_SUB_ID")
       filter("R"."SUBORD_WO_SUB_ID" IS NOT NULL)
  11 - access("R"."WORKORDER_TYPE"="O2"."WORKORDER_TYPE" AND "R"."WORKORDER_BASE_ID"="O2"."WORKORDER_BASE_ID" AND
              "R"."WORKORDER_LOT_ID"="O2"."WORKORDER_LOT_ID" AND "R"."WORKORDER_SPLIT_ID"="O2"."WORKORDER_SPLIT_ID" AND
              "R"."WORKORDER_SUB_ID"="O2"."WORKORDER_SUB_ID" AND "R"."OPERATION_SEQ_NO"="O2"."SEQUENCE_NO")

Note
-----
   - cardinality feedback used for this statement 

The above takes less that 0.01 seconds to execute, in part because I was careful to preserve the leading columns in the primary key indexes’ columns when building the inline view.

But, what if the OP (as he mentioned) was running with an older version of SQL Server that did not support the LEAD analytic function?  Is the OP out of luck (it could be argued that is a trick question)?

Let’s start again, this time we will self-join the OPERATION table rather than using an analytic function to determine the next operation, and just for fun we will use ANSI join syntax:

SELECT
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO,
  MIN(O2.SEQUENCE_NO) AS NEXT_SEQUENCE_NO
FROM
  OPERATION O
LEFT OUTER JOIN
  OPERATION O2
ON
  O.WORKORDER_TYPE=O2.WORKORDER_TYPE
  AND O.WORKORDER_BASE_ID=O2.WORKORDER_BASE_ID
  AND O.WORKORDER_LOT_ID=O2.WORKORDER_LOT_ID
  AND O.WORKORDER_SPLIT_ID=O2.WORKORDER_SPLIT_ID
  AND O.WORKORDER_SUB_ID=O2.WORKORDER_SUB_ID
  AND O.SEQUENCE_NO < O2.SEQUENCE_NO
WHERE
  O.WORKORDER_TYPE='W'
  AND O.WORKORDER_BASE_ID='13000'
  AND O.WORKORDER_LOT_ID='50'
  AND O.WORKORDER_SPLIT_ID='0'
GROUP BY
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO
ORDER BY
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO;

WORKORDER_BASE_ID    WOR WOR WOR SEQUENCE_NO NEXT_SEQUENCE_NO
-------------------- --- --- --- ----------- ----------------
13000                50  0   0           888              999
13000                50  0   0           999
13000                50  0   1            10               20
13000                50  0   1            20              541
13000                50  0   1           541
13000                50  0   200          10               20
13000                50  0   200          20               40
13000                50  0   200          40               50
13000                50  0   200          50 

Now, we will slide the above SQL statement into an inline view and use the COALESCE function to return the first non-NULL value in a list of columns, and the Oracle only NVL2 function to return one of two column values depending on whether or not a third column contains a NULL value:

SELECT
  O.WORKORDER_TYPE,
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO,
  O.RESOURCE_ID,
  NVL2(O.NEXT_SEQUENCE_NO,O.WORKORDER_SUB_ID,R.WORKORDER_SUB_ID) AS NEXT_SUB_ID,
  COALESCE(O.NEXT_SEQUENCE_NO,R.OPERATION_SEQ_NO) AS NEXT_SEQUENCE_NO
FROM
  (SELECT
    O.WORKORDER_TYPE,
    O.WORKORDER_BASE_ID,
    O.WORKORDER_LOT_ID,
    O.WORKORDER_SPLIT_ID,
    O.WORKORDER_SUB_ID,
    O.SEQUENCE_NO,
    O.RESOURCE_ID,
    MIN(O2.SEQUENCE_NO) AS NEXT_SEQUENCE_NO
  FROM
    OPERATION O
  LEFT OUTER JOIN
    OPERATION O2
  ON
    O.WORKORDER_TYPE=O2.WORKORDER_TYPE
    AND O.WORKORDER_BASE_ID=O2.WORKORDER_BASE_ID
    AND O.WORKORDER_LOT_ID=O2.WORKORDER_LOT_ID
    AND O.WORKORDER_SPLIT_ID=O2.WORKORDER_SPLIT_ID
    AND O.WORKORDER_SUB_ID=O2.WORKORDER_SUB_ID
    AND O.SEQUENCE_NO < O2.SEQUENCE_NO
  WHERE
    O.WORKORDER_TYPE='W'
    AND O.WORKORDER_BASE_ID='13000'
    AND O.WORKORDER_LOT_ID='50'
    AND O.WORKORDER_SPLIT_ID='0'
  GROUP BY
    O.WORKORDER_TYPE,
    O.WORKORDER_BASE_ID,
    O.WORKORDER_LOT_ID,
    O.WORKORDER_SPLIT_ID,
    O.WORKORDER_SUB_ID,
    O.SEQUENCE_NO,
    O.RESOURCE_ID) O
LEFT OUTER JOIN
  REQUIREMENT R
ON
  O.WORKORDER_TYPE=R.WORKORDER_TYPE
  AND O.WORKORDER_BASE_ID=R.WORKORDER_BASE_ID
  AND O.WORKORDER_LOT_ID=R.WORKORDER_LOT_ID
  AND O.WORKORDER_SPLIT_ID=R.WORKORDER_SPLIT_ID
  AND O.WORKORDER_SUB_ID=R.SUBORD_WO_SUB_ID
ORDER BY
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO;

W WORKORDER_BASE_ID    WOR WOR WOR SEQUENCE_NO RESOURCE_ID     NEX NEXT_SEQUENCE_NO
- -------------------- --- --- --- ----------- --------------- --- ----------------
W 13000                50  0   0           888 INSPECT         0                999
W 13000                50  0   0           999 SHIP
W 13000                50  0   1            10 92              1                 20
W 13000                50  0   1            20 62              1                541
W 13000                50  0   1           541 KRW             0                888
W 13000                50  0   200          10 WSD70-TACK FIX  200               20
W 13000                50  0   200          20 WELD-CBA        200               40
W 13000                50  0   200          40 BLAST           200               50
W 13000                50  0   200          50 PAINT           1                 10 

Almost there – we still need the resource ID of the next operation.  Sliding the above into another level of inline view, we end up with the following:

SELECT
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.RESOURCE_ID,
  O.NEXT_SUB_ID,
  O.NEXT_SEQUENCE_NO,
  O2.RESOURCE_ID NEXT_RESOURCE_ID
FROM
  (SELECT
    O.WORKORDER_TYPE,
    O.WORKORDER_BASE_ID,
    O.WORKORDER_LOT_ID,
    O.WORKORDER_SPLIT_ID,
    O.WORKORDER_SUB_ID,
    O.SEQUENCE_NO,
    O.RESOURCE_ID,
    NVL2(O.NEXT_SEQUENCE_NO,O.WORKORDER_SUB_ID,R.WORKORDER_SUB_ID) AS NEXT_SUB_ID,
    COALESCE(O.NEXT_SEQUENCE_NO,R.OPERATION_SEQ_NO) AS NEXT_SEQUENCE_NO
  FROM
    (SELECT
      O.WORKORDER_TYPE,
      O.WORKORDER_BASE_ID,
      O.WORKORDER_LOT_ID,
      O.WORKORDER_SPLIT_ID,
      O.WORKORDER_SUB_ID,
      O.SEQUENCE_NO,
      O.RESOURCE_ID,
      MIN(O2.SEQUENCE_NO) AS NEXT_SEQUENCE_NO
    FROM
      OPERATION O
    LEFT OUTER JOIN
      OPERATION O2
    ON
      O.WORKORDER_TYPE=O2.WORKORDER_TYPE
      AND O.WORKORDER_BASE_ID=O2.WORKORDER_BASE_ID
      AND O.WORKORDER_LOT_ID=O2.WORKORDER_LOT_ID
      AND O.WORKORDER_SPLIT_ID=O2.WORKORDER_SPLIT_ID
      AND O.WORKORDER_SUB_ID=O2.WORKORDER_SUB_ID
      AND O.SEQUENCE_NO < O2.SEQUENCE_NO
    WHERE
      O.WORKORDER_TYPE='W'
      AND O.WORKORDER_BASE_ID='13000'
      AND O.WORKORDER_LOT_ID='50'
      AND O.WORKORDER_SPLIT_ID='0'
    GROUP BY
      O.WORKORDER_TYPE,
      O.WORKORDER_BASE_ID,
      O.WORKORDER_LOT_ID,
      O.WORKORDER_SPLIT_ID,
      O.WORKORDER_SUB_ID,
      O.SEQUENCE_NO,
      O.RESOURCE_ID) O
  LEFT OUTER JOIN
    REQUIREMENT R
  ON
    O.WORKORDER_TYPE=R.WORKORDER_TYPE
    AND O.WORKORDER_BASE_ID=R.WORKORDER_BASE_ID
    AND O.WORKORDER_LOT_ID=R.WORKORDER_LOT_ID
    AND O.WORKORDER_SPLIT_ID=R.WORKORDER_SPLIT_ID
    AND O.WORKORDER_SUB_ID=R.SUBORD_WO_SUB_ID) O
LEFT OUTER JOIN
  OPERATION O2
ON
  O.WORKORDER_TYPE=O2.WORKORDER_TYPE
  AND O.WORKORDER_BASE_ID=O2.WORKORDER_BASE_ID
  AND O.WORKORDER_LOT_ID=O2.WORKORDER_LOT_ID
  AND O.WORKORDER_SPLIT_ID=O2.WORKORDER_SPLIT_ID
  AND O.NEXT_SUB_ID=O2.WORKORDER_SUB_ID
  AND O.NEXT_SEQUENCE_NO=O2.SEQUENCE_NO
ORDER BY
  O.WORKORDER_BASE_ID,
  O.WORKORDER_LOT_ID,
  O.WORKORDER_SPLIT_ID,
  O.WORKORDER_SUB_ID,
  O.SEQUENCE_NO;

WORKORDER_BASE_ID    WOR WOR WOR RESOURCE_ID     NEX NEXT_SEQUENCE_NO NEXT_RESOURCE_I
-------------------- --- --- --- --------------- --- ---------------- ---------------
13000                50  0   0   INSPECT         0                999 SHIP
13000                50  0   0   SHIP
13000                50  0   1   92              1                 20 62
13000                50  0   1   62              1                541 KRW
13000                50  0   1   KRW             0                888 INSPECT
13000                50  0   200 WSD70-TACK FIX  200               20 WELD-CBA
13000                50  0   200 WELD-CBA        200               40 BLAST
13000                50  0   200 BLAST           200               50 PAINT
13000                50  0   200 PAINT           1                 10 92 

The above SQL statement has the following execution plan:

Plan hash value: 3374377097

---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |              |      1 |        |      9 |00:00:00.01 |      41 |       |       |          |
|   1 |  SORT ORDER BY                    |              |      1 |      9 |      9 |00:00:00.01 |      41 |  2048 |  2048 | 2048  (0)|
|   2 |   NESTED LOOPS OUTER              |              |      1 |      9 |      9 |00:00:00.01 |      41 |       |       |          |
|   3 |    VIEW                           |              |      1 |      9 |      9 |00:00:00.01 |      19 |       |       |          |
|   4 |     HASH GROUP BY                 |              |      1 |      9 |      9 |00:00:00.01 |      19 |   781K|   781K| 1103K (0)|
|*  5 |      HASH JOIN OUTER              |              |      1 |      9 |     13 |00:00:00.01 |      19 |   851K|   851K|  653K (0)|
|*  6 |       HASH JOIN OUTER             |              |      1 |      9 |      9 |00:00:00.01 |      16 |   927K|   927K|  651K (0)|
|   7 |        TABLE ACCESS BY INDEX ROWID| OPERATION    |      1 |      9 |      9 |00:00:00.01 |       9 |       |       |          |
|*  8 |         INDEX RANGE SCAN          | SYS_C0021734 |      1 |      9 |      9 |00:00:00.01 |       3 |       |       |          |
|*  9 |        TABLE ACCESS BY INDEX ROWID| REQUIREMENT  |      1 |      1 |      2 |00:00:00.01 |       7 |       |       |          |
|* 10 |         INDEX RANGE SCAN          | SYS_C0021842 |      1 |      1 |     18 |00:00:00.01 |       3 |       |       |          |
|* 11 |       INDEX RANGE SCAN            | SYS_C0021734 |      1 |      1 |      9 |00:00:00.01 |       3 |       |       |          |
|  12 |    TABLE ACCESS BY INDEX ROWID    | OPERATION    |      9 |      1 |      8 |00:00:00.01 |      22 |       |       |          |
|* 13 |     INDEX UNIQUE SCAN             | SYS_C0021734 |      9 |      1 |      8 |00:00:00.01 |      14 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("O"."WORKORDER_SUB_ID"="O2"."WORKORDER_SUB_ID" AND "O"."WORKORDER_SPLIT_ID"="O2"."WORKORDER_SPLIT_ID" AND
              "O"."WORKORDER_LOT_ID"="O2"."WORKORDER_LOT_ID" AND "O"."WORKORDER_BASE_ID"="O2"."WORKORDER_BASE_ID" AND
              "O"."WORKORDER_TYPE"="O2"."WORKORDER_TYPE")
       filter("O"."SEQUENCE_NO"<"O2"."SEQUENCE_NO")
   6 - access("O"."WORKORDER_SUB_ID"="R"."SUBORD_WO_SUB_ID" AND "O"."WORKORDER_SPLIT_ID"="R"."WORKORDER_SPLIT_ID" AND
              "O"."WORKORDER_LOT_ID"="R"."WORKORDER_LOT_ID" AND "O"."WORKORDER_BASE_ID"="R"."WORKORDER_BASE_ID" AND
              "O"."WORKORDER_TYPE"="R"."WORKORDER_TYPE")
   8 - access("O"."WORKORDER_TYPE"='W' AND "O"."WORKORDER_BASE_ID"='13000' AND "O"."WORKORDER_LOT_ID"='50' AND
              "O"."WORKORDER_SPLIT_ID"='0')
   9 - filter("R"."SUBORD_WO_SUB_ID" IS NOT NULL)
  10 - access("R"."WORKORDER_TYPE"='W' AND "R"."WORKORDER_BASE_ID"='13000' AND "R"."WORKORDER_LOT_ID"='50' AND
              "R"."WORKORDER_SPLIT_ID"='0')
  11 - access("O2"."WORKORDER_TYPE"='W' AND "O2"."WORKORDER_BASE_ID"='13000' AND "O2"."WORKORDER_LOT_ID"='50' AND
              "O2"."WORKORDER_SPLIT_ID"='0')
  13 - access("O"."WORKORDER_TYPE"="O2"."WORKORDER_TYPE" AND "O"."WORKORDER_BASE_ID"="O2"."WORKORDER_BASE_ID" AND
              "O"."WORKORDER_LOT_ID"="O2"."WORKORDER_LOT_ID" AND "O"."WORKORDER_SPLIT_ID"="O2"."WORKORDER_SPLIT_ID" AND
              "O"."NEXT_SUB_ID"="O2"."WORKORDER_SUB_ID" AND "O"."NEXT_SEQUENCE_NO"="O2"."SEQUENCE_NO")

Note
-----
   - cardinality feedback used for this statement 

So, now that we have two different SQL statements that both solve the problem that was presented by the OP, which version of the SQL statement is the most efficient, and which would you use in production?





Simple SQL with and without Inline Views 2

26 10 2011

October 26, 2011

(Back to the Previous Post in the Series)

In the previous article of this series we examined a couple of moderately simple data retrieval requirements, and used inline views and/or analytic functions to solve those data retrieval requirements.  I recently saw another opportunity to help a person on the ERP mailing list with a SQL statement.  The original poster (OP) provided a screen capture that contained information similar to the following:

CUST_ORDER_ID   PART_ID         TYPE CLASS WAREHOUSE_ID LOCATION_ID  QTY_ON_HAND TRANSACTI
--------------- --------------- ---- ----- ------------ ------------ ----------- ---------
ZQHZG           2J-2908         O    I     FIB          AB7A10                30 21-JUN-10
ZQHZZ           2J-2909         O    I     FIB          AB7A10                 1 21-JUN-10
ZIHKR           2J0836          O    I     HIBB         A77Y3               1092 30-NOV-07
ZQKZH           2J0836          O    I     HIBB         A77Y3               1092 10-JUN-08
ZQIHZ           2J0836          O    I     HIBB         A77Y3               1092 23-NOV-09
ZKOHI           2J4120          O    I     PIPABTONA    AB7A10                 2 16-NOV-95
ZKOHQ           2J4129          O    I     PIPABTONA    AB7A10                 2 16-NOV-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 25-SEP-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 25-SEP-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 26-OCT-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 26-OCT-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 12-DEC-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 12-DEC-95
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 27-FEB-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 04-APR-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 04-APR-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 31-MAY-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 31-MAY-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 06-SEP-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 30-SEP-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 30-SEP-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 30-SEP-96
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 12-JAN-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 12-FEB-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 04-MAR-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 05-MAY-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 05-MAY-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 18-AUG-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 18-AUG-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 29-SEP-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785          O    I     HIBB         AB7A10                 2 02-DEC-98
ZIRZQ           3J8547          O    I     HIBB         A77Y2                252 18-OCT-07
ZZUQG           4G0007          O    I     HIBB         AB7A10                 2 08-NOV-99
ZZUQG           4G0007          O    I     HIBB         AB7A10                 2 30-NOV-99 
...

Initially, the OP requested that the most recent TRANSACTION_DATE (the last column in the above output) for the PART_ID value to be shown on each row of the output, rather than the TRANSACTION_DATE that is directly associated with a specific row.  However, there are two problems:

  • The OP did not mention what tables were used as the data source for the included screen capture.
  • The OP did not mention whether a recent release of Oracle Database or some other brand of database was in use.

For times like these, it helps a lot to be familiar with the data model used in the database.  The INVENTORY_TRANS table in this particular ERP system has columns that match several of the columns that were included in the OP’s screen capture: CUST_ORDER_ID, PART_ID, TYPE, CLASS, WAREHOUSE_ID, LOCATION_ID, and TRANSACTION_DATE.  The question remains, where did the QTY_ON_HAND column come from – that column is not in the INVENTORY_TRANS table.  We could find out, at least on Oracle Database, with a SQL statement similar to the following:

SELECT
  DTC.TABLE_NAME,
  DT.TABLE_NAME
FROM
  DBA_TAB_COLUMNS DTC,
  DBA_TABLES DT
WHERE
  DTC.COLUMN_NAME='QTY_ON_HAND'
  AND DTC.OWNER='SCHEMA_OWNER_HERE'
  AND DTC.OWNER=DT.OWNER(+)
  AND DTC.TABLE_NAME=DT.TABLE_NAME(+)
ORDER BY
  DTC.TABLE_NAME;

TABLE_NAME                     TABLE_NAME
------------------------------ ----------
CR_EC_LINE_PART
CR_OM_CO_LINE_PART
CR_PART_PO_LINE
CR_WO_PART
PART                           PART
PART_USER_DEFINED_VIEW 

Note that in the above, I had to join the DBA_TAB_COLUMNS dictionary view with the DBA_TABLES dictionary view (I could have used DBA_OBJECTS instead) to determine that 5 of the 6 rows returned from the DBA_TAB_COLUMNS view are in fact not tables, but views.  Thus, the QTY_ON_HAND column in the OP’s screen capture must have originated from the PART table (unless, of course, the column in the screen capture is actually an aliased calculation).

Now that the data sources are known for the query, we can take a best guess approach at reproducing the output that the OP provided in the screen capture:

SELECT
  IT.CUST_ORDER_ID,
  IT.PART_ID,
  TYPE,
  CLASS,
  IT.WAREHOUSE_ID,
  IT.LOCATION_ID,
  P.QTY_ON_HAND,
  IT.TRANSACTION_DATE
FROM
  INVENTORY_TRANS IT,
  PART P
WHERE
  IT.CUST_ORDER_ID IS NOT NULL
  AND IT.TYPE='O'
  AND IT.CLASS='I'
  AND IT.PART_ID=P.ID
  AND P.QTY_ON_HAND>0
ORDER BY
  IT.PART_ID,
  IT.TRANSACTION_DATE; 

Next, there are a couple of ways to have the most recent transaction date appear on each row of the output (partially dependent on the database version).  First, let’s build a query that returns the most recent transaction date where the TYPE is O and the CLASS is I and the CUST_ORDER_ID column does not contain a NULL:

SELECT
  PART_ID,
  MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS
WHERE
  TYPE='O'
  AND CLASS='I'
  AND CUST_ORDER_ID IS NOT NULL
GROUP BY
  PART_ID
ORDER BY
  PART_ID; 

PART_ID         LAST_TRAN
--------------- ---------
0000009         29-DEC-00
005035008005    29-MAY-98
005035008006    29-MAY-98
00576649-0080   20-OCT-11
00576649-0088   08-JUN-11
00576649-8005   14-JAN-11
007580398       17-JAN-03
0099229940      01-NOV-95
0099229990      01-NOV-95
0108556         28-APR-00
01550867KM      15-NOV-05
01552316KM      02-OCT-00
01552346KM      30-APR-03
01552369KM      30-APR-01
01558943M       07-JAN-10
01561230KM      30-OCT-01
01563001M       08-MAR-10
01563882M       01-APR-10
01566790KM      08-SEP-08
01569945M       01-APR-10
01583508TO      11-FEB-05
01780151TO      19-JUL-07
...

Next, we will just plug the above inline view into the FROM clause of our original SQL statement and add an additional predicate into the WHERE clause to tell the query optimizer how to associate the inline view with the other tables in the SQL statement:

SELECT
  IT.CUST_ORDER_ID,
  IT.PART_ID,
  IT.QTY,
  IT.WAREHOUSE_ID,
  IT.LOCATION_ID,
  P.QTY_ON_HAND,
  MIT.LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS IT,
  PART P,
  (SELECT
     PART_ID,
     MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
   FROM
     INVENTORY_TRANS
   WHERE
     TYPE='O'
     AND CLASS='I'
     AND CUST_ORDER_ID IS NOT NULL
   GROUP BY
     PART_ID) MIT
WHERE
  IT.CUST_ORDER_ID IS NOT NULL
  AND IT.TYPE='O'
  AND IT.CLASS='I'
  AND IT.PART_ID=P.ID
  AND P.QTY_ON_HAND>0
  AND IT.PART_ID=MIT.PART_ID
ORDER BY
  IT.PART_ID,
  IT.CUST_ORDER_ID,
  IT.TRANSACTION_DATE; 

CUST_ORDER_ID   PART_ID             QTY WAREHOUSE_ID LOCATION_ID  QTY_ON_HAND LAST_TRAN
--------------- ------------ ---------- ------------ ------------ ----------- ---------
ZQHZG           2J-2908               1 FIB          AB7A10                30 21-JUN-10
ZQHZZ           2J-2909               1 FIB          AB7A10                 1 21-JUN-10
ZIHKR           2J0836               48 HIBB         A77Y3               1092 23-NOV-09
ZQIHZ           2J0836               72 HIBB         A77Y3               1092 23-NOV-09
ZQKZH           2J0836              111 HIBB         A77Y3               1092 23-NOV-09
ZKOHI           2J4120                2 PIPABTONA    AB7A10                 2 16-NOV-95
ZKOHQ           2J4129                2 PIPABTONA    AB7A10                 2 16-NOV-95
ZKGQG           2J9785                9 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                1 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                7 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                2 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                6 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                4 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               10 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                8 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                4 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               10 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                4 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               12 HIBB         AB7A10                 2 02-DEC-98
...

Since I am using Oracle Database 11.2.0.2, I can take advantage of analytic functions to further simplify the SQL statement:

SELECT
  IT.CUST_ORDER_ID,
  IT.PART_ID,
  IT.QTY,
  IT.WAREHOUSE_ID,
  IT.LOCATION_ID,
  P.QTY_ON_HAND,
  MAX(IT.TRANSACTION_DATE) OVER (PARTITION BY IT.PART_ID) AS LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS IT,
  PART P
WHERE
  IT.CUST_ORDER_ID IS NOT NULL
  AND IT.TYPE='O'
  AND IT.CLASS='I'
  AND IT.PART_ID=P.ID
  AND P.QTY_ON_HAND>0
ORDER BY
  IT.PART_ID,
  IT.CUST_ORDER_ID,
  IT.TRANSACTION_DATE; 

CUST_ORDER_ID   PART_ID             QTY WAREHOUSE_ID LOCATION_ID  QTY_ON_HAND LAST_TRAN
--------------- ------------ ---------- ------------ ------------ ----------- ---------
ZQHZG           2J-2908               1 FIB          AB7A10                30 21-JUN-10
ZQHZZ           2J-2909               1 FIB          AB7A10                 1 21-JUN-10
ZIHKR           2J0836               48 HIBB         A77Y3               1092 23-NOV-09
ZQKZH           2J0836              111 HIBB         A77Y3               1092 23-NOV-09
ZQIHZ           2J0836               72 HIBB         A77Y3               1092 23-NOV-09
ZKOHI           2J4120                2 PIPABTONA    AB7A10                 2 16-NOV-95
ZKOHQ           2J4129                2 PIPABTONA    AB7A10                 2 16-NOV-95
ZKGQG           2J9785                9 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                1 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                7 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                2 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                6 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                4 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               10 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                8 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                4 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               10 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                4 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               12 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785                2 HIBB         AB7A10                 2 02-DEC-98
ZKGQG           2J9785               11 HIBB         AB7A10                 2 02-DEC-98
...

—–

Let’s assume that the OP’s intended result changes – the OP just wants to know which PART_IDs have not had any transactions in the last 89 day.  So, the OP is interested in those PART_IDs with a last transaction that was either 90 or more days ago, or there has never been a transaction for the PART_ID.  Oh, scrap… start over.

First, we will identify the list of PART_IDs that have had transactions in the last 89 days (CLASS values of ‘A’ are adjustments, not true inventory transactions, so rows with those CLASS values will be excluded).  Once we have that list, we can simply subtract the list from the entire list of PART_IDs:

SELECT DISTINCT
  PART_ID
FROM
  INVENTORY_TRANS
WHERE
  TRANSACTION_DATE > TRUNC(SYSDATE-90)
  AND PART_ID IS NOT NULL
  AND CLASS<>'A'; 

PART_ID
---------
2J-2908
9F4810
867-8649
90772357L
90772341L
90772297L
8475047
8145432
90772326L
9340730
90772357
90772347
782-9484
...

The OP is interested in seeing the actual warehouse location containing the PART_IDs, so we need to use the PART_LOCATION table, not the PART table as we did before.  We then need to create a left outer join between the above list and the PART_LOCATION table – essentially, we will find cases where there is a row in the PART_LOCATION table without a corresponding part ID in the above list.  ANSI syntax will be used here for the outer join, because it is still unknown whether the OP is running an Oracle Database backend or a SQL Server backend:

SELECT
  PL.PART_ID,
  PL.WAREHOUSE_ID,
  PL.LOCATION_ID,
  IT.PART_ID,
  PL.QTY
FROM
  PART_LOCATION PL
LEFT OUTER JOIN
  (SELECT DISTINCT
     PART_ID
   FROM
     INVENTORY_TRANS
   WHERE
     TRANSACTION_DATE > TRUNC(SYSDATE-90)
     AND PART_ID IS NOT NULL
     AND CLASS<>'A') IT
ON
  PL.PART_ID=IT.PART_ID
  AND IT.PART_ID IS NULL
WHERE
  PL.QTY<>0
ORDER BY
  PL.PART_ID,
  PL.WAREHOUSE_ID,
  PL.LOCATION_ID; 

PART_ID                WAREHOUSE_ID LOCATION_ID  PART_ID             QTY
---------------------- ------------ ------------ ------------ ----------
#2                     HIBB         AB7A10                             4
#46954                 HIBB         TOOL CRAB                         15
#4HWG LHRK GROUND WIRK HIBB         TOOL CRAB                        250
#4THHN                 HIBB         AB7A10                          2460
#5                     HIBB         AB7A10                             3
#6                     HIBB         TOOL CRAB                          4
#649                   HIBB         TOOL CRAB                          2
#655                   HIBB         TOOL CRAB                          4
#709                   HIBB         TOOL CRAB                          2
#889                   HIBB         TOOL CRAB                          1
#89 KJTKNSION          HIBB         TOOL CRAB                          3
#90                    HIBB         AB7A10                             4
#92 KJTKNSION          HIBB         TOOL CRAB                          2
*G-WM370H-MT           HIBB         AB7A10                             2
.235-H522              BURN         AB7A10                            .5
.235RD-9K0045          BURN         AB7A10                            .7
.260RDJ.963-HSTMH57    BURN         AB7A10                        2.5625
.35-H596-845           PIPABTONA    AB7A10                         4.039
...

In the above, notice that there is a seeminly odd ON clause: “PL.PART_ID=IT.PART_ID  AND IT.PART_ID IS NULL” – that simply means that when resolving the left outer join, a value must be found on the left side, but a corresponding value should not be found on the right side of the join.  This is the “minus” syntax that I selected to use here (the SQL MINUS syntax could not be used because the WAREHOUSE_ID, LOCATION_ID, and QTY column values returned by the query of the INVENTORY_TRANS table would have to exactly match the values found in the PART_LOCATION table… and that is more difficult to accomplish).

If I was certain that the OP was running with an Oracle Database backend, I would have suggested the following syntax, rather than the ANSI syntax as shown above:

SELECT
  PL.PART_ID,
  PL.WAREHOUSE_ID,
  PL.LOCATION_ID,
  IT.PART_ID,
  PL.QTY
FROM
  PART_LOCATION PL,
  (SELECT DISTINCT
     PART_ID
   FROM
     INVENTORY_TRANS
   WHERE
     TRANSACTION_DATE > TRUNC(SYSDATE-90)
     AND PART_ID IS NOT NULL
     AND CLASS<>'A') IT
WHERE
  PL.PART_ID=IT.PART_ID(+)
  AND IT.PART_ID IS NULL
  AND PL.QTY<>0
ORDER BY
  PL.PART_ID,
  PL.WAREHOUSE_ID,
  PL.LOCATION_ID; 

Why the NULL values in the fourth column (the second PART_ID column) of the output?  That NULL simply means that the query is working exactly as intended.  That fourth column would be showing only those PART_IDs that have had transactions in the last 89 days (and NULL if no transactions in the last 89 days) if the restriction “AND IT.PART_ID IS NULL” was NOT included in the ON clause – but such rows should not be included in the output if the output is to be used as intended.





Simple SQL with and without Inline Views

21 10 2011

October 21, 2011

(Forward to the Next Post in the Series)

Sometimes it is interesting to take a step back from some of the more difficult to construct SQL statements, and just put together something simple.  The following request recently arrived through an ERP mailing list:

“I want to find the last ship date for all the open part numbers in Customer Order Entry. The last ship date is not necessarily from the Open order in Customer Order Entry…”

The original poster (OP) provided more information, and even provided a description of what was already tried as a solution.  It is not exceptionally clear what the OP wanted, but because I have several years of experience working with this particular ERP system, I had a feeling that the OP might be headed in the wrong direction.  This ERP system uses tables that are well normalized, which would make it a bit more difficult to locate the most recent ship date for a part number that is recorded with one customer order’s tables’ rows when another customer order’s tables’ rows are reported.  Unless, of course, we remember that there is also a transaction table that records every inventory movement of part numbers through the system – we just need to know what those rows in the transaction table look like.  In this case, the rows have the following characteristics:

  • The CUST_ORDER_ID column value must not be NULL
  • The TYPE column must have a value of ‘O’
  • The CLASS column must have a value of ‘I’

With that knowledge, we could then find the most recent date that any part number shipped with a SQL statement similar to the following:

SELECT
  PART_ID,
  MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS
WHERE
  CUST_ORDER_ID IS NOT NULL
  AND TYPE='O'
  AND CLASS='I'
GROUP BY
  PART_ID;

In the above, the AS keyword is optional on an Oracle Database platform, but it is not optional on some other database platforms.

With an Oracle Database backend, inline views may be used to permit the above SQL statement to be used to retrieve additional information for a parent SQL statement.  If we wrap the above SQL statement in ( ) and place it in the FROM clause of the SQL statement, Oracle will treat the data returned by the above SQL statement quite similar to how it would handle a regular table (the Oracle query optimizer might decide to rewrite the combined SQL statement into an equivalent form that no longer contains an inline view).  There is always a chance that a part number that is on order may have never shipped to a customer, so we will need to outer join (in this case using a left outer join) to the newly created inline view:

SELECT
  COL.CUST_ORDER_ID,
  COL.PART_ID,
  LS.LAST_TRANSACTION_DATE
FROM
  CUSTOMER_ORDER CO,
  CUST_ORDER_LINE COL,
  (SELECT
    PART_ID,
    MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
  FROM
    INVENTORY_TRANS
  WHERE
    CUST_ORDER_ID IS NOT NULL
    AND TYPE='O'
    AND CLASS='I'
  GROUP BY
    PART_ID) LS
WHERE
  CO.ID=COL.CUST_ORDER_ID
  AND CO.STATUS IN ('U','F','R')
  AND COL.LINE_STATUS='A'
  AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY
  AND COL.PART_ID=LS.PART_ID(+);

In the above, I joined the CUSTOMER_ORDER and CUST_ORDER_LINE tables to the inline view that I aliased as LS.  A third table, CUST_LINE_DEL, that optionally contains the delivery schedule for some of the rows in the CUST_ORDER_LINE table, could have also been outer joined to the CUST_ORDER_LINE table.

Running SQL Server, or just feeling ANSI?  If so, the above may be rewritten as follows:

SELECT
  COL.CUST_ORDER_ID,
  COL.PART_ID,
  LS.LAST_TRANSACTION_DATE
FROM
  CUSTOMER_ORDER CO
JOIN
  CUST_ORDER_LINE COL
ON
  CO.ID=COL.CUST_ORDER_ID
LEFT OUTER JOIN
  (SELECT
    PART_ID,
    MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
  FROM
    INVENTORY_TRANS
  WHERE
    CUST_ORDER_ID IS NOT NULL
    AND TYPE='O'
    AND CLASS='I'
  GROUP BY
    PART_ID) LS
ON
  COL.PART_ID=LS.PART_ID
WHERE
  CO.STATUS IN ('U','F','R')
  AND COL.LINE_STATUS='A'
  AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY;

Unfortunately, the OP is actually using a SQLBase database backend that does not support inline views.  I remember the feeling before I discovered that Oracle Database supported inline views… in that case I would do something like the following:

  1. Create a statically defined view.
  2. Join to that statically defined view just as if the view were a table.
CREATE VIEW CUST_ORDER_PART_LS AS
SELECT
  PART_ID,
  MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
  INVENTORY_TRANS
WHERE
  CUST_ORDER_ID IS NOT NULL
  AND TYPE='O'
  AND CLASS='I'
GROUP BY
  PART_ID;

SELECT
  COL.CUST_ORDER_ID,
  COL.PART_ID,
  LS.LAST_TRANSACTION_DATE
FROM
  CUSTOMER_ORDER CO,
  CUST_ORDER_LINE COL,
  CUST_ORDER_PART_LS LS
WHERE
  CO.ID=COL.CUST_ORDER_ID
  AND CO.STATUS IN ('U','F','R')
  AND COL.LINE_STATUS='A'
  AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY
  AND COL.PART_ID=LS.PART_ID(+);

I guess that it is a good exercise once in a while to practice simple SQL.

A second example from the ERP mailing list showed a SQL Server solution for a particular problem.  The particular problem that the SQL Server solution set out to solve is as follows:

“Someone posed an interesting question to me – How can you do a mass insert of Operations? Well most of us know how to manage that without too much trouble but this one came with a couple of caveats – The Master’s first operation (lowest numbered) has to have a particular ResourceID and then a different but specific ResourceID must be on a subsequent Operation. It is after the second operation where the insert must be placed.

Sounds like fun – So I figured that it could be done in a single SQL statement – well after a 4-way join of the Operation table to itself I got it to work how I expected…”

Interesting, I think that there are times in the past that I have used multiple self-joins to a table in order to solve similar SQL problems.  However, there is an easier way using analytic functions.  The following SQL statement attempts to indicate: the sequence of the operation within the work order sub ID, the sequence number that follows sequentially, the second sequence number that follows sequentially, and the resource ID of the next operation in sequential order by the operation number:

SELECT
  WORKORDER_BASE_ID,
  WORKORDER_LOT_ID,
  WORKORDER_SPLIT_ID,
  WORKORDER_SUB_ID,
  SEQUENCE_NO,
  RESOURCE_ID,
  ROW_NUMBER() OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                     ORDER BY RESOURCE_ID) AS RN,
  LEAD(RESOURCE_ID,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                            ORDER BY RESOURCE_ID) AS NEXT_RESOURCE_ID,
  LEAD(SEQUENCE_NO,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                            ORDER BY RESOURCE_ID) AS NEXT_SEQ_NO,
  LEAD(SEQUENCE_NO,2) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                            ORDER BY RESOURCE_ID) AS NEXT_NEXT_SEQ_NO
FROM
  OPERATION
WHERE
  WORKORDER_TYPE='M'; 

Next, we only care about those rows when our resource ID of interest (69 in my example) is specified in the first operation on the work order sub ID and the second operation on that work order sub ID specifies the other resource ID that is of interest (FW in my example).  If we are to avoid primary key violations, we should select a new sequence number that is half way between the next-next operation and the next operation:

SELECT
  WORKORDER_BASE_ID,
  WORKORDER_LOT_ID,
  WORKORDER_SPLIT_ID,
  WORKORDER_SUB_ID,
  SEQUENCE_NO,
  NEXT_SEQ_NO,
  NEXT_NEXT_SEQ_NO,
  NEXT_SEQ_NO + ROUND((COALESCE(NEXT_NEXT_SEQ_NO,NEXT_SEQ_NO+10) - NEXT_SEQ_NO)/2) NEW_SEQUENCE_NO,
  'NEW RESOURCE' RESOURCE_ID
FROM
  (SELECT
    WORKORDER_BASE_ID,
    WORKORDER_LOT_ID,
    WORKORDER_SPLIT_ID,
    WORKORDER_SUB_ID,
    SEQUENCE_NO,
    RESOURCE_ID,
    ROW_NUMBER() OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                       ORDER BY RESOURCE_ID) AS RN,
    LEAD(RESOURCE_ID,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                              ORDER BY RESOURCE_ID) AS NEXT_RESOURCE_ID,
    LEAD(SEQUENCE_NO,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                              ORDER BY RESOURCE_ID) AS NEXT_SEQ_NO,
    LEAD(SEQUENCE_NO,2) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
                              ORDER BY RESOURCE_ID) AS NEXT_NEXT_SEQ_NO
  FROM
    OPERATION
  WHERE
    WORKORDER_TYPE='M')
WHERE
  RN=1
  AND RESOURCE_ID='69'
  AND NEXT_RESOURCE_ID='FW';

WORKORDER_ WOR WOR WOR SEQUENCE_NO NEXT_SEQ_NO NEXT_NEXT_SEQ_NO NEW_SEQUENCE_NO RESOURCE_ID
---------- --- --- --- ----------- ----------- ---------------- --------------- ------------
231610     0   0   0            10         777              888             833 NEW RESOURCE
237680     0   0   0            10         777              888             833 NEW RESOURCE
32018917X  0   0   0            10         777              888             833 NEW RESOURCE
3201B8920  0   0   0            10         777              888             833 NEW RESOURCE
3201C8765  0   0   0            10         777              888             833 NEW RESOURCE
3201G8639  0   0   0            10         777              888             833 NEW RESOURCE
3201G9003  0   0   0            10         777              888             833 NEW RESOURCE
3201J8772  0   0   0            10         777              888             833 NEW RESOURCE
3201J8850  0   0   0            10         777              888             833 NEW RESOURCE
3201K8669  0   0   0            10         777              888             833 NEW RESOURCE
3201M8281  0   0   0            10         777              888             833 NEW RESOURCE
3201M8489  0   0   0            10         777              888             833 NEW RESOURCE
3201N8516  0   0   0            10         777              888             833 NEW RESOURCE
3201N8776  0   0   0            10         777              888             833 NEW RESOURCE
3201Q8545  0   0   0            10         777              888             833 NEW RESOURCE
3201T8964  0   0   0            10         777              888             833 NEW RESOURCE
3201T8964X 0   0   0            10          20               30              25 NEW RESOURCE
3201V8524  0   0   0            10         777              888             833 NEW RESOURCE
3201V8966  0   0   0            10         777              888             833 NEW RESOURCE
3201V8966X 0   0   0            10          20               30              25 NEW RESOURCE
3201W8967  0   0   0            10         777              888             833 NEW RESOURCE
3201W8967X 0   0   0            10          20               30              25 NEW RESOURCE
3201Y8423  0   0   0            10         777              888             833 NEW RESOURCE
3201Z8996  0   0   0            10         777              888             833 NEW RESOURCE

24 rows selected. 

With the above, we are well on our way to perfoming a mass insert into this table.  If I recall correctly, the above must first be inserted into another table (a temp table, preferrably) and then the rows may be inserted into the OPERATION table (the table that is the row source for the analytic functions used in the SQL statement).





Brain Teaser: Why is this Query Performing a Full Table Scan

14 09 2011

September 14, 2011

While taking a look at the OTN threads this morning, I found an interesting test case, where the OP was asking why a query used a full table scan and not an index range scan.  The OP would like for the query to use the index without using a hint in the query.  So, why doesn’t the OP’s test case use an index range scan?  Just for fun I will state that my first two initial guesses were not quite on target.

A slightly modified table creation script of the setup for the OP’s test case:

DROP TABLE T1 PURGE;

CREATE TABLE T1 AS
SELECT
  *
FROM
  DBA_OBJECTS
WHERE
  STATUS='VALID';

UPDATE T1 SET STATUS='INVALID' WHERE ROWNUM=1;
COMMIT;

CREATE INDEX IND_T1_STATUS ON T1(STATUS); 

Let’s take a look at the data distribution in the table:

SELECT
  STATUS,
  CNT,
  ROUND((RATIO_TO_REPORT(CNT) OVER ())*100,6) PERCENT
FROM
  (SELECT
    STATUS,
    COUNT(*) CNT
  FROM
    T1
  GROUP BY
    STATUS);

STATUS         CNT    PERCENT
------- ---------- ----------
INVALID          1    .001513
VALID        66095  99.998487 

99.998% of the table’s rows have a STATUS of VALID with just a single row having a STATUS of invalid.

Now let’s collect the statistics for the table and index and check the execution plan:

ANALYZE INDEX IND_T1_STATUS COMPUTE STATISTICS;
ANALYZE TABLE T1 COMPUTE STATISTICS;

SET AUTOTRACE TRACEONLY EXPLAIN
SET PAGESIZE 1000
SET LINESIZE 140

SELECT
  *
FROM
  T1
WHERE
  STATUS='INVALID';

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 33048 |  3227K|   265   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T1   | 33048 |  3227K|   265   (1)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("STATUS"='INVALID') 

If you were assigned to help the OP achieve his task with the test case, what would you do?





The New Order Oracle Coding Challenge 4 – Tic Tac Toe

12 08 2011

(Back to the Previous Post in the Series)

Tic Tac Toe, the game of X’s and O’s, was an oddly popular game in elementary school.  When playing the game you quickly learn a couple of rules:

  • Because X always places his mark first (alternating between X and O), there is an unfair advantage for the player placing the X marks.
  • The player placing his mark in the center square typically has an advantage in the game.
  • Once a player scores three marks in a row horizontally, vertically, or diagonally, there is no point in continuing the game.

-

-

-

-

-

-

-

Consider the above game matches.  In the first two matches both X and O score three in a row, but which player won the match – it depends on the order in which the marks were placed.  In the third match, X won with a diagnal three in a row.  The final match resulted in a tie (neither player won).  You can use a SQL statement similar to the following to output all of the roughly 362,000 combinations (note that not all combinations are unique – rotating the board 90 or 180 degrees to generate a unique combination probably is not fair, and the order in which the marks are placed could matter):

WITH
 N AS
  (SELECT
    ROWNUM N
  FROM
    DUAL
  CONNECT BY
    LEVEL<=9),
 C AS
  (SELECT
    'XOX'||'OXO'||'XOX' C
  FROM
    DUAL)
SELECT
  SUBSTR(C.C,N1.N,1) ||
  SUBSTR(C.C,N2.N,1) ||
  SUBSTR(C.C,N3.N,1) || CHR(10) ||
  SUBSTR(C.C,N4.N,1) ||
  SUBSTR(C.C,N5.N,1) ||
  SUBSTR(C.C,N6.N,1) || CHR(10) ||
  SUBSTR(C.C,N7.N,1) ||
  SUBSTR(C.C,N8.N,1) ||
  SUBSTR(C.C,N9.N,1) GAME
FROM
  N N1,
  N N2,
  N N3,
  N N4,
  N N5,
  N N6,
  N N7,
  N N8,
  N N9,
  C
WHERE
  N1.N<>N2.N
  AND N1.N<>N3.N
  AND N1.N<>N4.N
  AND N1.N<>N5.N
  AND N1.N<>N6.N
  AND N1.N<>N7.N
  AND N1.N<>N8.N
  AND N1.N<>N9.N
  AND N2.N<>N3.N
  AND N2.N<>N4.N
  AND N2.N<>N5.N
  AND N2.N<>N6.N
  AND N2.N<>N7.N
  AND N2.N<>N8.N
  AND N2.N<>N9.N
  AND N3.N<>N4.N
  AND N3.N<>N5.N
  AND N3.N<>N6.N
  AND N3.N<>N7.N
  AND N3.N<>N8.N
  AND N3.N<>N9.N
  AND N4.N<>N5.N
  AND N4.N<>N6.N
  AND N4.N<>N7.N
  AND N4.N<>N8.N
  AND N4.N<>N9.N
  AND N5.N<>N6.N
  AND N5.N<>N7.N
  AND N5.N<>N8.N
  AND N5.N<>N9.N
  AND N6.N<>N7.N
  AND N6.N<>N8.N
  AND N6.N<>N9.N
  AND N7.N<>N8.N
  AND N7.N<>N9.N
  AND N8.N<>N9.N; 

The output of the above SQL statement should appear similar to the following:

GAME
----
...

XXX
OOO
OXX

XXX
OOO
OXX

...

XOX
XOO
OXX

OXX
XOO
OXX

XXO
XOO
XOX

XXO
XOO
XOX

XOX
XOO
XOX 

...

The big problem with the above SQL statement is that it is not clear which player won in all cases.  Ideally, the N1.N value would be used to output an X mark in the specified position, the N2.N value would be used to output an O mark in a specified position, the N3.N value would be used to output an X in the specified position, etc. until one of the players places three marks in a row.  For example, if N1.N is 5, an X would be placed in the center square, if N2.N is 9, an O would be placed in the bottom right square.

Now that we know that the order in which the marks are placed is important, how would be know when a player wins?  You could experiment with the following:

Vertical win, with three positions having the same resulting values:

SELECT
  MOD(position - 1, 3) + 1 V
FROM
  DUAL; 

Horizontal win, with three positions having the same resulting values:

SELECT
  TRUNC((position - 1) / 3) + 1 V
FROM
  DUAL;

Diagnal win \ when V=0 in three positions:

SELECT
  (MOD(position - 1, 3) + 1) - (TRUNC((position - 1) / 3) + 1) V
FROM
  DUAL;

Diagnal win / when V=0 (watch out, could end up in a V pattern) or V=2:

SELECT
  ABS((MOD(position - 1, 3) + 1) - (TRUNC((position - 1) / 3) + 1)) V
FROM
  DUAL; 

———–

OK, now that I have explained the game, given you a couple of SQL statements to possibly help you with the solution… on to the challenge.  With the help of Oracle Database build a tic tac toe solver that will help a player win at tic tac toe.  Provided a current description of the board, for example ‘_X_’||’O_X’||’OX_’, display all board solutions that allows player O (or player X if his turn is next) to win.  (Side note: I have not yet completed the solution to this challenge – it might be possible to accomplish this challenge with just a SQL statement.)





The New Order Oracle Coding Challenge 3 – Mind Boggle

5 08 2011

August 5, 2011 (Modified August 7, 2011)

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

In part 1 of this series the challenge was to simply reverse the order of digits in the numbers from 1 to 1,000,000 to find that cases where the numbers formed by the reverse ordered digits evenly divided into the original number.  In part 2 of this series the challenge required examining all of the numbers between 1000 and 9999, where arranging the digits of the original number into any of 23 valid combinations resulted in a new number that cleanly divided into the original four digit number.  There were several different solutions provided to the two challenges, so now it is time to move on to part three of the series.

In part 1 of this blog article series I mentioned playing a game years ago that used letters on the face of dice – the dice were rolled, and then the challenge was to find all words that could be completely spelled using the letters on the top of the dice.  I was not very good at the game, so I enlisted the help of a computer.  One such dice game is called Boggle, and that game’s name is probably fitting for today’s challenge.  Imagine that you played this game and the following letters appeared on the top of the dice:

One of the rules of the game requires that words must be at least 3 letters in length, for example: you say melee eye (I) see elfs file some mail (OK, the word I is too short, but we can have some fun with the words that are found).  As you might be able to guess, there are a lot of possible combinations of the 16 letters found on the dice, some of which are valid words.  If we just consider the 5 letter, 4 letter, and 3 letter combinations of the dice, there are more than a half million possible combinations (in the following table, multiply the numbers across and add the results for each row) – no wonder I needed the computer’s help with these puzzles.

16 15 14 13 12   = 16! / 11!
16 15 14 13     = 16! / 12!
16 15 14       = 16! / 13!
            = 571,200

To make the full challenge of finding words a little easier, let’s break the challenge into a couple of parts:

 Part 1: Consider the 2 x 2 letter arrangement at the left.  With the help of Oracle Database, list all of the three letter combinations of those four letters.  There will be 4 * 3 * 2 = 24 possible combinations of the letters.

 

 

 Part 2: Consider the 4 x 4 letter arrangement at the left.  With the help of Oracle Database, list all of the four letter combinations of those 16 letters.  There will be 16 * 15 * 14 * 13 = 43,680 possible combinations of the letters.

-

-

-

-

-

Part 3: Consider the 4 x 4 letter arrangement above.  With the help of Oracle Database, list all of the three, four, five, and six letter combinations of those 16 letters.  If you see any seven letter words in the above set of letters, you might as well retrieve those letter combinations also.  How many letter combinations do you have in total for part 3?

Part 4: Extra Credit: How many of the letter combinations generated in part 3 above are valid U.S. or U.K. English words?  List the words.

Part 5: Extra, Extra Credit: List any words found in the letters at the left that have any connection to Oracle Corporation.  Remember that a letter can only be used as many times in a single word as it appears at the left (if you can form a word with three letter A’s that have a connection to Oracle Corp., go for it.).

-

-

-

-

-

-

-

-

-

-

-

-

-

-

Added August 7, 2011:

When I put together this challenge I did not think that it was possible to complete Part 4 Extra Credit using just SQL.  I was fairly certain that there were some interesting techniques to retrieve HTML content with the help of PL/SQL, but I had not worked out a solution that utilized that technique.  As I write this, Radoslav Golian in the comments section appears to have both a PL/SQL and a SQL solution that uses the dictionary.reference.com website to validate the words (only 6 words to avoid a denial of service type attack on the dictionary.reference.com website).  One of the approaches that I considered, but did not develop, is something similar to how Radoslav verified the words, but I would use a VBS script to submit the request and check the result as is demonstrated in these two articles: Submit Input to an ASP Web Page and Retrieve the Result using VBS and Use VBS to Search for Oracle Books using Google’s Book Library.

The solution that I put together for Part 4 Extra Credit started with an Excel macro that I posted in another blog article, which was then converted to PL/SQL.  I then transformed the PL/SQL for use in this article, and generated a new Excel macro from the PL/SQL code.  The Excel macro (along with the calling code looks like this:

Sub StartBoggle()
    Call Boggle("ESOIMEFOALEUSAYE", 6, 3)
End Sub

Sub Boggle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer)
    Dim i As Integer
    Dim strCharacter(20) As String
    Dim intCharacterIndex(20) As Integer
    Dim intCharacters As Integer
    Dim intCharactersMax As Integer
    Dim intCharactersMin As Integer
    Dim intNumberOfSuppliedCharacters As Integer
    Dim intAdjustmentPosition As Integer
    Dim intFlag As Integer
    Dim strOutput As String
    Dim strWords(10000) As String
    Dim intWordCount As Integer
    Dim intFilenum As Integer

    intFilenum = FreeFile
    Open "C:\Words " & strCharacters & ".txt" For Output As #intFilenum

    If intMaxWordLength = 0 Then
        intCharactersMax = Len(strCharacters)
    Else
        If intMaxWordLength <= Len(strCharacters) Then
            intCharactersMax = intMaxWordLength
        Else
            intCharactersMax = Len(strCharacters)
        End If
    End If

    If intMinWordLength = 0 Then
        intCharactersMin = 3
    Else
        If intMaxWordLength < intMinWordLength Then
            intCharactersMin = intCharactersMax
        Else
            intCharactersMin = intMinWordLength
        End If
    End If

    intNumberOfSuppliedCharacters = Len(strCharacters)

    For i = 1 To intNumberOfSuppliedCharacters
        strCharacter(i) = Mid(strCharacters, i, 1)
    Next i

    intCharacters = intCharactersMin - 1
    intWordCount = 0

    Do While intCharacters < intCharactersMax
        intCharacters = intCharacters + 1
        intAdjustmentPosition = 1
        For i = 1 To intCharacters
            intCharacterIndex(i) = i
        Next i

        Do While intAdjustmentPosition > 0
            intFlag = 0
            For i = 1 To intAdjustmentPosition - 1
                If intCharacterIndex(i) = intCharacterIndex(intAdjustmentPosition) Then
                    ' Found a duplicate index position in the other values to the left
                    intFlag = 1
                    Exit For
                End If
            Next i

            If intFlag = 1 Then
                ' Try the next index position in this element
                intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
            Else
                If intAdjustmentPosition = intCharacters Then
                    ' Output
                    strOutput = ""
                    For i = 1 To intCharacters
                        strOutput = strOutput & strCharacter(intCharacterIndex(i))
                    Next i

                    intFlag = 0
                    For i = intWordCount To 1 Step -1
                        If strOutput = strWords(i) Then
                            intFlag = 1
                            Exit For
                        End If
                    Next i
                    If intFlag = 0 Then
                        If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then
                            intWordCount = intWordCount + 1
                            strWords(intWordCount) = strOutput

                            Print #intFilenum, strOutput
                            Debug.Print strOutput
                        End If
                    End If

                    If intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters Then
                        ' No more available values in the last position
                        intCharacterIndex(intAdjustmentPosition) = 1
                        intAdjustmentPosition = intAdjustmentPosition - 1
                        If intAdjustmentPosition > 0 Then
                            intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
                        End If
                    Else
                        intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
                    End If
                Else
                    ' No duplicate so prepare to check the next position
                    intAdjustmentPosition = intAdjustmentPosition + 1
                End If
            End If

            Do While (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters)
                ' Roll back one index position as many times as necessary
                intCharacterIndex(intAdjustmentPosition) = 1
                intAdjustmentPosition = intAdjustmentPosition - 1
                If intAdjustmentPosition > 0 Then
                    intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
                End If
            Loop ' (intAdjustmentPosition > 0) And
        Loop 'intAdjustmentPosition > 0
    Loop 'intCharacters < intCharactersMax

    Close #intFilenum
End Sub 

The Excel macro builds letter combinations that are between the minimum and maximum length, and then tests those letter combinations using the built-in dictionary that is in Excel.  I had a little bit of difficulty coming up with a way to generate the letter combinations of variable length, so I settled on a custom developed technique – I would simply keep track of the original character positions, manipulate those original character positions, and then output the corresponding characters.  The challenge is then how does one verify that the same character position is not used more than once in a single word? 

 The method that I came up with is as follows, which assumes that we are trying to build four letter words from the supplied 16 letters.  We can start with the seed combination 1,2,3,4.  The idea is to work from left to right, and then back to the left.  Every time to make it to the right, we output a word, when we make it all the way back to the left (just before the number 1 in the above), we are done.  The rules are simple:

  • Increment the number in a position, and if that number does not appear in a position to the left, move one position to the right.
  • When the maximum character number (16 in this example) is exceeded in a position, reset the number to 1, move one position to the left, and increment the value in the new position by 1.
  • In the last position the character number should be incremented as many times as necessary to reach the maximum character number – each time a potential new combination will be generated.

 But there is a problem with this approach – it does not use Oracle Database!

-

Let’s go back to the PL/SQL function from which I created the Excel function (I have not worked much with pipelined functions – so there may be one or two errors):

CREATE OR REPLACE FUNCTION BOGGLE_VAR_LENGTH(strCHARACTERS IN VARCHAR2, intMaxWordLength IN NUMBER, intMinWordLength IN NUMBER) RETURN SYS.AQ$_MIDARRAY PIPELINED
AS
  TYPE NUMBER_ARRAY IS TABLE OF NUMBER INDEX BY PLS_INTEGER;
  TYPE CHARACTER_ARRAY IS TABLE OF VARCHAR(1) INDEX BY PLS_INTEGER;
  strCharacter CHARACTER_ARRAY;
  intCharacterIndex NUMBER_ARRAY;
  intCharacters NUMBER;
  intCharactersMax NUMBER;
  intCharactersMin NUMBER;
  intNumberOfSuppliedCharacters NUMBER;
  intAdjustmentPosition NUMBER;
  intFlag NUMBER;
  intI NUMBER;
  strOutput VARCHAR2(100);
BEGIN
  IF intMaxWordLength IS NULL THEN
    intCharactersMax := LENGTH(strCHARACTERS);
  ELSE
    IF intMaxWordLength <= LENGTH(strCHARACTERS) THEN
      intCharactersMax := intMaxWordLength;
    ELSE
      intCharactersMax := LENGTH(strCHARACTERS);
    END IF;
  END IF;

  IF intMinWordLength IS NULL THEN
    intCharactersMin := 3;
  ELSE
    IF intMaxWordLength < intMinWordLength THEN
      intCharactersMin := intCharactersMax;
    ELSE
      intCharactersMin := intMinWordLength;
    END IF;
  END IF;

  intNumberOfSuppliedCharacters := LENGTH(strCHARACTERS);

  FOR I IN 1.. intNumberOfSuppliedCharacters LOOP
    strCharacter(I) := SUBSTR(strCHARACTERS, I, 1);
  END LOOP;

  intCharacters := intCharactersMin - 1;
  WHILE intCharacters < intCharactersMax LOOP
    intCharacters := intCharacters + 1;
    intAdjustmentPosition := 1;
    FOR I IN 1 .. intCharacters LOOP
      intCharacterIndex(I) := I;
    END LOOP;

    WHILE intAdjustmentPosition > 0 LOOP
      intFlag := 0;
      FOR I IN 1 .. intAdjustmentPosition - 1 LOOP
        IF intCharacterIndex(I) = intCharacterIndex(intAdjustmentPosition) Then
          -- Found a duplicate index position in the other values to the left
          intFlag := 1;
        END IF;
      END LOOP;
      IF intFlag = 1 Then
        -- Try the next index position in this element
        intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
      ELSE
        IF intAdjustmentPosition = intCharacters Then
          -- Output
          strOutput := '';
          FOR i IN 1 .. intCharacters LOOP
            strOutput := strOutput || strCharacter(intCharacterIndex(i));
          END LOOP;

          PIPE ROW (strOutput);

          IF intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters THEN
            -- No more available values in the last position
            intCharacterIndex(intAdjustmentPosition) := 1;
            intAdjustmentPosition := intAdjustmentPosition - 1;
            IF intAdjustmentPosition > 0 THEN
              intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
            END IF;
          ELSE
            intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
          END IF;
        ELSE
          -- No duplicate so prepare to check the next position
          intAdjustmentPosition := intAdjustmentPosition + 1;
        END IF;
      END IF;

      WHILE (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters) LOOP
        -- Roll back one index position as many times as necessary
        intCharacterIndex(intAdjustmentPosition) := 1;
        intAdjustmentPosition := intAdjustmentPosition - 1;
        IF intAdjustmentPosition > 0 THEN
          intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
        END IF;
      END LOOP;
    END LOOP;
  END LOOP;
END;
/ 

 We are able to call the function from a SQL statement like this:

SELECT
  *
FROM
  TABLE(BOGGLE_VAR_LENGTH('ESOIMEFOALEUSAYE', 6, 3)); 

Remember that there are more than a half million character combinations for just the 3, 4, and 5 letter combinations – the above will as for 6,336,960 letter combinations to be generated.   But there is a problem with this approach – it does not verify that the letter combinations are actual words!

For fun, let’s see how many possible combinations will result if we allow 3, 4, 5, 6, 7, and 8 letter combinations:

Len                 Combinations  
8 16 15 14 13 12 11 10 9 518,918,400 = 16! / 8!
7 16 15 14 13 12 11 10   57,657,600 = 16! / 9!
6 16 15 14 13 12 11     5,765,760 = 16! / 10!
5 16 15 14 13 12       524,160 = 16! / 11!
4 16 15 14 13         43,680 = 16! / 12!
3 16 15 14           3,360 = 16! / 13!
                  582,912,960 582,912,960

That is more than a half billion combinations!  Warning, significant database server CPU consumption will result when generating all combinations.

Let’s take a look at the final solution that I created for Part 4 Extra, Extra Credit.  The solution is an Excel macro that calls the PL/SQL function through a SQL statement:

Sub StartBoggleOracle()
    Call BoggleOracle("ESOIMEFOALEUSAYE", 8, 3)
End Sub

Sub BoggleOracle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer)
    Dim strSQL As String
    Dim strUsername As String
    Dim strPassword As String
    Dim strDatabase As String
    Dim intFilenum As Integer

    Dim intCharacters As Integer
    Dim intCharactersMax As Integer
    Dim intCharactersMin As Integer
    Dim strOutput As String

    Dim dbDatabase As ADODB.Connection
    Dim snpData As ADODB.Recordset

    Set dbDatabase = New ADODB.Connection
    Set snpData = New ADODB.Recordset

    strUsername = "MyUsername"
    strPassword = "MyPassword"
    strDatabase = "MyDatabase"

    dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";FetchSize=5000;"
    dbDatabase.Open

    intFilenum = FreeFile
    Open "C:\WordsOracle " & strCharacters & ".txt" For Output As #intFilenum

    If intMaxWordLength = 0 Then
        intCharactersMax = Len(strCharacters)
    Else
        If intMaxWordLength <= Len(strCharacters) Then
            intCharactersMax = intMaxWordLength
        Else
            intCharactersMax = Len(strCharacters)
        End If
    End If

    If intMinWordLength = 0 Then
        intCharactersMin = 3
    Else
        If intMaxWordLength < intMinWordLength Then
            intCharactersMin = intCharactersMax
        Else
            intCharactersMin = intMinWordLength
        End If
    End If

    strSQL = "SELECT DISTINCT" & vbCrLf
    strSQL = strSQL & "  *" & vbCrLf
    strSQL = strSQL & "FROM" & vbCrLf
    strSQL = strSQL & "  (SELECT" & vbCrLf
    strSQL = strSQL & "    *" & vbCrLf
    strSQL = strSQL & "  FROM" & vbCrLf
    strSQL = strSQL & "    TABLE(BOGGLE_VAR_LENGTH('" & strCharacters & "', " & Format(intCharactersMax) & ", " & Format(intCharactersMin) & ")))" & vbCrLf
    strSQL = strSQL & "ORDER BY" & vbCrLf
    strSQL = strSQL & "  1"
    snpData.Open strSQL, dbDatabase

    If snpData.State = 1 Then
        Do While Not snpData.EOF
            strOutput = snpData(0)
            If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then
                Print #intFilenum, strOutput
                Debug.Print strOutput
            End If

            snpData.MoveNext
        Loop

        snpData.Close
    End If

    Close #intFilenum
    dbDatabase.Close
    Set snpData = Nothing
    Set dbDatabase = Nothing
End Sub

The words found appear to depend on the version of Excel – Excel 2010 seems to find more words than Excel 2007.

  • The 799 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished.  Words Oracle_ESOIMEFOALEUSAYE.txt
  • The 2,179 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished. Words Oracle_OSERIEFAARLNCAYL.txt

Excel found Ellison in the second word list.  For Part 5 Extra, Extra Credit, what other words connected to Oracle Corporation were found?





The New Order Oracle Coding Challenge 2

2 08 2011

August 2, 2011

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

In the previous article in this series we tried to develop different methods for identifying special numbers between 1 and 1,000,000, such that the number formed by reversing the order of the digits will evenly divide into the original number.  I initially predicted that there would be at least three unique solutions provided before the undocumented REVERSE function was mentioned.  A fellow Oak Table member quickly provided a solution using the REVERSE function, which happens to be the fastest of the various approaches.  Another reader noticed a pattern that helps find those special numbers – if that pattern identifies all of the special numbers, it is easily the most efficient approach to finding these special numbers.

Today’s blog article also is on the topic of finding special numbers, determined as special after changing the order of the number’s digits.  I used my fancy, on-point introduction for this article in the earlier article (letters on the dice, trying to form words from those letters), so I thought that I would jump the little pony over the edge of the cliff in search of another introduction (there are probably five readers with an odd image stuck in their head at this moment).   ;-)

Have you ever noticed the disparity between the Environment Protection Agency’s (EPA or the equivalent in other countries) estimates for fuel efficiency and the actual fuel efficiency calculated with the hand held calculator?  Take for instance a car that I own – that car has never achieved the EPA’s estimated fuel efficiency rating.  A 412 horsepower (at the flywheel as claimed by the manufacturer, other tests have concluded that it might be higher) V8 achieving 25 miles per U.S. gallon (MPG) of fuel?

Time for the Oracle Database test case.  On a frigid 32 degree day (in Celsius, that is a hot 90 degrees Fahrenheit), I lowered the windows, topped off the fuel tank with premium grade fuel, reset the car’s fuel efficiency gauge (the MPG gauge… oddly, this gauge always reads 6% to 8% lower than the actual fuel efficiency), and then set out on a roughly 117 mile (188 KM) drive down the highway with the radar detector on the dash board.  While I have heard that the car’s speed limit is set at the factory to roughly 160 miles per hour (258 KM/H), the posted speed limit was typically 55 MPH (miles per hour) except in the roughly half-dozen or so cities along the way.  The car never did hit the EPA’s estimated fuel efficiency rating.  Somebody was playing with the numbers a bit, I think. :-)

-

-

-

-

-

-

-

-

Along the route selected for the 117 mile test loop, I captured cell phone pictures showing the car’s fuel efficiency gauge (MPG gauge).  Below are the cell phone pictures captured at 15 miles into the drive, roughly half way through the drive, and near the end of the 117 mile drive. 

So, how does Oracle Database play a part in the introduction to this blog article?  Let’s take a guess at the car’s actual fuel efficiency using a SQL statement:

SELECT
  29.5 CAR,
  ROUND(29.5 * 1.06, 2) LOW_ACTUAL,
  ROUND(29.5 * 1.08, 2) HIGH_ACTUAL
FROM
  DUAL;

       CAR LOW_ACTUAL HIGH_ACTUAL
---------- ---------- -----------
      29.5      31.27       31.86

So much for EPA estimates – between 25.08% and 27.44% lower than the actual fuel efficiency.  Does this test case then suggest that if we do not use all of the features that are available in a tool, that we are then more efficient than expected?  I think that I need another test case to decide.  ;-)

Side note: For the tank of fuel when the car rolled over the 1,000 mile mark, the car achieved 28.75 miles per gallon while touring some of the sites around Michigan, specifically the hills and curves along the north-west shoreline of the lower peninsula (several of those hills were steep – for one hill in particular the car downshifted three gears to maintain speed going down the hill).  The car’s overall fuel efficiency since it left the factory is 25.58 MPG (excluding the 117 mile test case), so I guess that one could argue that if you TRUNC the fuel efficiency number, at the root the EPA is right.  Make like a tree and leave (that was a movie quote from “Back to the Future”).

After returning from the test case I set out on my other 412 HP vehicle… cough, that would be a 4 cylinder 12 horsepower 1946 Farmall A tractor.  Time to mow the lawn in second gear.  Cutting a 6 foot (1.83 meter) wide path through grass that is up to 12 inches (30.48 cm) high, and capable of mowing down 2 inch diameter trees as necessary.  Oh, the power of the horse…

OK, now that we have drifted far from the topic of this article and have had a chance to play with numbers in the long introduction, on to the challenge.  Consider the four digit integer numbers between 1,000 and 9,999.  There are 4! (1 * 2 * 3 * 4 = 24) possible combinations of the four digits of each of these numbers.  For example, consider the number 8712 – the 24 possible combinations are:

1278
1287
1728
1782
1827
1872
2178
2187
2718
2781
2817
2871
7128
7182
7218
7281
7812
7821
8127
8172
8217
8271
8712 8721

The objective is to find all combinations of the four digit numbers between 1,000 and 9,999 that evenly divide into the original number.  For the above example with the original number 8712, the combination 2178 evenly divides into 8712.  Thus, the output would include:

ORIGINAL COMBINATION
-------- -----------
    8712        2178

The restrictions:

  • The four digit original numbers between 1,000 and 9,999 should not be included in the output if the last digit of those numbers is 0 – the combination number may end with 0 if the original number included a 0 digit.  Thus, do not include the original numbers 1000, 1010, 1020, 1030, 1040, etc.
  • The original value cannot be equal to the combination value (in the above example, it is not valid to output the combination value 8712).
  • Each original number and combination pair should be output at most one time.
  • You may not reuse a digit position twice; all digit positions must be used once.  For example, the number 7141 includes the digit 1 twice, therefore, all generated combinations of that number must include two number 1 digits.

This challenge is a bit more difficult than was the challenge in part 1 of this series.  The greatest difficulty likely involves creating the 23 valid digit combinations (there are 24, but one of those combinations will always be invalid) of the four digit numbers. 

The problem is solvable – will your solution be the same as mine?  I will give the readers a couple of days to ponder the thoughts of why a horse entered a race between a car and a tractor before sharing my approach to solving the problem.








Follow

Get every new post delivered to your Inbox.

Join 143 other followers