April 13, 2010
Recently, a simple request came in from an ERP mailing list. The author wanted to modify a report used by the purchasing module of the ERP program to show additional information on the report. The purchasing module retrieves the data from the database, manipulates the data, and then pushes the manipulated data to a reporting package. Wisely, the ERP developer allowed end users to add “extended queries” to bring in additional information from the database to be displayed on the report – all that the end user needs to do is to select from a list of variables provided by the ERP system to link to the database tables, and the ERP system will automatically pass the request to the database using bind variables. Simple, right?
The table definitions look like this (primary key columns denoted by *):
DESC PURCHASE_ORDER Name Null? Type ------------------- -------- ------------ ID NOT NULL VARCHAR2(15) * VENDOR_ID NOT NULL VARCHAR2(15) CONTACT_FIRST_NAME VARCHAR2(30) CONTACT_LAST_NAME VARCHAR2(30) CONTACT_INITIAL VARCHAR2(2) ... DESC PURC_ORDER_LINE Name Null? Type ------------------- -------- ------------ PURC_ORDER_ID NOT NULL VARCHAR2(15) * LINE_NO NOT NULL NUMBER * PART_ID VARCHAR2(30) VENDOR_PART_ID VARCHAR2(30) SERVICE_ID VARCHAR2(15) USER_ORDER_QTY NOT NULL NUMBER(14,4) ORDER_QTY NOT NULL NUMBER(14,4) ... DESC PURC_LINE_DEL Name Null? Type ------------------- -------- ------------ PURC_ORDER_ID NOT NULL VARCHAR2(15) * PURC_ORDER_LINE_NO NOT NULL NUMBER * DEL_SCHED_LINE_NO NOT NULL NUMBER * DESIRED_RECV_DATE DATE ACTUAL_RECV_DATE DATE USER_ORDER_QTY NOT NULL NUMBER(14,4) RECEIVED_QTY NUMBER(14,4) ... DESC PART Name Null? Type ------------------- -------- ------------ ID NOT NULL VARCHAR2(30) * DESCRIPTION VARCHAR2(40) STOCK_UM NOT NULL VARCHAR2(15) PLANNING_LEADTIME NOT NULL NUMBER ORDER_POLICY NOT NULL CHAR(1) ORDER_POINT NUMBER(14,4) ...
The PURCHASE_ORDER table contains the header information for the order, the PURC_ORDER_LINE table contains the line level detail for the order, and the PURC_LINE_DEL table contains the dates on which the line level detail should be provided to the company and the requested quantities for each of the dates. While the delivery schedule’s date and order quantities already appear on the purchase order report, the previously received quantity and the delivery schedule line number need to be included on the report, along with a bit of other information. The problem? The ERP system does not provide a variable to target a specific delivery schedule line, but we do have the PURC_ORDER_ID (PO_ID variable), PURC_ORDER_LINE_NO (LN_LINE_NO), DESIRED_RECV_DATE (LN_DEL_SCHED_DATE), and USER_ORDER_QTY (LN_DEL_SCHED_QTY), so the first thought is to construct an extended query like this (line breaks are not permitted in the extended query, so this might be hard to read):
SELECT POL.PRODUCT_CODE , PO.STATUS , PLD.RECEIVED_QTY , PLD.DEL_SCHED_LINE_NO FROM PART P , PURCHASE_ORDER PO , PURC_ORDER_LINE POL , PURC_LINE_DEL PLD WHERE PO.ID = :pO_ID AND PO.ID=POL.PURC_ORDER_ID AND POL.LINE_NO = :LN_LINE_NO AND POL.PART_ID = P.ID(+) AND POL.PURC_ORDER_ID=PLD.PURC_ORDER_ID(+) AND POL.LINE_NO = PLD.PURC_ORDER_LINE_NO(+) AND PLD.DESIRED_RECV_DATE(+) = :LN_DEL_SCHED_DATE AND PLD.USER_ORDER_QTY(+) = :LN_DEL_SCHED_QTY
On trying the purchase order report with the extended query we are greeted with an invalid number error thrown by Oracle. Invalid number? Here is what a 10046 trace at level 4 showed:
BINDS #30: kkscoacd Bind#0 oacdty=96 mxl=32(06) mxlc=00 mal=00 scl=00 pre=00 oacflg=01 fl2=1000000 frm=01 csi=178 siz=320 off=0 kxsbbbfp=21c00ac8 bln=32 avl=06 flg=05 value="144038" Bind#1 oacdty=01 mxl=32(01) mxlc=00 mal=00 scl=00 pre=00 oacflg=21 fl2=1000000 frm=01 csi=178 siz=0 off=32 kxsbbbfp=21c00ae8 bln=32 avl=01 flg=01 value="1" Bind#2 oacdty=96 mxl=128(85) mxlc=00 mal=00 scl=00 pre=00 oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=64 kxsbbbfp=21c00b08 bln=128 avl=85 flg=01 value="2/1/2010 2/1/2010 3/22/2010 4/13/2010 4/16/2010 5/14/2010 6/11/2010 7/9/2010 " Bind#3 oacdty=96 mxl=128(49) mxlc=00 mal=00 scl=00 pre=00 oacflg=01 fl2=1000000 frm=01 csi=178 siz=0 off=192 kxsbbbfp=21c00b88 bln=128 avl=49 flg=01 value="8.00 8.00 8.00 16.00 0.00 8.00 8.00 8.00 "
Line 1 of this purchase order has 8 delivery schedule lines, and in this silly situation the extended query is executed only at the PURC_ORDER_LINE level, and not at the delivery schedule level – all of the delivery schedule information is passed into the report with the multiple lines in a single field, with CRLF characters separating the values in each of the delivery schedule lines. OK, that was unexpected. Now what, do we give up? No, we use a little creativity to format the data in the required format, with all of the delivery schedule rows rolled in a single row:
SELECT P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.DEL_SCHED_LINE_NO,', '),3),', ',CHR(13)||CHR(10))) DEL_SCHED_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.RECEIVED_QTY,', '),3),', ',CHR(13)||CHR(10))) RECEIVED_QTY FROM (SELECT PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO, PLD.DEL_SCHED_LINE_NO, PLD.RECEIVED_QTY, ROW_NUMBER() OVER (PARTITION BY PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO ORDER BY PLD.DEL_SCHED_LINE_NO) RN FROM PURC_LINE_DEL PLD WHERE PLD.PURC_ORDER_ID='144038' AND PLD.PURC_ORDER_LINE_NO=1) P CONNECT BY PRIOR RN=RN-1 START WITH RN=1 GROUP BY P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO; PURC_ORDER_ID PURC_ORDER_LINE_NO DEL_SCHED_LINE_ RECEIVED_QTY --------------- ------------------ --------------- ------------ 144038 1 1 8 2 8 3 8 4 16 5 0 6 0 7 0 8 0
Easy, right? The execution plan of the above looks like this:
--------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 51 | 4 (0)| 00:00:01 | | 1 | SORT GROUP BY | | 1 | 51 | 4 (0)| 00:00:01 | |* 2 | CONNECT BY WITH FILTERING | | | | | | |* 3 | FILTER | | | | | | | 4 | COUNT | | | | | | | 5 | VIEW | | 12 | 612 | 4 (0)| 00:00:01 | | 6 | WINDOW BUFFER | | 12 | 228 | 4 (0)| 00:00:01 | | 7 | TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL | 12 | 228 | 4 (0)| 00:00:01 | |* 8 | INDEX RANGE SCAN | SYS_C005716 | 12 | | 2 (0)| 00:00:01 | |* 9 | HASH JOIN | | | | | | | 10 | CONNECT BY PUMP | | | | | | | 11 | COUNT | | | | | | | 12 | VIEW | | 12 | 612 | 4 (0)| 00:00:01 | | 13 | WINDOW BUFFER | | 12 | 228 | 4 (0)| 00:00:01 | | 14 | TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL | 12 | 228 | 4 (0)| 00:00:01 | |* 15 | INDEX RANGE SCAN | SYS_C005716 | 12 | | 2 (0)| 00:00:01 | | 16 | COUNT | | | | | | | 17 | VIEW | | 12 | 612 | 4 (0)| 00:00:01 | | 18 | WINDOW BUFFER | | 12 | 228 | 4 (0)| 00:00:01 | | 19 | TABLE ACCESS BY INDEX ROWID | PURC_LINE_DEL | 12 | 228 | 4 (0)| 00:00:01 | |* 20 | INDEX RANGE SCAN | SYS_C005716 | 12 | | 2 (0)| 00:00:01 | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("RN"-1=PRIOR "RN") 3 - filter("RN"=1) 8 - access("PLD"."PURC_ORDER_ID"='144038' AND "PLD"."PURC_ORDER_LINE_NO"=1) filter("PLD"."PURC_ORDER_ID"='144038' AND "PLD"."PURC_ORDER_LINE_NO"=1) 9 - access("RN"-1=PRIOR "RN") 15 - access("PLD"."PURC_ORDER_ID"='144038' AND "PLD"."PURC_ORDER_LINE_NO"=1) filter("PLD"."PURC_ORDER_ID"='144038' AND "PLD"."PURC_ORDER_LINE_NO"=1) 20 - access("PLD"."PURC_ORDER_ID"='144038' AND "PLD"."PURC_ORDER_LINE_NO"=1) filter("PLD"."PURC_ORDER_ID"='144038' AND "PLD"."PURC_ORDER_LINE_NO"=1)
The above is fairly efficient, using the primary key index on the table. We will just slide the above into an inline view in the report’s extended query (replacing the hardcoded literals with variables). The following appears when trying to display the report:
Missing expression… 😦 how about what happened to the rest of the SQL statement? Painted into a corner, again. Now what, do we give up?
OK, no problem, just define a static view rather than wrapping our SQL statement into an inline view. So, in testing it would look like this (the WHERE clause will be removed when it is created as a static view since the WHERE clause will be provided by the ERP package’s extended query):
SELECTP.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.DEL_SCHED_LINE_NO,', '),3),', ',CHR(13)||CHR(10))) DEL_SCHED_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.RECEIVED_QTY,', '),3),', ',CHR(13)||CHR(10))) RECEIVED_QTY FROM (SELECT PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO, PLD.DEL_SCHED_LINE_NO, PLD.RECEIVED_QTY, ROW_NUMBER() OVER (PARTITION BY PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO ORDER BY PLD.DEL_SCHED_LINE_NO) RN FROM PURC_LINE_DEL PLD) P WHERE P.PURC_ORDER_ID='144038' AND P.PURC_ORDER_LINE_NO=1 CONNECT BY PRIOR P.RN=P.RN-1 START WITH P.RN=1 GROUP BY P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO;
The execution plan for the above looks like this:
----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 49018 | 2441K| | 273 (4)| 00:00:02 | | 1 | SORT GROUP BY | | 49018 | 2441K| | 273 (4)| 00:00:02 | |* 2 | FILTER | | | | | | | |* 3 | CONNECT BY WITH FILTERING| | | | | | | |* 4 | FILTER | | | | | | | | 5 | COUNT | | | | | | | | 6 | VIEW | | 49018 | 2441K| | 273 (4)| 00:00:02 | | 7 | WINDOW SORT | | 49018 | 909K| 3096K| 273 (4)| 00:00:02 | | 8 | TABLE ACCESS FULL | PURC_LINE_DEL | 49018 | 909K| | 24 (5)| 00:00:01 | |* 9 | HASH JOIN | | | | | | | | 10 | CONNECT BY PUMP | | | | | | | | 11 | COUNT | | | | | | | | 12 | VIEW | | 49018 | 2441K| | 273 (4)| 00:00:02 | | 13 | WINDOW SORT | | 49018 | 909K| 3096K| 273 (4)| 00:00:02 | | 14 | TABLE ACCESS FULL | PURC_LINE_DEL | 49018 | 909K| | 24 (5)| 00:00:01 | | 15 | COUNT | | | | | | | | 16 | VIEW | | 49018 | 2441K| | 273 (4)| 00:00:02 | | 17 | WINDOW SORT | | 49018 | 909K| 3096K| 273 (4)| 00:00:02 | | 18 | TABLE ACCESS FULL | PURC_LINE_DEL | 49018 | 909K| | 24 (5)| 00:00:01 | ----------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("P"."PURC_ORDER_ID"='144038' AND "P"."PURC_ORDER_LINE_NO"=1) 3 - access("P"."RN"-1=PRIOR "P"."RN") 4 - filter("P"."RN"=1) 9 - access("P"."RN"-1=PRIOR "P"."RN")
The predicted time for execution is 2 seconds, and if there are 10 order lines ~~~ 2*10 = 20 seconds for the report to display. The end user will complain, but let’s test just in case the predicted time is incorrect. Executing the query… Returning in an hour… Why is this SQL statement still running? Notice that the predicate information in the plan, unlike for the plan of the first SQL statement, does not include access(“PLD”.”PURC_ORDER_ID”=’144038′ AND “PLD”.”PURC_ORDER_LINE_NO”=1) on any of the lines – that predicate information was not pushed into the inline view, and providing a hint to push the predicates into the inline view does not change the execution plan. The end user definitely will complain if it takes 10+ hours to print one report with the previously received quantities. Painted into a corner, again. Now what, do we give up?
If only we could somehow push at least the PURC_ORDER_ID further into the view. Any ideas?
One possible solution is to create another table in the database, something like this:
CREATE TABLE CURRENT_PO_USER( USERNAME VARCHAR2(30) DEFAULT USER, PURC_ORDER_ID VARCHAR2(15), PRIMARY KEY(USERNAME)); GRANT ALL ON CURRENT_PO_USER TO PUBLIC;
We could then (somehow) do this before displaying the purchase order report:
DELETE FROM CURRENT_PO_USER WHERE USERNAME=USER; INSERT INTO CURRENT_PO_USER(PURC_ORDER_ID) VALUES ('144038'); COMMIT;
Our test query then looks like this:
SELECT P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.DEL_SCHED_LINE_NO,', '),3),', ',CHR(13)||CHR(10))) DEL_SCHED_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.RECEIVED_QTY,', '),3),', ',CHR(13)||CHR(10))) RECEIVED_QTY FROM (SELECT /*+ ORDERED */ PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO, PLD.DEL_SCHED_LINE_NO, PLD.RECEIVED_QTY, ROW_NUMBER() OVER (PARTITION BY PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO ORDER BY PLD.DEL_SCHED_LINE_NO) RN FROM current_po_user CPU, PURC_LINE_DEL PLD WHERE CPU.USERNAME=USER AND CPU.PURC_ORDER_ID=PLD.PURC_ORDER_ID) P WHERE P.PURC_ORDER_ID='144038' AND P.PURC_ORDER_LINE_NO=1 CONNECT BY PRIOR RN=RN-1 START WITH RN=1 GROUP BY P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO; PURC_ORDER_ID PURC_ORDER_LINE_NO DEL_SCHED_LINE_ RECEIVED_QTY --------------- ------------------ --------------- ------------ 144038 1 1 8 2 8 3 8 4 0 5 0 6 0 7 0 8 0
The query results are returned instantaneously, with an execution plan that looks like this:
------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 6 | 306 | 3 (0)| 00:00:01 | | 1 | SORT GROUP BY | | 6 | 306 | 3 (0)| 00:00:01 | |* 2 | FILTER | | | | | | |* 3 | CONNECT BY WITH FILTERING | | | | | | |* 4 | FILTER | | | | | | | 5 | COUNT | | | | | | | 6 | VIEW | | 6 | 306 | 3 (0)| 00:00:01 | | 7 | WINDOW BUFFER | | 6 | 270 | 3 (0)| 00:00:01 | | 8 | NESTED LOOPS | | 6 | 270 | 3 (0)| 00:00:01 | | 9 | TABLE ACCESS BY INDEX ROWID| CURRENT_PO_USER | 1 | 26 | 1 (0)| 00:00:01 | |* 10 | INDEX UNIQUE SCAN | SYS_C0022556 | 1 | | 1 (0)| 00:00:01 | | 11 | TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL | 6 | 114 | 2 (0)| 00:00:01 | |* 12 | INDEX RANGE SCAN | SYS_C005716 | 6 | | 1 (0)| 00:00:01 | |* 13 | HASH JOIN | | | | | | | 14 | CONNECT BY PUMP | | | | | | | 15 | COUNT | | | | | | | 16 | VIEW | | 6 | 306 | 3 (0)| 00:00:01 | | 17 | WINDOW BUFFER | | 6 | 270 | 3 (0)| 00:00:01 | | 18 | NESTED LOOPS | | 6 | 270 | 3 (0)| 00:00:01 | | 19 | TABLE ACCESS BY INDEX ROWID| CURRENT_PO_USER | 1 | 26 | 1 (0)| 00:00:01 | |* 20 | INDEX UNIQUE SCAN | SYS_C0022556 | 1 | | 1 (0)| 00:00:01 | | 21 | TABLE ACCESS BY INDEX ROWID| PURC_LINE_DEL | 6 | 114 | 2 (0)| 00:00:01 | |* 22 | INDEX RANGE SCAN | SYS_C005716 | 6 | | 1 (0)| 00:00:01 | | 23 | COUNT | | | | | | | 24 | VIEW | | 6 | 306 | 3 (0)| 00:00:01 | | 25 | WINDOW BUFFER | | 6 | 270 | 3 (0)| 00:00:01 | | 26 | NESTED LOOPS | | 6 | 270 | 3 (0)| 00:00:01 | | 27 | TABLE ACCESS BY INDEX ROWID | CURRENT_PO_USER | 1 | 26 | 1 (0)| 00:00:01 | |* 28 | INDEX UNIQUE SCAN | SYS_C0022556 | 1 | | 1 (0)| 00:00:01 | | 29 | TABLE ACCESS BY INDEX ROWID | PURC_LINE_DEL | 6 | 114 | 2 (0)| 00:00:01 | |* 30 | INDEX RANGE SCAN | SYS_C005716 | 6 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("P"."PURC_ORDER_ID"='144038' AND "P"."PURC_ORDER_LINE_NO"=1) 3 - access("RN"-1=PRIOR "RN") 4 - filter("RN"=1) 10 - access("CPU"."USERNAME"=USER@!) 12 - access("CPU"."PURC_ORDER_ID"="PLD"."PURC_ORDER_ID") 13 - access("RN"-1=PRIOR "RN") 20 - access("CPU"."USERNAME"=USER@!) 22 - access("CPU"."PURC_ORDER_ID"="PLD"."PURC_ORDER_ID") 28 - access("CPU"."USERNAME"=USER@!) 30 - access("CPU"."PURC_ORDER_ID"="PLD"."PURC_ORDER_ID") Note ----- - dynamic sampling used for this statement
The next step, of course, is to create a statically defined view for our SQL statement:
CREATE VIEW PURC_LINE_DEL_PO_REPORT AS SELECT P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.DEL_SCHED_LINE_NO,', '),3),', ',CHR(13)||CHR(10))) DEL_SCHED_LINE_NO, MAX(REPLACE(SUBSTR(SYS_CONNECT_BY_PATH(P.RECEIVED_QTY,', '),3),', ',CHR(13)||CHR(10))) RECEIVED_QTY FROM (SELECT /*+ ORDERED */ PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO, PLD.DEL_SCHED_LINE_NO, PLD.RECEIVED_QTY, ROW_NUMBER() OVER (PARTITION BY PLD.PURC_ORDER_ID, PLD.PURC_ORDER_LINE_NO ORDER BY PLD.DEL_SCHED_LINE_NO) RN FROM current_po_user CPU, PURC_LINE_DEL PLD WHERE CPU.USERNAME=USER AND CPU.PURC_ORDER_ID=PLD.PURC_ORDER_ID) P CONNECT BY PRIOR RN=RN-1 START WITH RN=1 GROUP BY P.PURC_ORDER_ID, P.PURC_ORDER_LINE_NO; GRANT SELECT ON PURC_LINE_DEL_PO_REPORT TO PUBLIC;
But how do we populate the CURRENT_PO_USER table? The ERP module supports the execution of VBScript macros when a purchase order is opened and when it is saved. So, we just create a VBScript macro that populates that table as needed. Painted into a corner, but fortunately there is a door in that corner of the room.
Maybe there is a better way that does not require the custom table and the custom VBScript macros?
that predicate information was not pushed into the inline view, and providing a hint to push the predicates into the inline view does not change the execution plan.
Is that because oracle thinks pushing the predicate will change the query semantically (and hence does not do it) or is it just an optimizer limitation (or bug)?
I believe that the reason for the predicate information not being pushed in is a result of the CONNECT BY syntax. It appears that when a CONNECT BY clause is used, the START WITH clause is considered a WHERE clause, according to a portion of a 10053 trace:
I probably should have originally written the query like this:
That probably would have helped performance a bit (roughly 13 seconds compared to a very, very long time).
I could have also tried an index hint:
Or, specifying the purchase order in the CONNECT BY clause to drop the time to about 2 seconds (not an option if this needs to be a view):
Or in the START WITH clause to execute in about 1 second (also not an option if this will be a view):
Incidentally, the 10053 trace file section for the last query, that is from the same section of the 10053 trace file as was shown above, follows:
In the above, the FPD lines show that the purchase order ID and line were pushed into SEL$9CAB97DB, and that push is also evident in the predicate information section of the execution plan.