SELECT Statement is Fast, INSERT INTO Using the SELECT Statement is Brutally Slow 1

15 12 2010

December 15, 2010 

(Forward to the Next Post in the Series)

I am impressed with the responses that were received from the most recent set of blog articles.  I recently encountered an interesting problem with an Oracle database that had the OPTIMIZER_FEATURES_ENABLE parameter set to 10.1.0.4 and the OPTIMIZER_MODE set to ALL_ROWS.  A somewhat complex SQL statement was written like the following, using LEADING hints to help control the join order (to work around a potential problem related to join order when inline views are used):

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      EDI830 E
    GROUP BY
      E.CUSTPO) E,
    EDI830_LINES EL
  WHERE 
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E,
  (SELECT
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    CUSTOMER_ORDER CO,
    CUST_ORDER_LINE COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE; 

The above query executes very quickly (a second or less), and the execution plan for the above query looks like this:

Execution Plan
----------------------------------------------------------
Plan hash value: 3426513411

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |   205 | 17425 |   157  (21)| 00:00:01 |
|   1 |  SORT GROUP BY              |                 |   205 | 17425 |   157  (21)| 00:00:01 |
|*  2 |   FILTER                    |                 |       |       |            |          |
|*  3 |    HASH JOIN OUTER          |                 |   384 | 32640 |   156  (20)| 00:00:01 |
|   4 |     NESTED LOOPS            |                 |   194 |  9506 |    47  (35)| 00:00:01 |
|   5 |      VIEW                   |                 |  1993 | 35874 |    46  (35)| 00:00:01 |
|   6 |       SORT GROUP BY         |                 |  1993 | 39860 |    46  (35)| 00:00:01 |
|*  7 |        FILTER               |                 |       |       |            |          |
|   8 |         INDEX FAST FULL SCAN| EDI830_PK       |   107K|  2092K|    33  (10)| 00:00:01 |
|*  9 |      INDEX RANGE SCAN       | EDI830_LINES_PK |     1 |    31 |     1   (0)| 00:00:01 |
|  10 |     VIEW                    |                 |  6254 |   219K|   108  (13)| 00:00:01 |
|  11 |      SORT GROUP BY          |                 |  6254 |   250K|   108  (13)| 00:00:01 |
|* 12 |       HASH JOIN             |                 |  8213 |   328K|   106  (12)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL    | CUSTOMER_ORDER  |  7554 |   125K|    14   (8)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL    | CUST_ORDER_LINE |  8582 |   201K|    92  (12)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("EL"."FCDUEDATE">COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365) AND
              COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365)<SYSDATE@!+1200)
   3 - access("E"."CUSTPO"="CO"."CUSTOMER_PO_REF"(+))
   7 - filter(SYSDATE@!-365<=SYSDATE@!+1200)
   9 - access("E"."CUSTPO"="EL"."CUSTPO" AND "E"."RELID"="EL"."RELID" AND
              "EL"."FCDUEDATE">=SYSDATE@!-365 AND "EL"."FCSTQTY">0 AND
              "EL"."FCDUEDATE"<=SYSDATE@!+1200)
       filter("EL"."FCSTQTY">0)
  12 - access("CO"."ID"="COL"."CUST_ORDER_ID")
  13 - filter("CUSTOMER_PO_REF" IS NOT NULL)
  14 - filter("COL"."DESIRED_SHIP_DATE">=SYSDATE@!-365) 

Now, assume that we have a very simple table that currently contains no rows (the table was TRUNCATEd) and no indexes, and we would like to insert the rows returned by the above SQL statement into the table:

INSERT INTO
  T1
SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
...
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE; 

The INSERT INTO SQL statement required 20 to 30 minutes to complete.  Using SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,’TYPICAL’)); the following execution plan is displayed:

--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT             |                 |       |       |       |   279K(100)|          |
|   1 |  SORT GROUP BY               |                 |   205 | 13735 |       |   279K  (9)| 00:22:11 |
|*  2 |   FILTER                     |                 |       |       |       |            |          |
|   3 |    MERGE JOIN OUTER          |                 | 81363 |  5323K|       |   279K  (9)| 00:22:11 |
|   4 |     VIEW                     |                 | 70658 |  2139K|       |   279K  (9)| 00:22:10 |
|*  5 |      FILTER                  |                 |       |       |       |            |          |
|   6 |       SORT GROUP BY          |                 | 70658 |  3519K|  7169M|   279K  (9)| 00:22:10 |
|*  7 |        FILTER                |                 |       |       |       |            |          |
|*  8 |         HASH JOIN            |                 |   109M|  5343M|  3352K|   1154 (30)| 00:00:06 |
|   9 |          INDEX FAST FULL SCAN| EDI830_PK       |   107K|  2092K|       |     33 (10)| 00:00:01 |
|* 10 |          TABLE ACCESS FULL   | EDI830_LINES    |   242K|  7334K|       |    941 (35)| 00:00:05 |
|* 11 |     SORT JOIN                |                 |  6254 |   219K|       |    110 (15)| 00:00:01 |
|  12 |      VIEW                    |                 |  6254 |   219K|       |    108 (13)| 00:00:01 |
|  13 |       SORT GROUP BY          |                 |  6254 |   250K|       |    108 (13)| 00:00:01 |
|* 14 |        HASH JOIN             |                 |  8213 |   328K|       |    106 (12)| 00:00:01 |
|* 15 |         TABLE ACCESS FULL    | CUSTOMER_ORDER  |  7554 |   125K|       |     14  (8)| 00:00:01 |
|* 16 |         TABLE ACCESS FULL    | CUST_ORDER_LINE |  8582 |   201K|       |     92 (12)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
   2 - filter("E"."FCDUEDATE">COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365))
   5 - filter("EL"."RELID"=MAX("RELID"))
   7 - filter(SYSDATE@!-365<=SYSDATE@!+1200)
   8 - access("E"."CUSTPO"="EL"."CUSTPO")
  10 - filter(("EL"."FCDUEDATE">=SYSDATE@!-365 AND "EL"."FCSTQTY">0 AND
       "EL"."FCDUEDATE"<=SYSDATE@!+1200))
  11 - access("E"."CUSTPO"="CO"."CUSTOMER_PO_REF")
       filter("E"."CUSTPO"="CO"."CUSTOMER_PO_REF")
  14 - access("CO"."ID"="COL"."CUST_ORDER_ID")
  15 - filter("CUSTOMER_PO_REF" IS NOT NULL)
  16 - filter("COL"."DESIRED_SHIP_DATE">=SYSDATE@!-365) 

What happened, how would you investigate the problem, and how would you correct the problem?  Think about that for a couple of minutes before scrolling down.

Assume that I was able to obtain an execution plan for the INSERT INTO SQL statement that looked like this:

-------------------------------------------------------+-----------------------------------+
| Id  | Operation                     | Name           | Rows  | Bytes | Cost  | Time      |
-------------------------------------------------------+-----------------------------------+
| 0   | INSERT STATEMENT              |                |       |       |   157 |           |
| 1   |  SORT GROUP BY                |                |   194 |   13K |   157 |  00:00:01 |
| 2   |   FILTER                      |                |       |       |       |           |
| 3   |    HASH JOIN OUTER            |                |   223 |   15K |   156 |  00:00:01 |
| 4   |     VIEW                      |                |   194 |  6014 |    47 |  00:00:01 |
| 5   |      NESTED LOOPS             |                |   194 |  9118 |    47 |  00:00:01 |
| 6   |       VIEW                    |                |  1993 |   31K |    46 |  00:00:01 |
| 7   |        SORT GROUP BY          |                |  1993 |   39K |    46 |  00:00:01 |
| 8   |         FILTER                |                |       |       |       |           |
| 9   |          INDEX FAST FULL SCAN | EDI830_PK      |  105K | 2092K |    33 |  00:00:01 |
| 10  |       INDEX RANGE SCAN        | EDI830_LINES_PK|     1 |    31 |     1 |  00:00:01 |
| 11  |     VIEW                      |                |  6254 |  220K |   108 |  00:00:01 |
| 12  |      SORT GROUP BY            |                |  6254 |  250K |   108 |  00:00:01 |
| 13  |       HASH JOIN               |                |  8213 |  329K |   106 |  00:00:01 |
| 14  |        TABLE ACCESS FULL      | CUSTOMER_ORDER |  7554 |  125K |    14 |  00:00:01 |
| 15  |        TABLE ACCESS FULL      | CUST_ORDER_LINE|  8582 |  201K |    92 |  00:00:01 |
-------------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - filter("E"."FCDUEDATE">COALESCE("CO"."DESIRED_SHIP_DATE",SYSDATE@!-365))
3 - access("E"."CUSTPO"="CO"."CUSTOMER_PO_REF")
8 - filter(SYSDATE@!-365<=SYSDATE@!+1200)
10 - access("E"."CUSTPO"="EL"."CUSTPO" AND "E"."RELID"="EL"."RELID" AND "EL"."FCDUEDATE">=SYSDATE@!-365 AND "EL"."FCSTQTY">0 AND "EL"."FCDUEDATE"<=SYSDATE@!+1200)
10 - filter("EL"."FCSTQTY">0)
13 - access("CO"."ID"="COL"."CUST_ORDER_ID")
14 - filter("CUSTOMER_PO_REF" IS NOT NULL)
15 - filter("COL"."DESIRED_SHIP_DATE">=SYSDATE@!-365) 

Using the above execution plan, the INSERT INTO SQL statement completed in about a second (or less).  What did I do to fix the problem?

———

Here is what the sections of the query are intended to accomplish:

On a daily basis the header rows for custom order forecasts are inserted into the EDI830 table, and each order may be revised multiple times (possibly once a day or once a week), each time bumping the RELID value up by 1 or 2 (RELID is a zero padded number stored in a VARCHAR2 column).  This part of the query retrieves the most recent release number for each of the customer POs so that we are able to return the line item detail for the most recent customer order forecast release:

    (SELECT
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      EDI830 E
    GROUP BY
      E.CUSTPO) E 

Join the above inline view to the line item detail, with restrictions so that we do not look at releases from more than a year ago.  The LEADING hint is used to make certain that the optimizer first returns the rows from the EDI830 table before accessing the EDI830_LINES table (there is a chance that the optimizer will access the EDI830_LINES table first before returning the rows from the inline view):

  (SELECT /*+ LEADING(E) */
    E.CUSTPO,
    EL.FCSTQTY,
    EL.FCDUEDATE
  FROM
    (SELECT
      E.CUSTPO,
      MAX(RELID) RELID
    FROM
      EDI830 E
    GROUP BY
      E.CUSTPO) E,
    EDI830_LINES EL
  WHERE 
    E.CUSTPO=EL.CUSTPO
    AND E.RELID=EL.RELID
    AND EL.FCDUEDATE BETWEEN SYSDATE-365 AND SYSDATE+1200
    AND EL.FCSTQTY>0) E 

The goal of the following subquery is to return the ship date of the last order line that made it into the order entry system for each of the customer POs.  This subquery will allow us to determine which of the rows in the most recent EDI release are beyond the last firmed customer order line in the system (typically, only the firmed EDI release lines are brought into the CUST_ORDER_LINE table):

  (SELECT
    CO.CUSTOMER_PO_REF,
    MAX(COL.DESIRED_SHIP_DATE) DESIRED_SHIP_DATE,
    MAX(COL.PART_ID) PART_ID
  FROM
    CUSTOMER_ORDER CO,
    CUST_ORDER_LINE COL
  WHERE
    CO.ID=COL.CUST_ORDER_ID
    AND CUSTOMER_PO_REF IS NOT NULL
    AND COL.DESIRED_SHIP_DATE>=SYSDATE-365
  GROUP BY
    CO.CUSTOMER_PO_REF) CO

The outer-most portion of the SQL statement joins the two inline views together, using a LEADING hint to make certain that the EDI release rows are retrieved before accessing the CUSTOMER_ORDER and CUST_ORDER_LINE tables (even with the outer join, it is possible that those tables might be accessed first):

SELECT /*+ LEADING(E) */
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM') PART_ID,
  E.FCDUEDATE REQUIRED_DATE,
  SUM(E.FCSTQTY) QTY,
  1 PLAN_LEVEL
FROM
...
  GROUP BY
    CO.CUSTOMER_PO_REF) CO
WHERE
  E.CUSTPO=CO.CUSTOMER_PO_REF(+)
  AND E.FCDUEDATE>COALESCE(CO.DESIRED_SHIP_DATE,SYSDATE-365)
GROUP BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE
ORDER BY
  COALESCE(CO.PART_ID,'NOT-IN-SYSTEM'),
  E.FCDUEDATE;  

What happened?  What process would you use to investigate the performance problem?  How would you resolve this performance problem.

If you tell me to buy a bigger server, I will simply tell you to read a different book.  :-)








Follow

Get every new post delivered to your Inbox.

Join 137 other followers