Update Rows in Another Table with the Help of Analytic Functions

17 12 2009

December 17, 2009

In a recent comp.databases.oracle.misc Usenet thread:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/42c0d1550278250b

The following question was asked:

I need help with a query which involves the 2 tables defined below.  What I need to do is choose the record with the max “Eff Date” from “Table A” for a particular “Emp No.” and update the “Desc” from that record in the field “Desc” of “Table B” for the same “Emp No.”. I am able to choose the max “Eff Date” record for each employee from Table A but somehow not able to updated the same “Desc” in “Table B”.

Request you to please help the query. Any help would be appreciated.
Thanks!

Table A
Emp No. Group   Eff Date        Desc
1234    CI      01/01/1989      X
1234    CI      01/02/2000      X
1234    CI      01/02/2006      A
2345    AF      01/01/1990      X
2345    AF      01/02/2005      A

 

Table B
Emp No. Group   Desc
1234    CI      X
2345    AF      A
3456    CI      A

I provided the following suggestion:

Watch the query and results closely as one possible solution is built (there are other methods):

CREATE TABLE T1 (
  EMP_NO NUMBER,
  GROUPING VARCHAR2(5),
  EFF_DATE DATE,
  DESCR VARCHAR2(5));

CREATE TABLE T2 (
  EMP_NO NUMBER,
  GROUPING VARCHAR2(5),
  DESCR VARCHAR2(5));

INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/01/1989','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2000','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2006','MM/DD/YYYY'),'A');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/01/1990','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/02/2005','MM/DD/YYYY'),'A');

INSERT INTO T2 VALUES (1234,'CI','XNN');
INSERT INTO T2 VALUES (2345,'AF','ANN');
INSERT INTO T2 VALUES (3456,'CI','ANN');

COMMIT;

 

SELECT
  EMP_NO,
  GROUPING,
  DESCR
FROM
  T2;

EMP_NO GROUP DESCR
------ ----- -----
  1234 CI    XNN
  2345 AF    ANN
  3456 CI    ANN

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  DESCR
FROM
  T1;

EMP_NO GROUP EFF_DATE  DESCR
------ ----- --------- -----
  1234 CI    01-JAN-89 X
  1234 CI    02-JAN-00 X
  1234 CI    02-JAN-06 A
  2345 AF    01-JAN-90 X
  2345 AF    02-JAN-05 A

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE
DESC) RN,
  DESCR
FROM
  T1;

EMP_NO GROUP EFF_DATE          RN DESCR
------ ----- --------- ---------- -----
  1234 CI    02-JAN-06          1 A
  1234 CI    02-JAN-00          2 X
  1234 CI    01-JAN-89          3 X
  2345 AF    02-JAN-05          1 A
  2345 AF    01-JAN-90          2 X

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  DESCR
FROM
  (SELECT
    EMP_NO,
    GROUPING,
    EFF_DATE,
    ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
    DESCR
  FROM
    T1)
WHERE
  RN=1;

EMP_NO GROUP EFF_DATE  DESCR
------ ----- --------- -----
  1234 CI    02-JAN-06 A
  2345 AF    02-JAN-05 A

 

UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);

2 rows updated.

 

SELECT
  EMP_NO,
  GROUPING,
  DESCR
FROM
  T2;

EMP_NO GROUP DESCR
------ ----- -----
  1234 CI    A
  2345 AF    A
  3456 CI    ANN

 

Note that in the above, I assumed that the combination of EMP_NO and GROUPING had to be the same.

~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko provided a very different approach to solving the problem that is both compact and impressive:

SQL> merge into t2 t2
   2  using (
   3    select emp_no,grouping,
   4      max(descr) keep(dense_rank last order by eff_date) descr
   5      from t1 group by emp_no,grouping) t1
   6  on (t1.emp_no=t2.emp_no
   7  and t1.grouping=t2.grouping)
   8  when matched then update set t2.descr=t1.descr
   9  ;

2 rows merged.

 

Execution Plan
----------------------------------------------------------
Plan hash value: 3235844370

-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | MERGE STATEMENT        |      |     2 |    16 |     8  (25)| 00:00:01 |
|   1 |  MERGE                 | T2   |       |       |            |          |
|   2 |   VIEW                 |      |       |       |            |          |
|*  3 |    HASH JOIN           |      |     2 |   108 |     8  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL  | T2   |     3 |    99 |     3   (0)| 00:00:01 |
|   5 |     VIEW               |      |     5 |   105 |     4  (25)| 00:00:01 |
|   6 |      SORT GROUP BY     |      |     5 |   150 |     4  (25)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| T1   |     5 |   150 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------­---

Predicate Information (identified by operation id):
---------------------------------------------------
    3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND
               "T1"."GROUPING"="T2"."GROUPING")

~~~~~~~~~~~~~~~~~~~~~~~

Since Maxim provided an execution plan, let’s compare the efficiency of the two methods with a larger test case that uses the same table definitions:

TRUNCATE TABLE T1;
TRUNCATE TABLE T2;

INSERT INTO T1
SELECT
  DECODE(MOD(ROWNUM,10),
         0,0000,
         1,1111,
         2,2222,
         3,3333,
         4,4444,
         5,5555,
         6,6666,
         7,7777,
         8,8888,
         9,9999),
  DECODE(MOD(ROWNUM,6),
         0,'AA',
         1,'BB',
         2,'CC',
         3,'DD',
         4,'EE',
         5,'FF'),
  TRUNC(SYSDATE+SIN(ROWNUM/180*3.141592)*1000),
  UPPER(DBMS_RANDOM.STRING('A',1))
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

INSERT INTO T2
SELECT
  DECODE(MOD(ROWNUM,10),
         0,0000,
         1,1111,
         2,2222,
         3,3333,
         4,4444,
         5,5555,
         6,6,
         7,7,
         8,8,
         9,9),
  DECODE(MOD(ROWNUM,11),
         0,'AA',
         1,'BB',
         2,'CC',
         3,'DD',
         4,'EE',
         5,'FF',
         6,'GG',
         7,'HH',
         8,'II',
         9,'JJ',
         10,'KK'),
  UPPER(DBMS_RANDOM.STRING('A',3))
FROM
  DUAL
CONNECT BY
  LEVEL<=100;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)

 

SET PAGESIZE 1000
SET LINESIZE 150
ALTER SESSION SET STATISTICS_LEVEL='ALL';
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SPOOL C:\CHECKTIMING.TXT

UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SELECT
  *
FROM
  T2
WHERE
  LENGTH(DESCR)=1
ORDER BY
  EMP_NO;

ROLLBACK;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

merge into t2 t2
 using (
   select emp_no,grouping,
     max(descr) keep(dense_rank last order by eff_date) descr
     from t1 group by emp_no,grouping) t1
 on (t1.emp_no=t2.emp_no
 and t1.grouping=t2.grouping)
 when matched then update set t2.descr=t1.descr
 ;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SELECT
  *
FROM
  T2
WHERE
  LENGTH(DESCR)=1
ORDER BY
  EMP_NO;

ROLLBACK;
SPOOL OFF

What is the output of the above?

SQL_ID  8w16pv37zxuh5, child number 0
-------------------------------------
UPDATE   T2 SET   DESCR=(     SELECT       DESCR     FROM       (SELECT
        EMP_NO,         GROUPING,         ROW_NUMBER() OVER (PARTITION
BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,         DESCR
FROM         T1) T1     WHERE       RN=1       AND T1.EMP_NO=T2.EMP_NO
     AND T1.GROUPING=T2.GROUPING) WHERE   (T2.EMP_NO,T2.GROUPING) IN (
   SELECT       EMP_NO,       GROUPING     FROM       (SELECT
EMP_NO,         GROUPING,         ROW_NUMBER() OVER (PARTITION BY
EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,         DESCR       FROM
      T1)     WHERE       RN=1) 

Plan hash value: 2277482977                                                                                                                          

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT            |          |      1 |        |      0 |00:00:06.12 |   59000 |   3282 |      5 |       |       |          |
|   1 |  UPDATE                     | T2       |      1 |        |      0 |00:00:06.12 |   59000 |   3282 |      5 |       |       |          |
|*  2 |   HASH JOIN SEMI            |          |      1 |      1 |     17 |00:00:03.06 |    3289 |   3280 |      5 |   816K|   816K| 1167K (0)|
|   3 |    TABLE ACCESS FULL        | T2       |      1 |    100 |    100 |00:00:00.03 |       7 |      6 |      0 |       |       |          |
|   4 |    VIEW                     | VW_NSO_1 |      1 |   1000K|     30 |00:00:03.03 |    3282 |   3274 |      5 |       |       |          |
|*  5 |     VIEW                    |          |      1 |   1000K|     30 |00:00:03.03 |    3282 |   3274 |      5 |       |       |          |
|*  6 |      WINDOW SORT PUSHED RANK|          |      1 |   1000K|     67 |00:00:03.03 |    3282 |   3274 |      5 | 36864 | 36864 |   25M (1)|
|   7 |       TABLE ACCESS FULL     | T1       |      1 |   1000K|   1000K|00:00:00.01 |    3275 |   3269 |      0 |       |       |          |
|*  8 |   VIEW                      |          |     17 |  16667 |     17 |00:00:03.06 |   55675 |      0 |      0 |       |       |          |
|*  9 |    WINDOW SORT PUSHED RANK  |          |     17 |  16667 |    566K|00:00:02.53 |   55675 |      0 |      0 |  1541K|   615K| 1369K (0)|
|* 10 |     TABLE ACCESS FULL       | T1       |     17 |  16667 |    566K|00:00:01.13 |   55675 |      0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):                                                                                                  
---------------------------------------------------                                                                                                  
   2 - access("T2"."EMP_NO"="EMP_NO" AND "T2"."GROUPING"="GROUPING")
   5 - filter("RN"=1)
   6 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
   8 - filter("RN"=1)
   9 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
  10 - filter(("EMP_NO"=:B1 AND "GROUPING"=:B2))

    EMP_NO GROUP DESCR
---------- ----- -----
         0 CC    W   
         0 EE    F   
      1111 BB    F   
      1111 DD    U   
      1111 FF    Y   
      2222 AA    W   
      2222 CC    T   
      2222 EE    K   
      3333 BB    Z   
      3333 FF    I   
      3333 DD    W   
      4444 AA    Z   
      4444 EE    G   
      4444 CC    S   
      5555 DD    L   
      5555 BB    Y   
      5555 FF    X   

 

SQL_ID  93cj2ck69n4kg, child number 0
-------------------------------------
merge into t2 t2  using (    select emp_no,grouping,      max(descr)
keep(dense_rank last order by eff_date) descr      from t1 group by
emp_no,grouping) t1  on (t1.emp_no=t2.emp_no  and
t1.grouping=t2.grouping)  when matched then update set t2.descr=t1.descr

Plan hash value: 4231777338                                                                                                                          

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT        |      |      1 |        |      1 |00:00:01.53 |    3301 |   3277 |       |       |          |
|   1 |  MERGE                 | T2   |      1 |        |      1 |00:00:01.53 |    3301 |   3277 |       |       |          |
|   2 |   VIEW                 |      |      1 |        |     17 |00:00:01.50 |    3282 |   3275 |       |       |          |
|*  3 |    HASH JOIN           |      |      1 |     43 |     17 |00:00:01.50 |    3282 |   3275 |   921K|   921K| 1181K (0)|
|   4 |     VIEW               |      |      1 |     43 |     30 |00:00:01.47 |    3275 |   3269 |       |       |          |
|   5 |      SORT GROUP BY     |      |      1 |     43 |     30 |00:00:01.47 |    3275 |   3269 | 73728 | 73728 |          |
|   6 |       TABLE ACCESS FULL| T1   |      1 |   1000K|   1000K|00:00:00.01 |    3275 |   3269 |       |       |          |
|   7 |     TABLE ACCESS FULL  | T2   |      1 |    100 |    100 |00:00:00.03 |       7 |      6 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):                                                                                                  
---------------------------------------------------                                                                                                  
   3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND "T1"."GROUPING"="T2"."GROUPING") 

    EMP_NO GROUP DESCR
---------- ----- -----
         0 CC    Z   
         0 EE    Z   
      1111 BB    Z   
      1111 DD    X   
      1111 FF    Z   
      2222 AA    Z   
      2222 CC    Z   
      2222 EE    Z   
      3333 BB    Z   
      3333 FF    Z   
      3333 DD    Z   
      4444 AA    Z   
      4444 EE    Z   
      4444 CC    Z   
      5555 DD    Z   
      5555 BB    Z   
      5555 FF    Z   

From the point of view of performance, Maxim’s solution is a clear winner.  It is interesting to note that the value of the DESCR column in table T2 differs for the two approaches.


Actions

Information

2 responses

17 12 2009
Maxim Demenko

Charles, excellent news – you’ve started to blog – somehow, i expected (hoped) it since a long time ;-)
Thank you for that detailed testcase – i think, the performance difference can be explained by the fact, that merge statement has less to read ( t1 is accessed one time less) and less to sort ( window sort can be costly on big datasets) as compared to update statement. Regarding different results – i can’t test it at the moment, but most likely, the sort order can’t uniquely identify rows in your test data (that means, there should be many rows with the same emp_no , grouping and eff_date and different descr values). If that is true, then both, row_number and first/last can pick up any descr from the set of rows, in this case the sort key should be extended to identifiy rows uniquely. Btw, the approach to use merge instead of update in similar situations i seen first time somewhere on AskTom.

Best regards

Maxim

17 12 2009
Charles Hooper

Maxim,
A couple people from the OTN forums convinced me to set up this blog – primarily just to reference some of the test cases and demonstrations that have been set up over the years. If I remember correctly, your posts on comp.databases.oracle.* are what finally pushed me to start learning about analytic functions a couple years ago. I look forward to seeing your future contributions in the comp.databases.oracle.* groups.

(More of Maxim’s solutions may be found in the “SQL – Methods of Reformatting into Equivalent Forms” blog series.)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 142 other followers

%d bloggers like this: