Update Rows in Another Table with the Help of Analytic Functions

17 12 2009

December 17, 2009

In a recent comp.databases.oracle.misc Usenet thread:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/42c0d1550278250b

The following question was asked:

I need help with a query which involves the 2 tables defined below.  What I need to do is choose the record with the max “Eff Date” from “Table A” for a particular “Emp No.” and update the “Desc” from that record in the field “Desc” of “Table B” for the same “Emp No.”. I am able to choose the max “Eff Date” record for each employee from Table A but somehow not able to updated the same “Desc” in “Table B”.

Request you to please help the query. Any help would be appreciated.
Thanks!

Table A
Emp No. Group   Eff Date        Desc
1234    CI      01/01/1989      X
1234    CI      01/02/2000      X
1234    CI      01/02/2006      A
2345    AF      01/01/1990      X
2345    AF      01/02/2005      A

 

Table B
Emp No. Group   Desc
1234    CI      X
2345    AF      A
3456    CI      A

I provided the following suggestion:

Watch the query and results closely as one possible solution is built (there are other methods):

CREATE TABLE T1 (
  EMP_NO NUMBER,
  GROUPING VARCHAR2(5),
  EFF_DATE DATE,
  DESCR VARCHAR2(5));

CREATE TABLE T2 (
  EMP_NO NUMBER,
  GROUPING VARCHAR2(5),
  DESCR VARCHAR2(5));

INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/01/1989','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2000','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2006','MM/DD/YYYY'),'A');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/01/1990','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/02/2005','MM/DD/YYYY'),'A');

INSERT INTO T2 VALUES (1234,'CI','XNN');
INSERT INTO T2 VALUES (2345,'AF','ANN');
INSERT INTO T2 VALUES (3456,'CI','ANN');

COMMIT;

 

SELECT
  EMP_NO,
  GROUPING,
  DESCR
FROM
  T2;

EMP_NO GROUP DESCR
------ ----- -----
  1234 CI    XNN
  2345 AF    ANN
  3456 CI    ANN

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  DESCR
FROM
  T1;

EMP_NO GROUP EFF_DATE  DESCR
------ ----- --------- -----
  1234 CI    01-JAN-89 X
  1234 CI    02-JAN-00 X
  1234 CI    02-JAN-06 A
  2345 AF    01-JAN-90 X
  2345 AF    02-JAN-05 A

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE
DESC) RN,
  DESCR
FROM
  T1;

EMP_NO GROUP EFF_DATE          RN DESCR
------ ----- --------- ---------- -----
  1234 CI    02-JAN-06          1 A
  1234 CI    02-JAN-00          2 X
  1234 CI    01-JAN-89          3 X
  2345 AF    02-JAN-05          1 A
  2345 AF    01-JAN-90          2 X

 

SELECT
  EMP_NO,
  GROUPING,
  EFF_DATE,
  DESCR
FROM
  (SELECT
    EMP_NO,
    GROUPING,
    EFF_DATE,
    ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
    DESCR
  FROM
    T1)
WHERE
  RN=1;

EMP_NO GROUP EFF_DATE  DESCR
------ ----- --------- -----
  1234 CI    02-JAN-06 A
  2345 AF    02-JAN-05 A

 

UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);

2 rows updated.

 

SELECT
  EMP_NO,
  GROUPING,
  DESCR
FROM
  T2;

EMP_NO GROUP DESCR
------ ----- -----
  1234 CI    A
  2345 AF    A
  3456 CI    ANN

 

Note that in the above, I assumed that the combination of EMP_NO and GROUPING had to be the same.

~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko provided a very different approach to solving the problem that is both compact and impressive:

SQL> merge into t2 t2
   2  using (
   3    select emp_no,grouping,
   4      max(descr) keep(dense_rank last order by eff_date) descr
   5      from t1 group by emp_no,grouping) t1
   6  on (t1.emp_no=t2.emp_no
   7  and t1.grouping=t2.grouping)
   8  when matched then update set t2.descr=t1.descr
   9  ;

2 rows merged.

 

Execution Plan
----------------------------------------------------------
Plan hash value: 3235844370

-------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | MERGE STATEMENT        |      |     2 |    16 |     8  (25)| 00:00:01 |
|   1 |  MERGE                 | T2   |       |       |            |          |
|   2 |   VIEW                 |      |       |       |            |          |
|*  3 |    HASH JOIN           |      |     2 |   108 |     8  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL  | T2   |     3 |    99 |     3   (0)| 00:00:01 |
|   5 |     VIEW               |      |     5 |   105 |     4  (25)| 00:00:01 |
|   6 |      SORT GROUP BY     |      |     5 |   150 |     4  (25)| 00:00:01 |
|   7 |       TABLE ACCESS FULL| T1   |     5 |   150 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------­---

Predicate Information (identified by operation id):
---------------------------------------------------
    3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND
               "T1"."GROUPING"="T2"."GROUPING")

~~~~~~~~~~~~~~~~~~~~~~~

Since Maxim provided an execution plan, let’s compare the efficiency of the two methods with a larger test case that uses the same table definitions:

TRUNCATE TABLE T1;
TRUNCATE TABLE T2;

INSERT INTO T1
SELECT
  DECODE(MOD(ROWNUM,10),
         0,0000,
         1,1111,
         2,2222,
         3,3333,
         4,4444,
         5,5555,
         6,6666,
         7,7777,
         8,8888,
         9,9999),
  DECODE(MOD(ROWNUM,6),
         0,'AA',
         1,'BB',
         2,'CC',
         3,'DD',
         4,'EE',
         5,'FF'),
  TRUNC(SYSDATE+SIN(ROWNUM/180*3.141592)*1000),
  UPPER(DBMS_RANDOM.STRING('A',1))
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

INSERT INTO T2
SELECT
  DECODE(MOD(ROWNUM,10),
         0,0000,
         1,1111,
         2,2222,
         3,3333,
         4,4444,
         5,5555,
         6,6,
         7,7,
         8,8,
         9,9),
  DECODE(MOD(ROWNUM,11),
         0,'AA',
         1,'BB',
         2,'CC',
         3,'DD',
         4,'EE',
         5,'FF',
         6,'GG',
         7,'HH',
         8,'II',
         9,'JJ',
         10,'KK'),
  UPPER(DBMS_RANDOM.STRING('A',3))
FROM
  DUAL
CONNECT BY
  LEVEL<=100;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)

 

SET PAGESIZE 1000
SET LINESIZE 150
ALTER SESSION SET STATISTICS_LEVEL='ALL';
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SPOOL C:\CHECKTIMING.TXT

UPDATE
  T2
SET
  DESCR=(
    SELECT
      DESCR
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1) T1
    WHERE
      RN=1
      AND T1.EMP_NO=T2.EMP_NO
      AND T1.GROUPING=T2.GROUPING)
WHERE
  (T2.EMP_NO,T2.GROUPING) IN (
    SELECT
      EMP_NO,
      GROUPING
    FROM
      (SELECT
        EMP_NO,
        GROUPING,
        ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,
        DESCR
      FROM
        T1)
    WHERE
      RN=1);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SELECT
  *
FROM
  T2
WHERE
  LENGTH(DESCR)=1
ORDER BY
  EMP_NO;

ROLLBACK;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

merge into t2 t2
 using (
   select emp_no,grouping,
     max(descr) keep(dense_rank last order by eff_date) descr
     from t1 group by emp_no,grouping) t1
 on (t1.emp_no=t2.emp_no
 and t1.grouping=t2.grouping)
 when matched then update set t2.descr=t1.descr
 ;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

SELECT
  *
FROM
  T2
WHERE
  LENGTH(DESCR)=1
ORDER BY
  EMP_NO;

ROLLBACK;
SPOOL OFF

What is the output of the above?

SQL_ID  8w16pv37zxuh5, child number 0
-------------------------------------
UPDATE   T2 SET   DESCR=(     SELECT       DESCR     FROM       (SELECT
        EMP_NO,         GROUPING,         ROW_NUMBER() OVER (PARTITION
BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,         DESCR
FROM         T1) T1     WHERE       RN=1       AND T1.EMP_NO=T2.EMP_NO
     AND T1.GROUPING=T2.GROUPING) WHERE   (T2.EMP_NO,T2.GROUPING) IN (
   SELECT       EMP_NO,       GROUPING     FROM       (SELECT
EMP_NO,         GROUPING,         ROW_NUMBER() OVER (PARTITION BY
EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN,         DESCR       FROM
      T1)     WHERE       RN=1) 

Plan hash value: 2277482977                                                                                                                          

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT            |          |      1 |        |      0 |00:00:06.12 |   59000 |   3282 |      5 |       |       |          |
|   1 |  UPDATE                     | T2       |      1 |        |      0 |00:00:06.12 |   59000 |   3282 |      5 |       |       |          |
|*  2 |   HASH JOIN SEMI            |          |      1 |      1 |     17 |00:00:03.06 |    3289 |   3280 |      5 |   816K|   816K| 1167K (0)|
|   3 |    TABLE ACCESS FULL        | T2       |      1 |    100 |    100 |00:00:00.03 |       7 |      6 |      0 |       |       |          |
|   4 |    VIEW                     | VW_NSO_1 |      1 |   1000K|     30 |00:00:03.03 |    3282 |   3274 |      5 |       |       |          |
|*  5 |     VIEW                    |          |      1 |   1000K|     30 |00:00:03.03 |    3282 |   3274 |      5 |       |       |          |
|*  6 |      WINDOW SORT PUSHED RANK|          |      1 |   1000K|     67 |00:00:03.03 |    3282 |   3274 |      5 | 36864 | 36864 |   25M (1)|
|   7 |       TABLE ACCESS FULL     | T1       |      1 |   1000K|   1000K|00:00:00.01 |    3275 |   3269 |      0 |       |       |          |
|*  8 |   VIEW                      |          |     17 |  16667 |     17 |00:00:03.06 |   55675 |      0 |      0 |       |       |          |
|*  9 |    WINDOW SORT PUSHED RANK  |          |     17 |  16667 |    566K|00:00:02.53 |   55675 |      0 |      0 |  1541K|   615K| 1369K (0)|
|* 10 |     TABLE ACCESS FULL       | T1       |     17 |  16667 |    566K|00:00:01.13 |   55675 |      0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):                                                                                                  
---------------------------------------------------                                                                                                  
   2 - access("T2"."EMP_NO"="EMP_NO" AND "T2"."GROUPING"="GROUPING")
   5 - filter("RN"=1)
   6 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
   8 - filter("RN"=1)
   9 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
  10 - filter(("EMP_NO"=:B1 AND "GROUPING"=:B2))

    EMP_NO GROUP DESCR
---------- ----- -----
         0 CC    W   
         0 EE    F   
      1111 BB    F   
      1111 DD    U   
      1111 FF    Y   
      2222 AA    W   
      2222 CC    T   
      2222 EE    K   
      3333 BB    Z   
      3333 FF    I   
      3333 DD    W   
      4444 AA    Z   
      4444 EE    G   
      4444 CC    S   
      5555 DD    L   
      5555 BB    Y   
      5555 FF    X   

 

SQL_ID  93cj2ck69n4kg, child number 0
-------------------------------------
merge into t2 t2  using (    select emp_no,grouping,      max(descr)
keep(dense_rank last order by eff_date) descr      from t1 group by
emp_no,grouping) t1  on (t1.emp_no=t2.emp_no  and
t1.grouping=t2.grouping)  when matched then update set t2.descr=t1.descr

Plan hash value: 4231777338                                                                                                                          

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT        |      |      1 |        |      1 |00:00:01.53 |    3301 |   3277 |       |       |          |
|   1 |  MERGE                 | T2   |      1 |        |      1 |00:00:01.53 |    3301 |   3277 |       |       |          |
|   2 |   VIEW                 |      |      1 |        |     17 |00:00:01.50 |    3282 |   3275 |       |       |          |
|*  3 |    HASH JOIN           |      |      1 |     43 |     17 |00:00:01.50 |    3282 |   3275 |   921K|   921K| 1181K (0)|
|   4 |     VIEW               |      |      1 |     43 |     30 |00:00:01.47 |    3275 |   3269 |       |       |          |
|   5 |      SORT GROUP BY     |      |      1 |     43 |     30 |00:00:01.47 |    3275 |   3269 | 73728 | 73728 |          |
|   6 |       TABLE ACCESS FULL| T1   |      1 |   1000K|   1000K|00:00:00.01 |    3275 |   3269 |       |       |          |
|   7 |     TABLE ACCESS FULL  | T2   |      1 |    100 |    100 |00:00:00.03 |       7 |      6 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):                                                                                                  
---------------------------------------------------                                                                                                  
   3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND "T1"."GROUPING"="T2"."GROUPING") 

    EMP_NO GROUP DESCR
---------- ----- -----
         0 CC    Z   
         0 EE    Z   
      1111 BB    Z   
      1111 DD    X   
      1111 FF    Z   
      2222 AA    Z   
      2222 CC    Z   
      2222 EE    Z   
      3333 BB    Z   
      3333 FF    Z   
      3333 DD    Z   
      4444 AA    Z   
      4444 EE    Z   
      4444 CC    Z   
      5555 DD    Z   
      5555 BB    Z   
      5555 FF    Z   

From the point of view of performance, Maxim’s solution is a clear winner.  It is interesting to note that the value of the DESCR column in table T2 differs for the two approaches.