December 17, 2009
In a recent comp.databases.oracle.misc Usenet thread:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/42c0d1550278250b
The following question was asked:
I need help with a query which involves the 2 tables defined below. What I need to do is choose the record with the max “Eff Date” from “Table A” for a particular “Emp No.” and update the “Desc” from that record in the field “Desc” of “Table B” for the same “Emp No.”. I am able to choose the max “Eff Date” record for each employee from Table A but somehow not able to updated the same “Desc” in “Table B”.
Request you to please help the query. Any help would be appreciated.
Thanks!Table A Emp No. Group Eff Date Desc 1234 CI 01/01/1989 X 1234 CI 01/02/2000 X 1234 CI 01/02/2006 A 2345 AF 01/01/1990 X 2345 AF 01/02/2005 A
Table B Emp No. Group Desc 1234 CI X 2345 AF A 3456 CI A
I provided the following suggestion:
Watch the query and results closely as one possible solution is built (there are other methods):
CREATE TABLE T1 (
EMP_NO NUMBER,
GROUPING VARCHAR2(5),
EFF_DATE DATE,
DESCR VARCHAR2(5));
CREATE TABLE T2 (
EMP_NO NUMBER,
GROUPING VARCHAR2(5),
DESCR VARCHAR2(5));
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/01/1989','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2000','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (1234,'CI',TO_DATE('01/02/2006','MM/DD/YYYY'),'A');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/01/1990','MM/DD/YYYY'),'X');
INSERT INTO T1 VALUES (2345,'AF',TO_DATE('01/02/2005','MM/DD/YYYY'),'A');
INSERT INTO T2 VALUES (1234,'CI','XNN');
INSERT INTO T2 VALUES (2345,'AF','ANN');
INSERT INTO T2 VALUES (3456,'CI','ANN');
COMMIT;
SELECT EMP_NO, GROUPING, DESCR FROM T2; EMP_NO GROUP DESCR ------ ----- ----- 1234 CI XNN 2345 AF ANN 3456 CI ANN
SELECT EMP_NO, GROUPING, EFF_DATE, DESCR FROM T1; EMP_NO GROUP EFF_DATE DESCR ------ ----- --------- ----- 1234 CI 01-JAN-89 X 1234 CI 02-JAN-00 X 1234 CI 02-JAN-06 A 2345 AF 01-JAN-90 X 2345 AF 02-JAN-05 A
SELECT EMP_NO, GROUPING, EFF_DATE, ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM T1; EMP_NO GROUP EFF_DATE RN DESCR ------ ----- --------- ---------- ----- 1234 CI 02-JAN-06 1 A 1234 CI 02-JAN-00 2 X 1234 CI 01-JAN-89 3 X 2345 AF 02-JAN-05 1 A 2345 AF 01-JAN-90 2 X
SELECT EMP_NO, GROUPING, EFF_DATE, DESCR FROM (SELECT EMP_NO, GROUPING, EFF_DATE, ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM T1) WHERE RN=1; EMP_NO GROUP EFF_DATE DESCR ------ ----- --------- ----- 1234 CI 02-JAN-06 A 2345 AF 02-JAN-05 A
UPDATE T2 SET DESCR=( SELECT DESCR FROM (SELECT EMP_NO, GROUPING, ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM T1) T1 WHERE RN=1 AND T1.EMP_NO=T2.EMP_NO AND T1.GROUPING=T2.GROUPING) WHERE (T2.EMP_NO,T2.GROUPING) IN ( SELECT EMP_NO, GROUPING FROM (SELECT EMP_NO, GROUPING, ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM T1) WHERE RN=1); 2 rows updated.
SELECT EMP_NO, GROUPING, DESCR FROM T2; EMP_NO GROUP DESCR ------ ----- ----- 1234 CI A 2345 AF A 3456 CI ANN
Note that in the above, I assumed that the combination of EMP_NO and GROUPING had to be the same.
~~~~~~~~~~~~~~~~~~~~~~~
Maxim Demenko provided a very different approach to solving the problem that is both compact and impressive:
SQL> merge into t2 t2 2 using ( 3 select emp_no,grouping, 4 max(descr) keep(dense_rank last order by eff_date) descr 5 from t1 group by emp_no,grouping) t1 6 on (t1.emp_no=t2.emp_no 7 and t1.grouping=t2.grouping) 8 when matched then update set t2.descr=t1.descr 9 ; 2 rows merged.
Execution Plan
----------------------------------------------------------
Plan hash value: 3235844370
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 2 | 16 | 8 (25)| 00:00:01 |
| 1 | MERGE | T2 | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN | | 2 | 108 | 8 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T2 | 3 | 99 | 3 (0)| 00:00:01 |
| 5 | VIEW | | 5 | 105 | 4 (25)| 00:00:01 |
| 6 | SORT GROUP BY | | 5 | 150 | 4 (25)| 00:00:01 |
| 7 | TABLE ACCESS FULL| T1 | 5 | 150 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND
"T1"."GROUPING"="T2"."GROUPING")
~~~~~~~~~~~~~~~~~~~~~~~
Since Maxim provided an execution plan, let’s compare the efficiency of the two methods with a larger test case that uses the same table definitions:
TRUNCATE TABLE T1;
TRUNCATE TABLE T2;
INSERT INTO T1
SELECT
DECODE(MOD(ROWNUM,10),
0,0000,
1,1111,
2,2222,
3,3333,
4,4444,
5,5555,
6,6666,
7,7777,
8,8888,
9,9999),
DECODE(MOD(ROWNUM,6),
0,'AA',
1,'BB',
2,'CC',
3,'DD',
4,'EE',
5,'FF'),
TRUNC(SYSDATE+SIN(ROWNUM/180*3.141592)*1000),
UPPER(DBMS_RANDOM.STRING('A',1))
FROM
DUAL
CONNECT BY
LEVEL<=1000000;
INSERT INTO T2
SELECT
DECODE(MOD(ROWNUM,10),
0,0000,
1,1111,
2,2222,
3,3333,
4,4444,
5,5555,
6,6,
7,7,
8,8,
9,9),
DECODE(MOD(ROWNUM,11),
0,'AA',
1,'BB',
2,'CC',
3,'DD',
4,'EE',
5,'FF',
6,'GG',
7,'HH',
8,'II',
9,'JJ',
10,'KK'),
UPPER(DBMS_RANDOM.STRING('A',3))
FROM
DUAL
CONNECT BY
LEVEL<=100;
COMMIT;
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)
SET PAGESIZE 1000 SET LINESIZE 150 ALTER SESSION SET STATISTICS_LEVEL='ALL'; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; SPOOL C:\CHECKTIMING.TXT UPDATE T2 SET DESCR=( SELECT DESCR FROM (SELECT EMP_NO, GROUPING, ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM T1) T1 WHERE RN=1 AND T1.EMP_NO=T2.EMP_NO AND T1.GROUPING=T2.GROUPING) WHERE (T2.EMP_NO,T2.GROUPING) IN ( SELECT EMP_NO, GROUPING FROM (SELECT EMP_NO, GROUPING, ROW_NUMBER() OVER (PARTITION BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM T1) WHERE RN=1); SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); SELECT * FROM T2 WHERE LENGTH(DESCR)=1 ORDER BY EMP_NO; ROLLBACK; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; merge into t2 t2 using ( select emp_no,grouping, max(descr) keep(dense_rank last order by eff_date) descr from t1 group by emp_no,grouping) t1 on (t1.emp_no=t2.emp_no and t1.grouping=t2.grouping) when matched then update set t2.descr=t1.descr ; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST')); SELECT * FROM T2 WHERE LENGTH(DESCR)=1 ORDER BY EMP_NO; ROLLBACK; SPOOL OFF
What is the output of the above?
SQL_ID 8w16pv37zxuh5, child number 0
-------------------------------------
UPDATE T2 SET DESCR=( SELECT DESCR FROM (SELECT
EMP_NO, GROUPING, ROW_NUMBER() OVER (PARTITION
BY EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR
FROM T1) T1 WHERE RN=1 AND T1.EMP_NO=T2.EMP_NO
AND T1.GROUPING=T2.GROUPING) WHERE (T2.EMP_NO,T2.GROUPING) IN (
SELECT EMP_NO, GROUPING FROM (SELECT
EMP_NO, GROUPING, ROW_NUMBER() OVER (PARTITION BY
EMP_NO, GROUPING ORDER BY EFF_DATE DESC) RN, DESCR FROM
T1) WHERE RN=1)
Plan hash value: 2277482977
-----------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | | 0 |00:00:06.12 | 59000 | 3282 | 5 | | | |
| 1 | UPDATE | T2 | 1 | | 0 |00:00:06.12 | 59000 | 3282 | 5 | | | |
|* 2 | HASH JOIN SEMI | | 1 | 1 | 17 |00:00:03.06 | 3289 | 3280 | 5 | 816K| 816K| 1167K (0)|
| 3 | TABLE ACCESS FULL | T2 | 1 | 100 | 100 |00:00:00.03 | 7 | 6 | 0 | | | |
| 4 | VIEW | VW_NSO_1 | 1 | 1000K| 30 |00:00:03.03 | 3282 | 3274 | 5 | | | |
|* 5 | VIEW | | 1 | 1000K| 30 |00:00:03.03 | 3282 | 3274 | 5 | | | |
|* 6 | WINDOW SORT PUSHED RANK| | 1 | 1000K| 67 |00:00:03.03 | 3282 | 3274 | 5 | 36864 | 36864 | 25M (1)|
| 7 | TABLE ACCESS FULL | T1 | 1 | 1000K| 1000K|00:00:00.01 | 3275 | 3269 | 0 | | | |
|* 8 | VIEW | | 17 | 16667 | 17 |00:00:03.06 | 55675 | 0 | 0 | | | |
|* 9 | WINDOW SORT PUSHED RANK | | 17 | 16667 | 566K|00:00:02.53 | 55675 | 0 | 0 | 1541K| 615K| 1369K (0)|
|* 10 | TABLE ACCESS FULL | T1 | 17 | 16667 | 566K|00:00:01.13 | 55675 | 0 | 0 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."EMP_NO"="EMP_NO" AND "T2"."GROUPING"="GROUPING")
5 - filter("RN"=1)
6 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
8 - filter("RN"=1)
9 - filter(ROW_NUMBER() OVER ( PARTITION BY "EMP_NO","GROUPING" ORDER BY INTERNAL_FUNCTION("EFF_DATE") DESC )<=1)
10 - filter(("EMP_NO"=:B1 AND "GROUPING"=:B2))
EMP_NO GROUP DESCR
---------- ----- -----
0 CC W
0 EE F
1111 BB F
1111 DD U
1111 FF Y
2222 AA W
2222 CC T
2222 EE K
3333 BB Z
3333 FF I
3333 DD W
4444 AA Z
4444 EE G
4444 CC S
5555 DD L
5555 BB Y
5555 FF X
SQL_ID 93cj2ck69n4kg, child number 0
-------------------------------------
merge into t2 t2 using ( select emp_no,grouping, max(descr)
keep(dense_rank last order by eff_date) descr from t1 group by
emp_no,grouping) t1 on (t1.emp_no=t2.emp_no and
t1.grouping=t2.grouping) when matched then update set t2.descr=t1.descr
Plan hash value: 4231777338
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 1 | | 1 |00:00:01.53 | 3301 | 3277 | | | |
| 1 | MERGE | T2 | 1 | | 1 |00:00:01.53 | 3301 | 3277 | | | |
| 2 | VIEW | | 1 | | 17 |00:00:01.50 | 3282 | 3275 | | | |
|* 3 | HASH JOIN | | 1 | 43 | 17 |00:00:01.50 | 3282 | 3275 | 921K| 921K| 1181K (0)|
| 4 | VIEW | | 1 | 43 | 30 |00:00:01.47 | 3275 | 3269 | | | |
| 5 | SORT GROUP BY | | 1 | 43 | 30 |00:00:01.47 | 3275 | 3269 | 73728 | 73728 | |
| 6 | TABLE ACCESS FULL| T1 | 1 | 1000K| 1000K|00:00:00.01 | 3275 | 3269 | | | |
| 7 | TABLE ACCESS FULL | T2 | 1 | 100 | 100 |00:00:00.03 | 7 | 6 | | | |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T1"."EMP_NO"="T2"."EMP_NO" AND "T1"."GROUPING"="T2"."GROUPING")
EMP_NO GROUP DESCR
---------- ----- -----
0 CC Z
0 EE Z
1111 BB Z
1111 DD X
1111 FF Z
2222 AA Z
2222 CC Z
2222 EE Z
3333 BB Z
3333 FF Z
3333 DD Z
4444 AA Z
4444 EE Z
4444 CC Z
5555 DD Z
5555 BB Z
5555 FF Z
From the point of view of performance, Maxim’s solution is a clear winner. It is interesting to note that the value of the DESCR column in table T2 differs for the two approaches.

Charles, excellent news – you’ve started to blog – somehow, i expected (hoped) it since a long time
Thank you for that detailed testcase – i think, the performance difference can be explained by the fact, that merge statement has less to read ( t1 is accessed one time less) and less to sort ( window sort can be costly on big datasets) as compared to update statement. Regarding different results – i can’t test it at the moment, but most likely, the sort order can’t uniquely identify rows in your test data (that means, there should be many rows with the same emp_no , grouping and eff_date and different descr values). If that is true, then both, row_number and first/last can pick up any descr from the set of rows, in this case the sort key should be extended to identifiy rows uniquely. Btw, the approach to use merge instead of update in similar situations i seen first time somewhere on AskTom.
Best regards
Maxim
Maxim,
A couple people from the OTN forums convinced me to set up this blog – primarily just to reference some of the test cases and demonstrations that have been set up over the years. If I remember correctly, your posts on comp.databases.oracle.* are what finally pushed me to start learning about analytic functions a couple years ago. I look forward to seeing your future contributions in the comp.databases.oracle.* groups.
(More of Maxim’s solutions may be found in the “SQL – Methods of Reformatting into Equivalent Forms” blog series.)