Sequence Driven Primary Keys – Which is Better: Call NextVal in the Insert Statement or in a Row Level Trigger?

25 03 2011

March 25, 2011 (Updated March 26, 2011)

Occasionally, I see interesting questions being asked about Oracle Database, and it is nice when the person asking the question provides a test case – or at the very least the DDL and DML statements needed to recreate an environment that matches the question.  The question that was asked in this instance is whether or not it is true that calling an Oracle sequence’s NEXTVAL property directly in an INSERT statement is faster/more efficient than relying on a row level trigger to set the primary key value from the Oracle sequence’s NEXTVAL property.  Just as important, a secondary question of “why” was asked.

A slightly modified version of the test script follows:

CREATE TABLE ST ( C1 NUMBER PRIMARY KEY, C2 VARCHAR2(20) );
CREATE SEQUENCE S1 START WITH 1 CACHE 1000;

CREATE OR REPLACE TRIGGER TR1
    BEFORE INSERT
    ON ST
    REFERENCING NEW AS NEW
    FOR EACH ROW
    BEGIN
    SELECT S1.NEXTVAL INTO :NEW.C1 FROM DUAL;
    END;
/   

SET TIMING ON

BEGIN
INSERT INTO ST(C2)
SELECT
     RPAD('A',11,'A')
  FROM
    DUAL
  CONNECT BY
    LEVEL <= 100000;
END;
/   

COMMIT;   

CREATE TABLE NVT ( C1 NUMBER PRIMARY KEY, C2 VARCHAR2(20));
CREATE SEQUENCE S2;

SET TIMING ON

BEGIN
INSERT INTO NVT(C1,C2)
SELECT S2.NEXTVAL,
     RPAD('A',11,'A')
  FROM
    DUAL
  CONNECT BY
    LEVEL <= 100000;
END;
/   

COMMIT;

DROP TABLE ST PURGE;
DROP TABLE NVT PURGE;
DROP SEQUENCE S1;
DROP SEQUENCE S2; 

The above appears to be a good starting point to determine if there is a difference in performance, and by roughly how much the elapsed time will be affected.  So, what were the timing results of the above script?

Row Level Trigger:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO ST(C2)
  3  SELECT
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 100000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.24  

Directly Calling NEXTVAL in the INSERT Statement:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO NVT(C1,C2)
  3  SELECT S2.NEXTVAL,
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 100000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.82  

Not too bad, 5.17 times faster in this particular test database (only 3.17 times faster in another ~ 7.79 seconds compared to 2.46 seconds) when directly accessing the sequence’s NEXTVAL property in the INSERT statement.  Is this a fair test case?  Take another look at the two sequence definitions.  Notice that the first sequence that is used by the trigger has a CACHE size of 1,000 while the second sequence has the default cache size of 20.  Does that really matter?  Let’s test to find out.  The replacement creation statement for the S2 sequence will be changed like this:

CREATE SEQUENCE S2 START WITH 1 CACHE 1000; 

Row Level Trigger:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO ST(C2)
  3  SELECT
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 100000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.32 

Directly Calling NEXTVAL in the INSERT Statement:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO NVT(C1,C2)
  3  SELECT S2.NEXTVAL,
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 100000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.40 

That helped to level the playing field a bit, now the direct method is 10.8 times faster than the row level trigger (only 7.29 times faster in another database instance ~ 7.73 seconds compared to 1.06 seconds).

There might still be a couple of questions about the test case – is it a fair test?  What if we increase the number of rows inserted by a factor of 10 (append another 0 on the number specified in the CONNECT BY LEVEL clause) and use the original script with the (originally specified) different CACHE values for the sequences?  Let’s take a look at the results:

Row Level Trigger:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO ST(C2)
  3  SELECT
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 1000000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:42.63 

Directly Calling NEXTVAL in the INSERT Statement:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO NVT(C1,C2)
  3  SELECT S2.NEXTVAL,
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 1000000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.08 

This change to the test case seemed to scale nearly linearly, 5.28 times faster in this particular test database (only 3.29 times faster in another database instance ~ 81.23 seconds compared to 24.70 seconds).   Let’s fix the sequence defiition for the second sequence and try the test again:

Row Level Trigger:

SQL> BEGIN
  2  INSERT INTO ST(C2)
  3  SELECT
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 1000000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:42.24 

Directly Calling NEXTVAL in the INSERT Statement:

SQL> SET TIMING ON
SQL>
SQL> BEGIN
  2  INSERT INTO NVT(C1,C2)
  3  SELECT S2.NEXTVAL,
  4       RPAD('A',11,'A')
  5    FROM
  6      DUAL
  7    CONNECT BY
  8      LEVEL <= 1000000;
  9  END;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.93 

Once again, having a suitable CACHE value for the sequence helped.  Now the direct method of specifying the primary key value is 10.75 times faster than the row level trigger (only 7.86 times faster in another database instance ~ 80.64 seconds compared to 10.26 seconds).

Are there still minor issues with the test case?  Possibly – how often are rows normally inserted into production database tables using the CONNECT BY LEVEL syntax?  Maybe expanding the test case to use an INSERT INTOSELECT type syntax would help further differentiate the two approaches.  Maybe expanding the test case to use 100,000 single row INSERT statements would be a reasonable reflection of a real-world test?  Maybe expanding the test case to include other operating system platforms would show a more significant difference in performance (the above test cases were performed on 64 bit Windows – context switching is typically less severe on Windows)?  Maybe expanding the test case to include other Oracle Database release versions would show different performance results (the above test cases were performed on Oracle Database 11.2.0.2 and 10.2.0.2)?  What about the monitoring the differences in the session’s CPU (or other resource) consumption?  What about monitoring the differences when multiple sessions concurrently perform inserts?

I do not have the complete answer why the method using the sequence’s NEXTVAL property in the INSERT statement is 10 times faster.  Context switching due to the row level trigger is certainly part of the cause, and I believe that is part of the reason for the longer execution time with the row level trigger.  What else could impact the execution time?  Think about some of the extra bits of low-level program instruction code that Oracle Database must execute to set up the PL/SQL environment so that the values that are to be inserted are copied into the NEW object.  Then the execution of the trigger code needs to open a cursor (with an initial hard parse, and the rest of the uses likely coming from the session’s cached cursors) to return the value of S1.NEXTVAL into the NEW object, add the cursor to the session’s cached cursors, and finally transfer the NEW object values into rows in the table blocks.  Think about what extra low-level program instruction code we simply avoid by just telling Oracle Database to fetch the S2.NEXTVAL value from the server’s memory.

Anyone else have a better explanation for the performance difference between the two methods of populating the primary key values?  Are there any good reasons why someone should use the row level trigger method to populate the primary key values?

———

Added March 26, 2011 in response to Narendra’s test case that is attached to this article.

I thought that it might be interesting to look at how changing from the CONNECT BY LEVEL syntax to single row inserts would affect the elapsed time, CPU utilization, redo generation, and number of current mode block gets.  As you will see, there is not a straight-forward answer – the effects are different for different Oracle release versions.  Here is the script that will be used:

SET ECHO ON

CREATE GLOBAL TEMPORARY TABLE TEMP_STAT ON COMMIT PRESERVE ROWS AS
SELECT
  0 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

CREATE TABLE T1 ( C1 NUMBER PRIMARY KEY, C2 VARCHAR2(20) );
CREATE SEQUENCE T1_SEQUENCE START WITH 1 CACHE 1000;

CREATE OR REPLACE TRIGGER T1_TRIGGER
    BEFORE INSERT
    ON T1
    REFERENCING NEW AS NEW
    FOR EACH ROW
    BEGIN
    SELECT T1_SEQUENCE.NEXTVAL INTO :NEW.C1 FROM DUAL;
    END;
/  

CREATE TABLE T2 ( C1 NUMBER PRIMARY KEY, C2 VARCHAR2(20));
CREATE SEQUENCE T2_SEQUENCE START WITH 1 CACHE 1000;

SET TIMING ON

INSERT INTO TEMP_STAT
SELECT
  1 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

DECLARE
  V VARCHAR2(11);
BEGIN
  V := RPAD('A',11,'A');
  FOR I IN 1 .. 1000000 LOOP
    INSERT INTO T1(C2) VALUES (V);
  END LOOP;
END;
/  

INSERT INTO TEMP_STAT
SELECT
  2 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

TRUNCATE TABLE T1;

INSERT INTO TEMP_STAT
SELECT
  3 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

BEGIN
INSERT INTO T1(C2)
  SELECT
     RPAD('A',11,'A')
  FROM
    DUAL
  CONNECT BY
    LEVEL <= 1000000;
END;
/

INSERT INTO TEMP_STAT
SELECT
  4 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

COMMIT;  

SET TIMING ON

INSERT INTO TEMP_STAT
SELECT
  5 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

DECLARE
  V VARCHAR2(11);
BEGIN
  V := RPAD('A',11,'A');
  FOR I IN 1 .. 1000000 LOOP
    INSERT INTO T2(C1,C2) VALUES (T2_SEQUENCE.NEXTVAL,V);
  END LOOP;
END;
/  

INSERT INTO TEMP_STAT
SELECT
  6 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

TRUNCATE TABLE T2;

INSERT INTO TEMP_STAT
SELECT
  7 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

BEGIN
INSERT INTO T2(C1,C2)
SELECT T2_SEQUENCE.NEXTVAL,
     RPAD('A',11,'A')
  FROM
    DUAL
  CONNECT BY
    LEVEL <= 1000000;
END;
/

INSERT INTO TEMP_STAT
SELECT
  8 CNT,
  SN.NAME,
  MS.VALUE
FROM
  V$STATNAME SN,
  V$MYSTAT MS
WHERE
  SN.STATISTIC#=MS.STATISTIC#
  AND SN.NAME IN ('CPU used by this session','consistent gets','db block gets','parse count (hard)','parse count (total)','recursive calls','redo size','DB time');

COMMIT;

DROP TABLE T1 PURGE;
DROP TABLE T2 PURGE;
DROP SEQUENCE T1_SEQUENCE;
DROP SEQUENCE T2_SEQUENCE;

COLUMN NAME FORMAT A28
COLUMN DELTA FORMAT 999,999,990
SET PAGESIZE 1000

SELECT
  TS2.CNT,
  TS2.NAME,
  TS2.VALUE-TS1.VALUE DELTA
FROM
  TEMP_STAT TS1,
  TEMP_STAT TS2
WHERE
  TS2.CNT=TS1.CNT+1
  AND TS2.NAME=TS1.NAME
ORDER BY
  TS2.CNT,
  TS2.NAME;

TRUNCATE TABLE TEMP_STAT; 

As you can see from the script, the script will periodically capture several statistics to a global temporary table, and at the end it will output the delta values of the statistics.  From the delta values, we will be interested in the statistics from CNT 2 (SINGLE ROW WITH TRIGGER), CNT 4 (CONNECT BY WITH TRIGGER), CNT 6 (SINGLE ROW WITHOUT TRIGGER), and CNT 8 (CONNECT BY WITHOUT TRIGGER).

I executed the script on Oracle Database 10.2.0.5 (64 bit Windows), 11.1.0.7 (64 bit Windows), 11.2.0.1 (64 bit Windows), 11.1.0.6 (64 bit Linux), and 11.2.0.1 (64 bit Linux) using a systems with identical hardware for both operating system platforms.  I also executed the script on 11.2.0.2 (64 bit Windows – different server #2), and 10.2.0.2 (64 bit Windows – different server #3).  The results were interesting – take a close look at the results for 11.1.0.6 running on 64 bit Linux:








Follow

Get every new post delivered to your Inbox.

Join 141 other followers