Oracle Enqueue Rules Leading to Deadlock Changes Again in 12.1?

1 08 2013

August 1, 2013

Four years ago while co-writing two chapters for an Oracle Database book, an interesting test case was put together to demonstrate how enqueues are handled when multiple sessions are competing for some of the same database resources – the test case was intended to demonstrate the neat, orderly, and predictable process implemented by Oracle Database.  Oracle Database 10.2.0.1 through 10.2.0.4 behaved essentially the same when executing the test case script, while Oracle Database 11.1.0.6, 11.1.0.7, and 11.2.0.1 behaved differently when executing the test case script (11.2.0.2 and 11.2.0.3 behave just like 11.2.0.1 when executing the script).

Oracle Database 11.1.0.x and 11.2.0.x terminated the test case early with a deadlock error.  Jonathan Lewis offered an exceptional explanation of what happened with my test case script in his Lock Horror article.  He also mentioned a couple of interesting notes in comments attached to my article, one of which stated that Oracle Database 9.2.0.5 (or 9.2.0.4) had also introduced a change in enqueue rules related to foreign keys.  Richard Foote put together a related article that explained the changes in locking mode with foreign keys in his Oracle11g: New Locking Modes When Policing FK Constraints article.  Trying not to forget related articles (see the note at the end of this blog article), Tom Kyte also authored a blog article titled Something I recently unlearned… that is somewhat related to the test case script that appeared in the Expert Oracle Practices book.  So, the rules have changed, deadlocks should now be expected when executing the test case script.

I just started experimenting with Oracle Database 12.1.0.1.  My very first experiment with the new database version was to test the behavior of the enqueue script to verify that it results in a deadlock on Oracle Database 12.1.0.1.  Consistency is one of the key factors in a database application – if an application executes:

SELECT 1 + 1 FROM DUAL;

a developer should be reasonably certain that Oracle Database will either return the value 2 or the binary number 10 when the above SQL statement is executed, regardless of whether Oracle Database 6.x or Oracle Database 12.x is used by an application written by the developer.  The same developer should have a reasonable expectation that his application, if it is compatible with Oracle Database 9.0.1, should work pretty much the same way on Oracle Database 12.1.0.1.  But, what happens when Oracle Corp. fixes bugs in Oracle Database?

Consistency is important, but it is also important to recognize that behavior may change consistently.

Let’s reintroduce the test case script from my earlier article (which also appeared in the Expert Oracle Practices book).  Executing this script requires four sessions, each of which is logged in as a different user (to minimize the difficulty associated with deciphering the results of the SQL statements):

  • Session 1 connected as TESTUSER (the owner of the tables)
  • Session 2 connected as USER2
  • Session 3 connected as USER3
  • Session 4 connected as SYS (only to query the various performance views)

First, we need a couple of test tables with a declared foreign key relationship, without an index on the foreign key column.  In Session 1:

CREATE TABLE T1(
  C1 NUMBER(10) PRIMARY KEY);

INSERT INTO
  T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

CREATE TABLE T2(
  C1 NUMBER(10) PRIMARY KEY,
  C2 NUMBER(10),
  CONSTRAINT FK_T1_C1 FOREIGN KEY(C2) REFERENCES T1(C1) ENABLE);

INSERT INTO
  T2
SELECT
  ROWNUM,
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

GRANT ALL ON T1 TO PUBLIC;
GRANT ALL ON T2 TO PUBLIC;

CREATE PUBLIC SYNONYM T1 FOR T1;
CREATE PUBLIC SYNONYM T2 FOR T2;

In Session 1, insert a row into the parent and child tables without issuing a commit:

INSERT INTO T1 VALUES(100010);
INSERT INTO T2 VALUES(100010,100010);

In Session 2, insert a row into the parent and child tables, delete a row in the child table so that the primary key column value of the associated row in the parent table may be modified, and modify the parent row’s primary key column value:

INSERT INTO T1 VALUES(100020);
INSERT INTO T2 VALUES(100020,100020);
DELETE FROM T2 WHERE C1=50;
UPDATE T1 SET C1=100030 WHERE C1=50;

(Session 2 is hung)

Session 3 cannot see that a row was inserted by the now hung session 2, so session 3 attempts to insert a row with the same primary key value.  In Session 3:

INSERT INTO T1 VALUES(100020);

(Session 3 is hung)

The enqueues at this point on Oracle Database 10.2.0.4 differs from those found on Oracle Database 11.1.0.6, and (for consistency) those enqueues differ from what are found on Oracle Database 12.1.0.1.

In Session 4, let’s take a look at the enqueues (note that the join to the V$SESSION_WAIT performance view is unnecessary starting with Oracle Database 10.1.0.x, however I left that join in place to allow easy adaptation of the SQL statement so that it may work with older Oracle Database versions).

SET LINESIZE 165
SET PAGESIZE 1000
SET TRIMSPOOL ON
COLUMN SID FORMAT 9999
COLUMN PROGRAM FORMAT A11
COLUMN USERNAME FORMAT A8
COLUMN LMODE FORMAT 99
COLUMN REQUEST FORMAT 99
COLUMN CN FORMAT 99
COLUMN EVENT FORMAT A30
COLUMN STATE FORMAT A8
COLUMN S_I_W FORMAT 9999
COLUMN WT FORMAT 9999
COLUMN OBJ# FORMAT 999999
COLUMN FILE# FORMAT 999
COLUMN BLOCK# FORMAT 9999999
COLUMN P2 FORMAT 9999999
COLUMN P3 FORMAT 999999

SELECT
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME WT,
  SW.STATE,
  SW.SECONDS_IN_WAIT S_I_W,
  S.SQL_ID,
  S.SQL_CHILD_NUMBER CN,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#,
  SW.P1,
  SW.P2,
  SW.P3
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

  SID USERNAME PROGRAM     STATUS   EVENT                             WT STATE    S_I_W SQL_ID         CN    OBJ# FILE#   BLOCK# ROW#         P1       P2      P3
----- -------- ----------- -------- ------------------------------ ----- -------- ----- ------------- --- ------- ----- -------- ---- ---------- -------- -------
  184 USER2    sqlplus.exe ACTIVE   enq: TM - contention               0 WAITING     89 4rtg0hv0atfkx   0      -1     0        0    0 1414332421    91943       0
  243 USER3    sqlplus.exe ACTIVE   enq: TX - row lock contention      0 WAITING     49 cv338j6z2530g   0      -1     0        0    0 1415053316    65548    1531

For comparison, here is the output of the same SQL statement on 11.1.0.7 from the earlier blog article:

__SID USERNAME PROGRAM     STATUS   EVENT                             WT STATE    S_I_W SQL_ID         CN    OBJ# FILE#   BLOCK# ROW#         P1       P2      P3
----- -------- ----------- -------- ------------------------------ ----- -------- ----- ------------- --- ------- ----- -------- ---- ---------- -------- -------
  307 USER2    sqlplus.exe ACTIVE   enq: TM - contention               0 WAITING    422 4rtg0hv0atfkx   0      -1     0        0    0 1414332421    82913       0
  314 USER3    sqlplus.exe ACTIVE   enq: TM - contention               0 WAITING    407 cv338j6z2530g   0      -1     0        0    0 1414332419    82913       0

For comparison, here is the output of the same SQL statement on 10.2.0.4 from the earlier blog article:

__SID USERNAME PROGRAM     STATUS   EVENT                             WT STATE    S_I_W SQL_ID         CN    OBJ# FILE#   BLOCK# ROW#         P1       P2      P3
----- -------- ----------- -------- ------------------------------ ----- -------- ----- ------------- --- ------- ----- -------- ---- ---------- -------- -------
  204 USER2    sqlplus.exe ACTIVE enq: TM - contention                 0 WAITING    213 4rtg0hv0atfkx   0      -1     0        0    0 1414332421    16472       0
  217 USER3    sqlplus.exe ACTIVE enq: TM - contention                 0 WAITING    201 cv338j6z2530g   0      -1     0        0    0 1414332418    16472       0

… USER3 on 12.1.0.1 seems to not be waiting on a table level (TM) enqueue.

In Session 4, let’s try the second SQL statement to examine the lock modes:

SELECT
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.SQL_ID,
  S.SQL_CHILD_NUMBER CN,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  L.LMODE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
    (SELECT
      ID1,
      ID2,
      TYPE
    FROM
      V$LOCK
    WHERE
      REQUEST > 0)
  AND L.SID=S.SID;

  SID USERNAME PROGRAM     SQL_ID         CN ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LMODE REQUEST        ID1        ID2 TY    BLOCK
----- -------- ----------- ------------- --- ------------- -------------- --------------- ------------- ----- ------- ---------- ---------- -- --------
  127 TESTUSER sqlplus.exe                              -1              0               0             0     3       0      91943          0 TM        1
  184 USER2    sqlplus.exe 4rtg0hv0atfkx   0            -1              0               0             0     3       5      91943          0 TM        1
  184 USER2    sqlplus.exe 4rtg0hv0atfkx   0            -1              0               0             0     6       0      65548       1531 TX        1
  243 USER3    sqlplus.exe cv338j6z2530g   0            -1              0               0             0     2       0      91943          0 TM        0
  243 USER3    sqlplus.exe cv338j6z2530g   0            -1              0               0             0     0       4      65548       1531 TX        0

For comparison, here is the output of the same SQL statement on 11.1.0.7 from the earlier blog article:

__SID USERNAME PROGRAM     SQL_ID         CN ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LMODE REQUEST        ID1        ID2 TY    BLOCK
----- -------- ----------- ------------- --- ------------- -------------- --------------- ------------- ----- ------- ---------- ---------- -- --------
  320 TESTUSER sqlplus.exe 0vbusv12hnbk6   0         12517              1           29656             0     3       0      82913          0 TM        1
  307 USER2    sqlplus.exe 4rtg0hv0atfkx   0            -1              0               0             0     3       5      82913          0 TM        1
  314 USER3    sqlplus.exe cv338j6z2530g   0            -1              0               0             0     0       3      82913          0 TM        0

For comparison, here is the output of the same SQL statement on 10.2.0.4 from the earlier blog article:

__SID USERNAME PROGRAM     SQL_ID         CN ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LMODE REQUEST        ID1        ID2 TY    BLOCK
----- -------- ----------- ------------- --- ------------- -------------- --------------- ------------- ----- ------- ---------- ---------- -- --------
  213 TESTUSER sqlplus.exe 0vbusv12hnbk6   0             0              2             799             0     3       0      16472          0 TM        1
  204 USER2    sqlplus.exe 4rtg0hv0atfkx   0            -1              0               0             0     3       5      16472          0 TM        0
  217 USER3    sqlplus.exe cv338j6z2530g   0            -1              0               0             0     0       2      16472          0 TM        0

So, on Oracle Database 12.1.0.1, USER3 is not waiting on a table level (TM) equeue (it is holding a level 2 TM enqueue, while on 10.2.0.4 that session was trying to acquire a level 2 TM lock, and on 11.1.0.7 that session was trying to acquire a level 3 TM lock), it is instead waiting on a row level (TX – transaction) enqueue that is held in exclusive mode (level 6) by USER2.  Quite honestly, that result is similar to what I was hoping to see four years ago when the test script was put together, because USER3 really is waiting due to a potential primary key collision, if session 2 issues a commit.

Session 2 is hung, so it cannot issue a commit.  So, in Session 1:

COMMIT;

Session 2 is no longer hung, and now shows:

1 row updated.

Session 3, of course, remains hung waiting for Session 2 to either issue a COMMIT or a ROLLBACK to resolve the potential primary key collision.

For comparison, Session 3 on 11.1.0.7 from the earlier blog article showed the following:

INSERT INTO T1 VALUES(100020)
            *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

… and Session 2 showed the following on 11.1.0.7:

1 row updated.

Is it possible that the Session 2 could have displayed a deadlock message, rather than Session 3 showing a deadlock message?  I seem to remember reading that there is a good chance that the session that has been in the enqueue the longest could be selected as the “vicitim” of the deadlock, but I might be incorrectly remembering that rule.  Oracle Database really did not completely resolve the deadlock, nor did it “kill” session 3 (as too many books and blog article state would happen) when the deadlock message appeared in that session’s SQL*Plus session.

For comparison, Oracle Database 10.2.0.4 behaved similar to Oracle Database 12.1.0.1, with Session 2 showing 1 row updated, and Session 3 remaining hung.  The more things change, the more they stay the same?

Thoughts?  Curious about the test cases found on the web pages that I linked to earlier?  Does Oracle Database 12.1.0.1 behave more like 11.2.0.3 or 10.2.0.4 when foreign key columns are not properly indexed?


Actions

Information

5 responses

6 08 2013
Tim Hopkins

Hi Charles,

I was just putting something together on that myself. 12.1.0.1 is like 10.2.0.4 in the INSERT scenario (obtaining a Mode 2 rather than a Mode 3, as 11.2.0.3 does) but MERGE statements locked far too severely in 10.2.0.4. 11.2.0.x improved the MERGE locking situation but increased the INSERT lock level.
12.1.0.1 appears to bring these together and give us the best of both worlds.

I’ll try get a proper post ready by the weekend or thereabouts.

Cheers,
Tim

7 08 2013
Charles Hooper

Tim,

It is good to hear that other people are noticing the change in locking behavior. I am looking forward to seeing your article.

10 08 2013
Franck Pachot

Hi Charles,
Running on 12c a test case I have that checks the enqueues requested (event 10704) by different DML- result here – it shows that a few of the Row-S that changed to Row-X in 11g came back to Row-S in 12c.
But no change for the Share lock on unindexed foreign key. Just that the consequence is less bad as we have less Row-X (Row-X is blocked by Share but Row-S is allowed).
Regards,
Franck.

11 08 2013
Charles Hooper

Franck,

Thank you for sharing the extensive results of your testing in the link that you attached.

I had forgotten about event 10704 – it appears that you had mentioned it once before on my blog, in a comment attached to this article: https://hoopercharles.wordpress.com/2011/09/05/unindexed-foreign-keys-what-is-wrong-with-this-quote/
I might need to do some more testing with that event enabled.

Side note: you probably intended to change 12.2.0.1 to 12.1.0.1 in the article.

19 08 2013
Oracle Database 12c (12.1) Installation and New Features | DBLinks Consulting Ltée

[…] Enqueue Rules Leading to Deadlock Changes Again? […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 137 other followers

%d bloggers like this: