Deadlock on Oracle 11g but Not on 10g

7 01 2010

January 7, 2010

Is it possible to trigger a deadlock on Oracle Database 11.1.0.6, 11.1.0.7, and 11.2.0.1 when the exact same procedure does not trigger a deadlock on Oracle Database 10.2.0.4?  Trace files containing deadlocks always contain the following advice:

The following deadlock is not an ORACLE error. It is a deadlock due to user error in the design of an application or from issuing incorrect ad-hoc SQL.

In short, the above states that deadlocks are programming and/or end-user errors.

A couple of months after I generated a test case like the one below, I read an interesting blog article by Mark Bobak, a fellow OakTable Network member and frequent contributor to the Oracle OTN forums and Oracle-L list.  I then started to wonder if the test case I put together applied to Mark’s article (clarification: my test case was built roughly a year after Mark’s article was written, I only recently found Mark’s article).

I have a few favorite Oracle books, and I thought that I would quote from two of those books:

Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions

“So, when do you not need to index a foreign key? The answer is, in general, when the following conditions are met:

  • You do not delete from the parent table.
  • You do not update the parent table’s unique/primary key value (watch for unintended updates to the primary key by tools!).
  • You do not join from the parent to the child (like DEPT to EMP).”

Page 302 of the book “Expert One-On-One” and the reprint Expert Oracle, Signature Edition state essentially the same thing as the above book.

Oracle Wait Interface: a Practical Guide to Performance Diagnostics & Tuning

“Unindexed foreign keys are no longer an issue starting in Oracle9i Database.”

Interesting…

I have not read the following book, but found it through a Google search.  This book seems to take the middle ground:

OCA Oracle Database 11g: Administration I Exam Guide

“However, you should always create indexes on the foreign key columns within the child table for performance reasons: a DELETE on the parent table will be much faster if Oracle can use an index to determine whether there are any rows in the child table referencing the row that is being deleted.” 

OK, combining the three quotes – in general, there are only 3 cases when foreign key columns do not need to be indexed, as of Oracle 9i it is no longer necessary to index foreign key columns, but in Oracle 11g creating an index on a foreign key column will speed up deletes on the parent table.

Let’s keep the above in mind while we trigger a deadlock in Oracle 11.1.0.7. Refer back to the articles on Enqueues and Deadlocks to see if you are able to determine what happens in the test case (or more accurately WHY it happens) (clarification: this test case simulates a “screen painter” type program that updates every column in a row based on the fields on screen – the same behavior should be present if the primary key value in the parent table is updated to the same value, rather than a different value – it also relies on the fact that a row that is inserted by one session but not committed is not visible to another session until the first session commits).

First, we need a couple test tables:

CREATE TABLE T1(
  C1 NUMBER(10) PRIMARY KEY);

INSERT INTO
  T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

CREATE TABLE T2(
  C1 NUMBER(10) PRIMARY KEY,
  C2 NUMBER(10),
  CONSTRAINT FK_T1_C1 FOREIGN KEY(C2) REFERENCES T1(C1) ENABLE);

INSERT INTO
  T2
SELECT
  ROWNUM,
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

GRANT ALL ON T1 TO PUBLIC;
GRANT ALL ON T2 TO PUBLIC;

CREATE PUBLIC SYNONYM T1 FOR T1;
CREATE PUBLIC SYNONYM T2 FOR T2;

OK, we have two tables with a foreign key relationship, and no index on the foreign key column.  Now, we need four sessions:

  • Session 1 connected as TESTUSER (the owner of the tables)
  • Session 2 connected as USER2
  • Session 3 connected as USER3
  • Session 4 connected as SYS (only to query the various performance views)

Session 1:

INSERT INTO T1 VALUES(100010);
INSERT INTO T2 VALUES(100010,100010);

Session 2:

INSERT INTO T1 VALUES(100020);
INSERT INTO T2 VALUES(100020,100020);
DELETE FROM T2 WHERE C1=50;
UPDATE T1 SET C1=100030 WHERE C1=50;

{session 2 hangs}

Session 3:

INSERT INTO T1 VALUES(100020);

{session 3 hangs}

Session 4:

SELECT /*+ ORDERED */
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME WT,
  SW.STATE,
  SW.SECONDS_IN_WAIT S_I_W,
  S.SQL_ID,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.SQL_CHILD_NUMBER,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#,
  SW.P1,
  SW.P2,
  SW.P3
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

SID USERNAME PROGRAM     STATUS EVENT                WT STATE   S_I_W SQL_ID        SQL_ADDR SQL_HV     CN OBJ# FILE# BLOCK# ROW#         P1    P2 P3
--- -------- ----------- ------ -------------------- -- ------- ----- ------------- -------- ---------- -- ---- ----- ------ ---- ---------- ----- --
307 USER2    sqlplus.exe ACTIVE enq: TM - contention  0 WAITING   422 4rtg0hv0atfkx 224E2B7C 3232545373  0   -1     0      0    0 1414332421 82913  0
314 USER3    sqlplus.exe ACTIVE enq: TM - contention  0 WAITING   407 cv338j6z2530g 224DAE38 3189935119  0   -1     0      0    0 1414332419 82913  0

 

SELECT
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.SQL_ID,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.SQL_CHILD_NUMBER CN,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  L.LMODE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
    (SELECT
      ID1,
      ID2,
      TYPE
    FROM
      V$LOCK
    WHERE
      REQUEST > 0)
  AND L.SID=S.SID;

SID USERNAME PROGRAM     SQL_ID        SQL_ADDR SQL_HASH_VALUE CN ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LMODE REQUEST ID1   ID2 TY BLOCK
--- -------- ----------- ------------- -------- -------------- -- ------------- -------------- --------------- ------------- ----- ------- ----- --- -- -----
307 USER2    sqlplus.exe 4rtg0hv0atfkx 224E2B7C     3232545373  0            -1              0               0             0     3       5 82913   0 TM     1
314 USER3    sqlplus.exe cv338j6z2530g 224DAE38     3189935119  0            -1              0               0             0     0       3 82913   0 TM     0
320 TESTUSER sqlplus.exe 0vbusv12hnbk6 22480E10     1158295110  0         12517              1           29656             0     3       0 82913   0 TM     1

 

Session 1:

COMMIT;

Session 3 shows the following:

INSERT INTO T1 VALUES(100020)
            *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

Session 2 shows the following:

1 row updated.

The trace file contains the following deadlock information:

[Transaction Deadlock]

The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:

Deadlock graph:
 ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-000143e1-00000000        21     314    SX             23     307    SX   SSX
TX-0009000d-000030d3        23     307     X             21     314           S

session 314: DID 0001-0015-00000006 session 307: DID 0001-0017-00000001
session 307: DID 0001-0017-00000001 session 314: DID 0001-0015-00000006

Rows waited on:
  Session 314: no row
  Session 307: obj - rowid = 000143E0 - AAAUPgAAFAAFDpsAAA
  (dictionary objn - 82912, file - 5, block - 1325676, slot - 0)

----- Information for the OTHER waiting sessions -----
Session 307:
  sid: 307 ser: 2 audsid: 1210519 user: 186/USER2 flags: 0x100045
  pid: 23 O/S info: user: SYSTEM, term: TESTBOX, ospid: 940
    image: ORACLE.EXE (SHAD)
  client details:
    O/S info: user: TEST, term: TESTBOX, ospid: 1616:1264
    machine: TESTBOX program: sqlplus.exe
    application name: SQL*Plus, hash value=3669949024
  current SQL:

UPDATE T1 SET C1=100030 WHERE C1=50  

Interesting deadlock.  Now, let’s try the same test with Oracle 10.2.0.4:

The table definitions:

CREATE TABLE T1(
  C1 NUMBER(10) PRIMARY KEY);

INSERT INTO
  T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

CREATE TABLE T2(
  C1 NUMBER(10) PRIMARY KEY,
  C2 NUMBER(10),
  CONSTRAINT FK_T1_C1 FOREIGN KEY(C2) REFERENCES T1(C1) ENABLE);

INSERT INTO
  T2
SELECT
  ROWNUM,
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

GRANT ALL ON T1 TO PUBLIC;
GRANT ALL ON T2 TO PUBLIC;

CREATE PUBLIC SYNONYM T1 FOR T1;
CREATE PUBLIC SYNONYM T2 FOR T2;

Now for the test:

Session 1:

INSERT INTO T1 VALUES(100010);
INSERT INTO T2 VALUES(100010,100010);

Session 2:

INSERT INTO T1 VALUES(100020);
INSERT INTO T2 VALUES(100020,100020);
DELETE FROM T2 WHERE C1=50;
UPDATE T1 SET C1=100030 WHERE C1=50;

{session 2 hangs}

Session 3:

INSERT INTO T1 VALUES(100020);

{session 3 hangs}

Session 4:

SELECT /*+ ORDERED */
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME WT,
  SW.STATE,
  SW.SECONDS_IN_WAIT S_I_W,
  S.SQL_ID,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.SQL_CHILD_NUMBER,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#,
  SW.P1,
  SW.P2,
  SW.P3
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

SID USERNAME PROGRAM     STATUS EVENT                WT STATE   S_I_W SQL_ID        SQL_ADDR SQL_HV     CN OBJ# FILE# BLOCK# ROW#         P1    P2 P3
--- -------- ----------- ------ -------------------- -- ------- ----- ------------- -------- ---------- -- ---- ----- ------ ---- ---------- ----- --
204 USER2    sqlplus.exe ACTIVE enq: TM - contention  0 WAITING   213 4rtg0hv0atfkx 2AF83608 3232545373  0   -1     0      0    0 1414332421 16472  0
217 USER3    sqlplus.exe ACTIVE enq: TM - contention  0 WAITING   201 cv338j6z2530g 2AF85A5C 3189935119  0   -1     0      0    0 1414332418 16472  0

 

SELECT
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.SQL_ID,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.SQL_CHILD_NUMBER CN,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  L.LMODE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
    (SELECT
      ID1,
      ID2,
      TYPE
    FROM
      V$LOCK
    WHERE
      REQUEST > 0)
  AND L.SID=S.SID;

SID USERNAME PROGRAM     SQL_ID        SQL_ADDR SQL_HASH_VALUE CN ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LMODE REQUEST ID1   ID2 TY BLOCK
--- -------- ----------- ------------- -------- -------------- -- ------------- -------------- --------------- ------------- ----- ------- ----- --- -- -----
213 TESTUSER sqlplus.exe 0vbusv12hnbk6 2AF78CC4     1158295110  0             0              2             799             0     3       0 16472   0 TM     1
204 USER2    sqlplus.exe 4rtg0hv0atfkx 2AF83608     3232545373  0            -1              0               0             0     3       5 16472   0 TM     0
217 USER3    sqlplus.exe cv338j6z2530g 2AF85A5C     3189935119  0            -1              0               0             0     0       2 16472   0 TM     0

Session 1:

COMMIT;

Session 3:

{session 3 remains hung}

Session 2:

1 row updated.

Session 4:

SELECT /*+ ORDERED */
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME WT,
  SW.STATE,
  SW.SECONDS_IN_WAIT S_I_W,
  S.SQL_ID,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.SQL_CHILD_NUMBER,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#,
  SW.P1,
  SW.P2,
  SW.P3
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

SID USERNAME PROGRAM     STATUS EVENT                         WT STATE   S_I_W SQL_ID        SQL_ADDR SQL_HV     CN OBJ# FILE# BLOCK# ROW#         P1     P2   P3
--- -------- ----------- ------ ----------------------------- -- ------- ----- ------------- -------- ---------- -- ---- ----- ------ ---- ---------- ------ ----
217 USER3    sqlplus.exe ACTIVE enq: TX - row lock contention  0 WAITING   380 cv338j6z2530g 2AF85A5C 3189935119  0   -1     0      0    0 1415053316 458788 1297

 

SELECT
  S.SID,
  S.USERNAME,
  S.PROGRAM,
  S.SQL_ID,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.SQL_CHILD_NUMBER CN,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  L.LMODE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
    (SELECT
      ID1,
      ID2,
      TYPE
    FROM
      V$LOCK
    WHERE
      REQUEST > 0)
  AND L.SID=S.SID;

SID USERNAME PROGRAM     SQL_ID        SQL_ADDR SQL_HASH_VALUE CN ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LMODE REQUEST ID1    ID2  TY BLOCK
--- -------- ----------- ------------- -------- -------------- -- ------------- -------------- --------------- ------------- ----- ------- ------ ---- -- -----
217 USER3    sqlplus.exe cv338j6z2530g 2AF85A5C     3189935119  0            -1              0               0             0     0       4 458788 1297 TX     0
204 USER2    sqlplus.exe               00                    0               -1              0               0             0     6       0 458788 1297 TX     1

So, why did Oracle 11.1.0.7 deadlock while 10.2.0.4 did not deadlock (11.1.0.6 and 11.2.0.1 will also deadlock)?