Enqueue Experimentations

6 12 2009

December 6, 2009

Over the years I have participated in several discussions about enqueue waits.  Below are some of those discussions.

http://forums.oracle.com/forums/thread.jspa?threadID=638246
ENQ: TX – ROW LOCK CONTENTION – this is a row level lock caused by a possible primary key or unique index violation if the first session commits its current transaction. A second session attempting to insert the value that will result in the primary key violation will sit in this wait event until the first session commits or rolls back.

Problems with initrans would show up as ENQ: TX – ALLOCATE ITL ENTRY waits in 10g/11g. But, such problems should be rare with tablespaces using ASSM.

After fixing the formatting of the top 5 wait events (total duration unknown):

EVENT                        TOTAL_WAITS  TIME_WAITED   AVG_MS PERCENT
----------------------------------------------------------------------
CPU                                         94,623.39             48
enq: TX - row lock contention     12,531    36,607.28  2921.34    18
control file parallel write    1,300,731    30,880.79    23.74    16
log file parallel write        1,510,503    12,640.80     8.37     6
log file sync                  1,072,553     9,680.07     9.03     5

12,531 * 3 second time out = 37,593 seconds = 10.44 hours.

What if the reason for the 3 second average wait time is due to a timeout. I performed a little experiment… I changed a row in a test table and then made a pot of coffee.

In session 1:

CREATE TABLE T1 (
  C1 NUMBER(10),
  C2 NUMBER(10),
  PRIMARY KEY (C1));

INSERT INTO T1
SELECT
  ROWNUM,
  ROWNUM*10
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

I now have a test table with 1,000,000 rows. I start monitoring the changes in the wait events roughly every 60 seconds, and V$SESSION_WAIT and V$LOCK roughly 4 times per second.

Back in session 1:

UPDATE
  T1
SET
  C1=-C1
WHERE
  C1<=100;

I have now modified the first 100 rows that were inserted into the table, time to make the pot of coffee.

In session 2, I try to insert a row with a primary key value of -10:

INSERT INTO T1 VALUES (
  -10,
  10);

Session 2 hangs.

If I take the third 60 second snap of the system wide wait events as the zero point, and the 11th snap as the end point. There were 149 waits on ENQ: TX – ROW LOCK CONTENTION, 148 time outs, 446.62 seconds of total time in the wait event, with an average wait time of 2.997450 seconds.

Rolling down to the session level wait events, SID 208 (my session 2) had 149 waits on ENQ: TX – ROW LOCK CONTENTION, for a total time of 446.61 seconds with an average wait time of 2.997383 seconds. All of the 149 waits and the wait time was in this one session that was locked up for the full duration of this time period because session 1 was making a pot of coffee.

Rolling down to V$SESSION_WAIT (sampled roughly 4 times per second): At the start of the third time interval, SID 208 has been in the ENQ: TX – ROW LOCK CONTENTION wait event for 39 seconds and is actively waiting trying to execute SQL with a hash value of 1001532423, the wait object is -1, wait file is 0, wait block is 0, wait row is 0, P1 is 1415053316, P2 is 196646, P3 is 4754.
At the end of the 11th time interval: , SID 208 has been in the ENQ: TX – ROW LOCK CONTENTION wait event for 483 seconds and is actively waiting trying to execute SQL with a hash value of 1001532423, the wait object is -1, wait file is 0, wait block is 0, wait row is 0, P1 is 1415053316, P2 is 196646, P3 is 4754.

Rolling down to V$LOCK (sampled roughly 4 times per second): I see that SID 214 (session 1) is blocking SID 208 (session 2). SID 214 has a TX lock in mode 6 with ID1 of 196646 and ID2 of 4754. SID 208 is requesting a TX lock in mode 4 with ID1 of 196646 and ID2 of 4754.

So, it seems that I need a faster coffee pot rather than an additional index on my table. It could be that the above process would have found that the application associated with SID 214 was abandoned or crashed and for some reason the lock was not released for a long period of time, a little less than 10.44 hours in your case.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

http://forums.oracle.com/forums/thread.jspa?threadID=860488
TYPE=TX, LMODE=6, REQUEST=6 = session 2 is attempting to update a row that is locked by session 1’s transaction (likely session 1 already updated the row)
TYPE=TX, LMODE=6, REQUEST=4 = session 2 is attempting to insert a row with the same primary key value or unique index value as a row just inserted by session 1
TYPE=TX, LMODE=6, REQUEST=4 = ITL Problem or bitmap index fragment problem
RBS Seg # = trunc(p2/65536) = v$transaction.xidusn
RBS Slot # = bitand( p2, 65535) = v$transaction.xidslot
P3 = v$transaction.xidsqn

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

http://forums.oracle.com/forums/thread.jspa?messageID=2613515

Let’s try a couple experiments to see if we can reproduce what you are seeing (this may not be the only possible cause):

#Basic Setup
Session 1:

CREATE TABLE T1 (
  C1 VARCHAR2(10),
  C2 NUMBER(12),
  PRIMARY KEY (C1));

Table created.

INSERT INTO
  T1
VALUES(
  'ONE',
  1);

1 row created.

INSERT INTO
  T1
VALUES(
  'TWO',
  2);

1 row created.
SELECT DISTINCT
  SID
FROM
  V$MYSTAT;

       SID
----------
       150

#Test 1 (Primary Key Violation, No Commit)
Session 2:

SELECT DISTINCT
  SID
FROM
  V$MYSTAT;

       SID
----------
       143

INSERT INTO
  T1
VALUES(
  'TWO',
  2);

(Session 2 Hung)
In Session 1:

SELECT
  S.SID,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME,
  SW.STATE,
  SW.SECONDS_IN_WAIT,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  SW.P1 ,
  SW.P2,
  SW.P3
FROM
  V$SESSION S,
  V$SESSION_WAIT SW
WHERE 
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue','wait for unread message on broadcast channel');

       SID PROGRAM       STATUS   EVENT                          WAIT_TIME STATE     SECONDS_IN_WAIT SQL_ADDR SQL_HASH_VALUE ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#         P1         P2         P3
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       143 sqlplus.exe   ACTIVE   enq: TX - row lock contention          0 WAITING               424 1CA703CC     2983910641            -1              0               0             0 1415053316     393235      11543

 

SELECT
  S.SID,
  S.PROGRAM,
  S.SQL_HASH_VALUE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.LMODE,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
  (SELECT
    ID1,
    ID2,
    TYPE
  FROM
    V$LOCK
  WHERE
    REQUEST > 0)
  AND L.SID=S.SID;

       SID PROGRAM     SQL_HASH_VALUE    REQUEST        ID1        ID2      LMODE TY      BLOCK
-----------------------------------------------------------------------------------------------
       143 sqlplus.exe     2983910641          4     393235      11543          0 TX          0
       150 sqlplus.exe      883681740          0     393235      11543          6 TX          1

#Test 1 missed the mark, no ROW_WAIT_OBJ#

COMMIT;

In Session 2:

ERROR at line 1:
ORA-00001: unique constraint (TESTUSER.SYS_C0014294) violated

#Test 2 (Primary Key Updated, No Commit)
In Session 1:

UPDATE
  T1
SET
  C1=C1
WHERE
  C1='TWO';

1 row updated.

In Session 2:

UPDATE
  T1
SET
  C1=C1
WHERE
  C1='TWO';

In Session 1:

SELECT
  S.SID,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME,
  SW.STATE,
  SW.SECONDS_IN_WAIT,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  SW.P1 ,
  SW.P2,
  SW.P3
FROM
  V$SESSION S,
  V$SESSION_WAIT SW
WHERE 
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue','wait for unread message on broadcast channel');

       SID PROGRAM       STATUS   EVENT                          WAIT_TIME STATE     SECONDS_IN_WAIT SQL_ADDR SQL_HASH_VALUE ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#         P1         P2         P3
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       143 sqlplus.exe   ACTIVE   enq: TX - row lock contention           0 WAITING               38 222799C4     1367536044         73419              1           84362             1 1415053318     589828      11227

 

SELECT
  S.SID,
  S.PROGRAM,
  S.SQL_HASH_VALUE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.LMODE,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
  (SELECT
    ID1,
    ID2,
    TYPE
  FROM
    V$LOCK
  WHERE
    REQUEST > 0)
  AND L.SID=S.SID;

       SID PROGRAM     SQL_HASH_VALUE    REQUEST        ID1        ID2      LMODE TY      BLOCK
-----------------------------------------------------------------------------------------------
       143 sqlplus.exe     1367536044          6     589828      11227          0 TX          0
       150 sqlplus.exe      883681740          0     589828      11227          6 TX          1

#Test 2 missed the mark, ROW_WAIT_OBJ# exists and ROW_WAIT_ROW# exists

COMMIT;

In Session 2:

COMMIT;

#Test 3 (Unique Index Violation) 
In Session 1:

CREATE UNIQUE INDEX IND_T1 ON T1(C2);

Index created.

INSERT INTO
  T1
VALUES(
  'THREE',
  3);

1 row created.

In Session 2:

INSERT INTO
  T1
VALUES(
  'FOUR',
  3);

(Session 2 Hung)
In Session 1:

SELECT
  S.SID,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME,
  SW.STATE,
  SW.SECONDS_IN_WAIT,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  SW.P1 ,
  SW.P2,
  SW.P3
FROM
  V$SESSION S,
  V$SESSION_WAIT SW
WHERE 
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue','wait for unread message on broadcast channel');

       SID PROGRAM       STATUS   EVENT                          WAIT_TIME STATE    SECONDS_IN_WAIT SQL_ADDR SQL_HASH_VALUE ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#         P1         P2         P3
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       143 sqlplus.exe   ACTIVE   enq: TX - row lock contention          0 WAITING               41 1C8F8B60      463727853            -1              1           84362             1 1415053316     393242      11541

 

SELECT
  S.SID,
  S.PROGRAM,
  S.SQL_HASH_VALUE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.LMODE,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
  (SELECT
    ID1,
    ID2,
    TYPE
  FROM
    V$LOCK
  WHERE
    REQUEST > 0)
  AND L.SID=S.SID;

       SID PROGRAM     SQL_HASH_VALUE    REQUEST        ID1        ID2      LMODE TY      BLOCK
-----------------------------------------------------------------------------------------------
       143 sqlplus.exe      463727853          4     393242      11541          0 TX          0
       150 sqlplus.exe      883681740          0     393242      11541          6 TX          1

#Test 3 missed the mark, no ROW_WAIT_OBJ#
In Session 1:

COMMIT;

Commit complete.

In Session 2:

ERROR at line 1:
ORA-00001: unique constraint (TESTUSER.IND_T1) violated

#Test 4 (Unique Index Violation)
In Session 1:

UPDATE
  T1
SET
  C2=C2+1
WHERE
  C2=3;

1 row updated.

In Session 2:

INSERT INTO
  T1
VALUES(
  'FOUR',
  3);

(Session 2 Hung)
In Session 1:

SELECT
  S.SID,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME,
  SW.STATE,
  SW.SECONDS_IN_WAIT,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  SW.P1 ,
  SW.P2,
  SW.P3
FROM
  V$SESSION S,
  V$SESSION_WAIT SW
WHERE 
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue','wait for unread message on broadcast channel');

       SID PROGRAM       STATUS   EVENT                          WAIT_TIME STATE    SECONDS_IN_WAIT SQL_ADDR SQL_HASH_VALUE ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#         P1         P2         P3
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       143 sqlplus.exe   ACTIVE   enq: TX - row lock contention          0 WAITING               18 1C8F8B60      463727853            -1              1           84362             1 1415053316      65560       8736

 

SELECT
  S.SID,
  S.PROGRAM,
  S.SQL_HASH_VALUE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.LMODE,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
  (SELECT
    ID1,
    ID2,
    TYPE
  FROM
    V$LOCK
  WHERE
    REQUEST > 0)
  AND L.SID=S.SID;

       SID PROGRAM     SQL_HASH_VALUE    REQUEST        ID1        ID2      LMODE TY      BLOCK
-----------------------------------------------------------------------------------------------
       143 sqlplus.exe      463727853          4      65560       8736          0 TX          0
       150 sqlplus.exe      883681740          0      65560       8736          6 TX          1

#Test 4 missed the mark, ROW_WAIT_OBJ# does not exist and ROW_WAIT_ROW# exists

ROLLBACK;

In Session 2:

ROLLBACK;

#Test 5 (Select For Update in Two Sessions)
In Session 1:

SELECT
  *
FROM
  T1
FOR UPDATE;

C1                 C2
---------------------
ONE                 1
TWO                 2
THREE               3

In Session 1:

SELECT
  *
FROM
  T1
FOR UPDATE;

(Session 2 Hung)
In Session 1:

SELECT
  S.SID,
  S.PROGRAM,
  S.STATUS,
  SW.EVENT,
  SW.WAIT_TIME,
  SW.STATE,
  SW.SECONDS_IN_WAIT,
  S.SQL_ADDRESS,
  S.SQL_HASH_VALUE,
  S.ROW_WAIT_OBJ#,
  S.ROW_WAIT_FILE#,
  S.ROW_WAIT_BLOCK#,
  S.ROW_WAIT_ROW#,
  SW.P1 ,
  SW.P2,
  SW.P3
FROM
  V$SESSION S,
  V$SESSION_WAIT SW
WHERE 
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue','wait for unread message on broadcast channel');

       SID PROGRAM       STATUS   EVENT                          WAIT_TIME STATE    SECONDS_IN_WAIT SQL_ADDR SQL_HASH_VALUE ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#         P1         P2         P3
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       143 sqlplus.exe   ACTIVE   enq: TX - row lock contention          0 WAITING               18 1C911EE4     3199333328         73419              1           84362             0 1415053318     327707      11953

 

SELECT
  S.SID,
  S.PROGRAM,
  S.SQL_HASH_VALUE,
  L.REQUEST,
  L.ID1,
  L.ID2,
  L.LMODE,
  L.TYPE,
  L.BLOCK
FROM
  V$LOCK L,
  V$SESSION S
WHERE
  (L.ID1, L.ID2, L.TYPE) IN
  (SELECT
    ID1,
    ID2,
    TYPE
  FROM
    V$LOCK
  WHERE
    REQUEST > 0)
  AND L.SID=S.SID;

       SID PROGRAM     SQL_HASH_VALUE    REQUEST        ID1        ID2      LMODE TY      BLOCK
-----------------------------------------------------------------------------------------------
       143 sqlplus.exe     3199333328          6     327707      11953          0 TX          0
       150 sqlplus.exe      883681740          0     327707      11953          6 TX          1

#Test 5 ROW_WAIT_OBJ# exist and ROW_WAIT_ROW# does not exists, lock request mode is 6 and the other session is holding the lock in mode 6.  It looks like one possibility was found.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

http://forums.oracle.com/forums/thread.jspa?messageID=3188695
Based on a calculation it appears that the Statspack duration is about 20 minutes (this is likely reasonable). I did not spend a lot of time analyzing the information that you posted, so a quick summary of what I see that looks interesting:

                                                                  Avg
                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts  Time (s)    (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
enqueue                                83         58        208   2509      0.1
resmgr:waiting in end wait            265          0         25     94      0.4
log file parallel write             1,939          0         22     11      3.2
resmgr:waiting in check                 8          0         17   2112      0.0

 

In the above, notice that there are 58 time outs for the enqueue. The timeout period for an enqueue is 3 seconds, which means that you potentially had a single wait that lasted 58*3 (+ up to 2.99) seconds, which totals 174 (+ up to 2.99) seconds. To make the calculation simple, assume that this was a single wait lasting exactly 174 seconds. The remaining 47 seconds of the wait divided by 25 (83-58) means that the remaining average wait time is 1.88 seconds. With Oracle 9.0.1 it is more difficult to see how much time was spent in each of the different types of enqueues compared to Oracle 10g and above.

Looking at the enqueue information that you posted:

                                                       Avg Wt        Wait
Eq     Requests    Succ Gets Failed Gets       Waits  Time (ms)    Time (s)
-- ------------ ------------ ----------- ----------- ----------- ------------
TX          569          568           0          30    7,356.07          221

 

It appears that all of the enqueue activity is of the type TX, while are row level enqueues – usually meaning that someone updated a row, and there was a pause between the time that the update was performed and a commit or rollback was executed. There are multiple causes for TX enqueues which may not be found with a Statspack report. From my notes (possibly based on the Oracle Performance Tuning Guide):
* TX enqueues are acquired exclusive when a transaction initiates its first change and held until the transaction COMMITs or ROLLBACK.
* Waits for TX in mode 6 occurs when a session is waiting for a row level lock that is already held by another session. The wait event for mode 6 TX enqueues is ‘enq: TX – row lock contention’ on 10g and above.
* Waits for TX in mode 4 can occur when a session wants to lock a row in the block, but one or more other sessions have rows locked in the same block and there is no free Interested Transaction LIST slot in the block. Oracle will usually dynamically add another ITL slot if there is sufficient space in the block. To fix this problem, increase INITRANS or MAXTRANS for the table. This type of TX enqueue corresponds to the wait event ‘enq: TX – allocate ITL’ on 10g and above.
* Waits for TX in mode 4 can also occur if the session is waiting for potential duplicates in a UNIQUE index, which would occur if a second session inserts a value before the first tries to insert the same value, and the second session has not yet committed. This type of enqueue corresponds to the wait event: ‘enq: TX – row lock contention’ on 10g and above.
* ‘Waits for TX in mode 4 can also occur when a transaction is inserting a row into an index and must wait for the end of an index block split being done by another transaction. This type of TX enqueue wait corresponds to the wait event ‘enq: TX – index’ on 10g and above.
You might want to look at the SQL statements sorted by CPU utilization based on the above description.

Taking a look at a couple SQL statements that you posted:

                                                     CPU      Elapsd
  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------                                                          
      3,301,423           91       36,279.4   67.5     0.01      0.01 4086952936
UPDATE IPAO.EMP_MAST SET OSCAL_CODE=:b1 WHERE :b1 < 600

        902,040           30       30,068.0   18.5     0.00      0.00 2333288512
UPDATE IPAO.MON_ACT SET CHEQUE_NO=:b1,CHEQUE_DT=:b2,CHEQUE_AMT=:b3 WHERE V_NO = :b4  AND V_DATE = :b5

 

The first of the two looks interesting. I am a little surprised by the small number for the CPU utilization considering the number of buffer gets. The WHERE clause “WHERE :b1 < 600″ seems a little unusual.





Deadlock Experimentations

6 12 2009

December 6, 2009

Did you know that it is possible for multiple sessions to incorrectly trigger a deadlock on Oracle 11.1.0.6, 11.1.0.7, and 11.2.0.1 by executing a sequence of commands, when the same sequence of commands does not trigger a deadlock on 10.2.0.4?  See chapter 8 “Understanding Performance Optimization Methods” of the “Expert Oracle Practices: Oracle Database Administration from the Oak Table” book for the test case.

Over the years I have posted several test cases that will intentionally trigger a deadlock to permit others to locate a matching deadlock pattern to help identify a cause of their deadlock.  Below are a couple of those test cases.

http://forums.oracle.com/forums/thread.jspa?threadID=689654

In the demo, the sessions are holding SX (Row exclusive – write access to a compound resource) locks due to a select for update, rather than S (Share – One of more sessions are reading) locks, but with a bit more experimentation, I am sure that it would be possible to create a demo where both the sessions are holding S locks.

The demo:

/* First, the creation of the tables with a foreign key index */
CREATE TABLE T1(C1 NUMBER(10) PRIMARY KEY);

INSERT INTO T1 VALUES(1);
INSERT INTO T1 VALUES(2);
INSERT INTO T1 VALUES(3);
INSERT INTO T1 VALUES(4);

COMMIT;

CREATE TABLE T2(
  C1 NUMBER(10) PRIMARY KEY,
  C2 NUMBER(10),
  CONSTRAINT FK_T1_C1 FOREIGN KEY(C2) REFERENCES T1(C1) ENABLE);

CREATE INDEX IND_T2_C2 ON T2(C2);

INSERT INTO T2 VALUES (1,1);
INSERT INTO T2 VALUES (2,2);
INSERT INTO T2 VALUES (3,3);
INSERT INTO T2 VALUES (4,4);

COMMIT;

 

/* Test 1 - No problems */
/* SESSION 1 */
SELECT
  *
FROM
  T2
WHERE
  C2=1
FOR UPDATE;

        C1         C2
---------------------
         1          1

 

/* SESSION 2 */
SELECT
  *
FROM
  T2
WHERE
  C2=2
FOR UPDATE;

        C1         C2
---------------------
         2          2

 

UPDATE
  T1
SET
  C1=2
WHERE
  C1=2;

1 row updated.

 

/* SESSION 1 */
UPDATE
  T1
SET
  C1=1
WHERE
  C1=1;

1 row updated.

ROLLBACK;

 

/* SESSION 2 */
ROLLBACK;

/* Test 2 – What happens when we repeat the test without the foreign key index */

DROP INDEX IND_T2_C2;

/* SESSION 2 */
SELECT
  *
FROM
  T2
WHERE
  C2=2
FOR UPDATE;

        C1         C2
---------------------
         2          2

 

/* SESSION 1 */
SELECT
  *
FROM
  T2
WHERE
  C2=1
FOR UPDATE;

        C1         C2
---------------------
         1          1

UPDATE
  T1
SET
  C1=1
WHERE
  C1=1;
/* HANGS */

/* SESSION 2 */
UPDATE
  T1
SET
  C1=2
WHERE
  C1=2;
/* HANGS */

 

/* SESSION 1 */
  T1
  *
ERROR at line 2:
ORA-00060: deadlock detected while waiting for resource

 

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TM-0000d63f-00000000        17     208    SX   SSX       18     210    SX   SSX
TM-0000d63f-00000000        18     210    SX   SSX       17     208    SX   SSX
session 208: DID 0001-0011-000000CE session 210: DID 0001-0012-000000D7
session 210: DID 0001-0012-000000D7 session 208: DID 0001-0011-000000CE
Rows waited on:
Session 210: obj - rowid = 000027EB - AAACfrAABAAAG/BAAA
  (dictionary objn - 10219, file - 1, block - 28609, slot - 0)

Your deadlock graph showed that both sessions were trying to obtain a TM lock in mode 5 (SSX), but the other sessions were already holding locks in mode 4 (S), and neither session was able to continue.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

http://forums.oracle.com/forums/thread.jspa?threadID=689654

If the deadlock points to a type TX, then the problem is probably not related to foreign keys indexing problems.

Another test setup to produce a deadlock with lock type TX:

/* The setup (assumes that the tables from the previous test setup still exist) */
DROP TABLE T2;
DROP TABLE T1;
DROP TABLE T3;

CREATE TABLE T1(C1 NUMBER(10) PRIMARY KEY, C2 NUMBER(10));

INSERT INTO T1 VALUES(1,NULL);
INSERT INTO T1 VALUES(2,NULL);
INSERT INTO T1 VALUES(3,NULL);
INSERT INTO T1 VALUES(4,NULL);

COMMIT;

CREATE TABLE T2(C1 NUMBER(10) PRIMARY KEY, C2 NUMBER(10));

INSERT INTO T2 VALUES(1,NULL);
INSERT INTO T2 VALUES(2,NULL);
INSERT INTO T2 VALUES(3,NULL);
INSERT INTO T2 VALUES(4,NULL);

COMMIT;

CREATE TABLE T3(TRANSACTION_ID NUMBER(10) PRIMARY KEY);

INSERT INTO T3 VALUES(1);
INSERT INTO T3 VALUES(2);
INSERT INTO T3 VALUES(3);
INSERT INTO T3 VALUES(4);

COMMIT;

/* Now we have two data tables and a third table, which could be interesting if there were 3 sessions involved */

/* Test 3 - session 1 updates a row, waits, session 2 updates 2 rows and hangs */
/* SESSION 1 */
UPDATE
  T1
SET
  C2=C1
WHERE
  C1=1;

 

/* SESSION 2 */
UPDATE
  T2
SET
  C2=C1
WHERE
  C1=1;

UPDATE
  T1
SET
  C2=C1
WHERE
  C1=1;
/* HANGS */

/* SESSION 1 */
UPDATE
  T2
SET
  C2=C1
WHERE
  C1=1;
/* HANGS */

/* SESSION 2 */
  T1
  *
ERROR at line 2:
ORA-00060: deadlock detected while waiting for resource

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00010003-0000238f        18     208     X             17     210           X
TX-00090005-00002383        17     210     X             18     208           X
session 208: DID 0001-0012-000000D9 session 210: DID 0001-0011-000000D0
session 210: DID 0001-0011-000000D0 session 208: DID 0001-0012-000000D9
Rows waited on:
Session 210: obj - rowid = 0000D644 - AAANZEAAEAAFvr2AAA
  (dictionary objn - 54852, file - 4, block - 1506038, slot - 0)
Session 208: obj - rowid = 0000D642 - AAANZCAAEAAFvrmAAA
  (dictionary objn - 54850, file - 4, block - 1506022, slot - 0)

 

/* SESSION 2 */
ROLLBACK;

 

/* SESSION 1 */
ROLLBACK;

 
/* Test 4 – Transaction table that does not use a sequence contributes to the problem */

/* SESSION 1 */
UPDATE
  T1
SET
  C2=C1
WHERE
  C1=1;

INSERT INTO
  T3
SELECT
  MAX(TRANSACTION_ID)+1
FROM
  T3;

 

/* SESSION 2 */
UPDATE
  T2
SET
  C2=C1+1
WHERE
  C1=1;

INSERT INTO
  T3
SELECT
  MAX(TRANSACTION_ID)+1
FROM
  T3;
/* HANGS */

 

/* SESSION 1 */
DELETE FROM T2;
/* HANGS */

 

/* SESSION 2 */
  T3
  *
ERROR at line 2:
ORA-00060: deadlock detected while waiting for resource

 

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00060029-00002376        18     208     X             17     210           X
TX-00090016-00002386        17     210     X             18     208           S
session 208: DID 0001-0012-000000D9 session 210: DID 0001-0011-000000D0
session 210: DID 0001-0011-000000D0 session 208: DID 0001-0012-000000D9
Rows waited on:
Session 210: obj - rowid = 0000D644 - AAANZEAAEAAFvr2AAA
  (dictionary objn - 54852, file - 4, block - 1506038, slot - 0)
Session 208: obj - rowid = 0000D642 - AAANZCAAEAAFvrmAAA
  (dictionary objn - 54850, file - 4, block - 1506022, slot - 0)

Test 5 (if setup) might have used 3 sessions, the first session would update a row in T1 and then insert a row into T3. The second session would update a row in T2 and attempt to insert a row into T3 (session 2 would hang). Session 3 would update a different row in T2, and then attempt to insert a row into T3 (session 3 would hang). Session 1 would see that the row originally updated by session 2 was not updated, and attempt to update that row (session 1 would hang). The Deadlock graph?

Deadlocks almost always point to application design problems (or user design problems when the user attempts to start the same batch process twice).

One thing that you do not know is what the two sessions were doing before they executed the insert into T2 statement. The first set of numbers in the deadlock graph is the hex value of an OBJECT_ID. If you execute the following query:

SELECT
  OWNER,
  OBJECT_NAME
FROM
  DBA_OBJECTS
WHERE
  OBJECT_ID=241021;

Does it show the table T2, T1, T3, or some other table?

You might find the following Metalink documents helpful:
Note:15476.1 FAQ about Detecting and Resolving Locking Conflicts
Note:198828.1 Analysing locking issues with LOGMINER
Note:102925.1 Tracing sessions: waiting on an enqueue

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

http://forums.oracle.com/forums/thread.jspa?threadID=867295
The setup that I was trying to describe was one in which the TM lock could not be obtained due to another session already holding a TX lock on the child, something like this:

CREATE TABLE T1(C1 NUMBER(10) PRIMARY KEY);

INSERT INTO T1 VALUES(1);
INSERT INTO T1 VALUES(2);
INSERT INTO T1 VALUES(3);
INSERT INTO T1 VALUES(4);

COMMIT;

CREATE TABLE T2(
  C1 NUMBER(10) PRIMARY KEY,
  C2 NUMBER(10),
  CONSTRAINT FK_T1_C1 FOREIGN KEY(C2) REFERENCES T1(C1) ENABLE);

INSERT INTO T2 VALUES (1,1);
INSERT INTO T2 VALUES (2,2);
INSERT INTO T2 VALUES (3,3);
INSERT INTO T2 VALUES (4,4);

COMMIT;

In session 1:
SELECT
  *
FROM
  T2
WHERE
  C2=1
FOR UPDATE;

In session 2:
INSERT INTO T1 VALUES(15);

UPDATE
  T1
SET
  C1=15
WHERE
  C1=15;

(Session 2 hangs trying to acquire a TM lock on the child table)

In session 1:
INSERT INTO T1 VALUES(15);

(Session 1 and 2 deadlock)

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-000a0025-000038bf        17     213     X             18     210           S
TM-0000d79a-00000000        18     210    SX             17     213    SS     S
session 213: DID 0001-0011-000003C2 session 210: DID 0001-0012-000002F1
session 210: DID 0001-0012-000002F1 session 213: DID 0001-0011-000003C2
Rows waited on:
Session 210: no row
Session 213: no row

 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In session 1

CREATE TABLE T1(
  MY_ID VARCHAR2(30),
  MY_DATE DATE,
  CUSTOMER_ID VARCHAR2(15),
  PRIMARY KEY (MY_ID));

INSERT INTO T1 VALUES(
  '001',
  TO_DATE('01-JAN-2006','DD-MON-YYYY'),
  'C001');

INSERT INTO T1 VALUES(
  '002',
  TO_DATE('01-JAN-2006','DD-MON-YYYY'),
  'C002');

INSERT INTO T1 VALUES(
  '003',
  TO_DATE('01-JAN-2006','DD-MON-YYYY'),
  'C003');

INSERT INTO T1 VALUES(
  '004',
  TO_DATE('01-JAN-2006','DD-MON-YYYY'),
  'C001');

COMMIT;

INSERT INTO T1 VALUES(
  '005',
  TO_DATE('01-JAN-2007','DD-MON-YYYY'),
  'C003');

INSERT INTO T1 VALUES(
  '006',
  TO_DATE('01-JAN-2007','DD-MON-YYYY'),
  'C001');

In session 2:

INSERT INTO T1 VALUES(
  '007',
  TO_DATE('01-FEB-2007','DD-MON-YYYY'),
  'C003');

INSERT INTO T1 VALUES(
  '008',
  TO_DATE('01-FEB-2007','DD-MON-YYYY'),
  'C001');

UPDATE
  T1
SET
  MY_DATE=MY_DATE+1
WHERE
  MY_ID='001';

UPDATE
  T1
SET
  MY_DATE=MY_DATE+1
WHERE
  MY_ID='002';

UPDATE
  T1
SET
  MY_DATE=MY_DATE+1
WHERE
  MY_ID='003';

INSERT INTO T1 VALUES(
  '005',
  TO_DATE('01-FEB-2007','DD-MON-YYYY'),
  'C001');

(The above will be a primary key violation if session 1 commits, so session 2 hangs)

In session 2’s 10046 trace file:

------------
*** 2008-02-12 10:09:27.984
WAIT #2: nam='enq: TX - row lock contention' ela= 3000118 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1749514958
WAIT #2: nam='enq: TX - row lock contention' ela= 3000116 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1752515272
WAIT #2: nam='enq: TX - row lock contention' ela= 3000157 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1755515569
WAIT #2: nam='enq: TX - row lock contention' ela= 3000084 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1758515803
*** 2008-02-12 10:09:39.983
WAIT #2: nam='enq: TX - row lock contention' ela= 3000079 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1761516009
WAIT #2: nam='enq: TX - row lock contention' ela= 2999987 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1764516178
WAIT #2: nam='enq: TX - row lock contention' ela= 2999994 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1767516313
WAIT #2: nam='enq: TX - row lock contention' ela= 3000015 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1770516482
*** 2008-02-12 10:09:51.983
WAIT #2: nam='enq: TX - row lock contention' ela= 3000084 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1773516699
WAIT #2: nam='enq: TX - row lock contention' ela= 3000063 name|mode=1415053316 usn<<16 | slot=131104 sequence=8831 obj#=-1 tim=1776516940
------------

In session 1:

SELECT
  *
FROM
  T1
FOR UPDATE;

Session 1 hangs

Session 2 immediately displays:

INSERT INTO TESTUSER.T1 VALUES(
                    *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

In session 2’s 10046 trace file:

DEADLOCK DETECTED
[Transaction Deadlock]
Current SQL statement for this session:
INSERT INTO TESTUSER.T1 VALUES(
  '005',
  TO_DATE('01-FEB-2007','DD-MON-YYYY'),
  'C001')
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-0006002d-000023ef        25     201     X             22     204           X
TX-00020020-0000227f        22     204     X             25     201           S
session 201: DID 0001-0019-000001D2 session 204: DID 0001-0016-000000FF
session 204: DID 0001-0016-000000FF session 201: DID 0001-0019-000001D2
Rows waited on:
Session 204: obj - rowid = 0000829B - AAAIKbAAEAAEBROAAA

In session 2:

SELECT
  *
FROM
  T1
WHERE
  ROWID='AAAIKbAAEAAEBROAAA';

MY_ID                          MY_DATE   CUSTOMER_ID
------------------------------ --------- -----------
001                            02-JAN-06 C001

Session 2’s update of the row with MY_ID =’001′ was pinpointed in the deadlock.

In session 1’s 10046 trace file:

PARSING IN CURSOR #2 len=31 dep=0 uid=288 oct=3 lid=288 tim=1776466041 hv=3199333328 ad='46baac58'
SELECT
  *
FROM
  T1
FOR UPDATE
END OF STMT
PARSE #2:c=0,e=93402,p=0,cr=17,cu=0,mis=1,r=0,dep=0,og=1,tim=1776466031
BINDS #2:
WAIT #2: nam='enq: TX - row lock contention' ela= 2997875 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1779470295
WAIT #2: nam='enq: TX - row lock contention' ela= 3000109 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1782470542
WAIT #2: nam='enq: TX - row lock contention' ela= 3000151 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1785470824
*** 2008-02-12 10:10:06.936
WAIT #2: nam='enq: TX - row lock contention' ela= 3000127 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1788471107
WAIT #2: nam='enq: TX - row lock contention' ela= 2999987 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1791471260
WAIT #2: nam='enq: TX - row lock contention' ela= 3000059 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1794471450
WAIT #2: nam='enq: TX - row lock contention' ela= 3000020 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1797471623
*** 2008-02-12 10:10:18.936
WAIT #2: nam='enq: TX - row lock contention' ela= 3000059 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1800471821
WAIT #2: nam='enq: TX - row lock contention' ela= 2999913 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1803471915
WAIT #2: nam='enq: TX - row lock contention' ela= 3000109 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1806472157
WAIT #2: nam='enq: TX - row lock contention' ela= 3000086 name|mode=1415053318 usn<<16 | slot=393261 sequence=9199 obj#=33435 tim=1809472374

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Experimenting with a 4 session deadlock:

Session 1:
CREATE TABLE T1(C1 NUMBER PRIMARY KEY, C2 NUMBER CONSTRAINT CON_T1 UNIQUE);
CREATE TABLE T2(C1 NUMBER PRIMARY KEY, C2 NUMBER CONSTRAINT CON_T2 UNIQUE);
CREATE TABLE T3(C1 NUMBER PRIMARY KEY, C2 NUMBER CONSTRAINT CON_T3 UNIQUE);
CREATE TABLE T4(C1 NUMBER PRIMARY KEY, C2 NUMBER CONSTRAINT CON_T4 UNIQUE);

INSERT INTO T1 VALUES (1,1);
INSERT INTO T2 VALUES (1,1);
INSERT INTO T3 VALUES (1,1);
INSERT INTO T4 VALUES (1,1);

COMMIT;

INSERT INTO T1
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T1;

UPDATE T1 SET C2=3 WHERE C1=1;

Session 2:
INSERT INTO T2 SELECT   2,   MAX(C2)+1 C2 FROM   T2;   UPDATE T2 SET C2=3 WHERE C1=1;  

Session 3:
INSERT INTO T3
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T3;

UPDATE T3 SET C2=3 WHERE C1=1;

Session 4:
INSERT INTO T4
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T4;

UPDATE T4 SET C2=3 WHERE C1=1;

Session 1:
INSERT INTO T2
SELECT
  3,
  MAX(C2)+1 C2
FROM
  T2;

(Session 1 hung)

Session 2:
INSERT INTO T3
SELECT
  3,
  MAX(C2)+1 C2
FROM
  T3;

(Session 2 hung)

Session 3:
INSERT INTO T4
SELECT
  3,
  MAX(C2)+1 C2
FROM
  T4;

(Session 3 hung)

Session 4:
INSERT INTO T1
SELECT
  3,
  MAX(C2)+1 C2
FROM
  T1;

(Session 4 hung, triggers a deadlock in session 3)

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-0003000f-00002d44        21     144     X             20     145           S
TX-0004000e-0000218e        20     145     X             19     146           S
TX-00070013-000021f4        19     146     X             22     143           S
TX-000a0021-000021f1        22     143     X             21     144           S

session 144: DID 0001-0015-00000007 session 145: DID 0001-0014-00000003
session 145: DID 0001-0014-00000003 session 146: DID 0001-0013-00000003
session 146: DID 0001-0013-00000003 session 143: DID 0001-0016-00000001
session 143: DID 0001-0016-00000001 session 144: DID 0001-0015-00000007

Rows waited on:
  Session 144: no row
  Session 145: no row
  Session 146: no row
  Session 143: no row

(Rollback all sessions)

 

Session 1:
UPDATE T1 SET C2=3 WHERE C1=1;

Session 2:
UPDATE T2 SET C2=3 WHERE C1=1;

Session 3:
UPDATE T3 SET C2=3 WHERE C1=1;

Session 4:
UPDATE T4 SET C2=3 WHERE C1=1;

Session 1:
UPDATE T2 SET C2=3 WHERE C1=1;

(Session 1 hung)

Session 2:
UPDATE T3 SET C2=3 WHERE C1=1;

(Session 2 hung)

Session 3:
UPDATE T4 SET C2=3 WHERE C1=1;

(Session 3 hung)

Session 4:
UPDATE T1 SET C2=3 WHERE C1=1;

(Session 4 hung, triggers a deadlock in session 1)

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00010012-000021f2        19     146     X             22     143           X
TX-00030010-00002d4b        22     143     X             21     144           X
TX-0009000a-00002c69        21     144     X             20     145           X
TX-00060003-00002c5d        20     145     X             19     146           X

session 146: DID 0001-0013-00000003 session 143: DID 0001-0016-00000001
session 143: DID 0001-0016-00000001 session 144: DID 0001-0015-00000007
session 144: DID 0001-0015-00000007 session 145: DID 0001-0014-00000003
session 145: DID 0001-0014-00000003 session 146: DID 0001-0013-00000003

Rows waited on:
  Session 146: obj - rowid = 00015185 - AAAVGFAAEAAFps4AAA
  (dictionary objn - 86405, file - 4, block - 1481528, slot - 0)
  Session 143: obj - rowid = 00015182 - AAAVGCAAEAAFpsgAAA
  (dictionary objn - 86402, file - 4, block - 1481504, slot - 0)
  Session 144: obj - rowid = 0001518B - AAAVGLAAEAAFptoAAA
  (dictionary objn - 86411, file - 4, block - 1481576, slot - 0)
  Session 145: obj - rowid = 00015188 - AAAVGIAAEAAFptQAAA
  (dictionary objn - 86408, file - 4, block - 1481552, slot - 0)

 

Session 1:
INSERT INTO T1
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T1;

Session 2:
INSERT INTO T2
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T2;

Session 3:
INSERT INTO T3
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T3;

Session 4:
INSERT INTO T4
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T4;

Session 1:
INSERT INTO T2
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T2;

(Session 1 hung)

Session 2:
INSERT INTO T3
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T3;

(Session 2 hung)

Session 3:
INSERT INTO T4
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T4;

(Session 3 hung)

Session 4:
INSERT INTO T1
SELECT
  2,
  MAX(C2)+1 C2
FROM
  T1;

(Session 4 hung, triggers a deadlock in session 2)

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-0004001d-000008c5        24     144     X             22     143           S
TX-0006001c-00000b8e        22     143     X             25     152           S
TX-00080003-00000b7c        25     152     X             23     158           S
TX-00090027-00000b7c        23     158     X             24     144           S
session 144: DID 0001-0018-00000002 session 143: DID 0001-0016-00000002
session 143: DID 0001-0016-00000002 session 152: DID 0001-0019-00000002
session 152: DID 0001-0019-00000002 session 158: DID 0001-0017-00000006
session 158: DID 0001-0017-00000006 session 144: DID 0001-0018-00000002
Rows waited on:
Session 143: no row
Session 152: no row
Session 158: obj - rowid = 0000BD35 - AAAL01AAEAAGGSOAAA
  (dictionary objn - 48437, file - 4, block - 1598606, slot - 0)
Session 144: no row

(Rollback all sessions)

 

Session 1:
UPDATE T1 SET C1=3 WHERE C1=1;

Session 2:
UPDATE T2 SET C1=3 WHERE C1=1;

Session 3:
UPDATE T3 SET C1=3 WHERE C1=1;

Session 4:
UPDATE T4 SET C1=3 WHERE C1=1;

Session 1:
UPDATE T2 SET C1=3 WHERE C1=1;

(Session 1 hung)

Session 2:
UPDATE T3 SET C1=3 WHERE C1=1;

(Session 2 hung)

Session 3:
UPDATE T4 SET C1=3 WHERE C1=1;

(Session 3 hung)

Session 4:
UPDATE T1 SET C1=3 WHERE C1=1;

(Session 4 hung, triggers a deadlock in session 3)

                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00070028-00000871        23     158     X             24     144           X
TX-0006000c-00000b8f        24     144     X             22     143           X
TX-0005002d-00000b88        22     143     X             25     152           X
TX-00090004-00000b7d        25     152     X             23     158           X
session 158: DID 0001-0017-00000006 session 144: DID 0001-0018-00000002
session 144: DID 0001-0018-00000002 session 143: DID 0001-0016-00000002
session 143: DID 0001-0016-00000002 session 152: DID 0001-0019-00000002
session 152: DID 0001-0019-00000002 session 158: DID 0001-0017-00000006
Rows waited on:
Session 144: obj - rowid = 0000BD32 - AAAL0yAAEAAFhH2AAA
  (dictionary objn - 48434, file - 4, block - 1446390, slot - 0)
Session 143: obj - rowid = 0000BD2F - AAAL0vAAEAAFhHeAAA
  (dictionary objn - 48431, file - 4, block - 1446366, slot - 0)
Session 152: obj - rowid = 0000BD2C - AAAL0sAAEAAFOjOAAA
  (dictionary objn - 48428, file - 4, block - 1370318, slot - 0)
Session 158: obj - rowid = 0000BD35 - AAAL01AAEAAGGSOAAA
  (dictionary objn - 48437, file - 4, block - 1598606, slot - 0)

(Rollback all sessions)

 

Session 1:
ALTER TABLE T1 ADD (C3 NUMBER);
ALTER TABLE T2 ADD (C3 NUMBER);
ALTER TABLE T3 ADD (C3 NUMBER);
ALTER TABLE T4 ADD (C3 NUMBER);

Session 1:
UPDATE T1 SET C3=3 WHERE C1=1;

Session 2:
UPDATE T2 SET C3=3 WHERE C1=1;

Session 3:
UPDATE T3 SET C3=3 WHERE C1=1;

Session 4:
UPDATE T4 SET C3=3 WHERE C1=1;

Session 1:
UPDATE T2 SET C3=3 WHERE C1=1;

(Session 1 hung)

Session 2:
UPDATE T3 SET C3=3 WHERE C1=1;

(Session 2 hung)

Session 3:
UPDATE T4 SET C3=3 WHERE C1=1;

(Session 3 hung)

Session 4:
UPDATE T1 SET C3=3 WHERE C1=1;

(Session 4 hung, triggers a deadlock in session 1)

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00090005-00002c7c        19     146     X             22     143           X
TX-0005001c-00002cb3        22     143     X             21     144           X
TX-0007000f-0000223a        21     144     X             20     145           X
TX-00080020-00002d01        20     145     X             19     146           X

session 146: DID 0001-0013-00000003 session 143: DID 0001-0016-00000001
session 143: DID 0001-0016-00000001 session 144: DID 0001-0015-00000007
session 144: DID 0001-0015-00000007 session 145: DID 0001-0014-00000003
session 145: DID 0001-0014-00000003 session 146: DID 0001-0013-00000003

Rows waited on:
  Session 146: obj - rowid = 00015185 - AAAVGFAAEAAFps4AAA
  (dictionary objn - 86405, file - 4, block - 1481528, slot - 0)
  Session 143: obj - rowid = 00015182 - AAAVGCAAEAAFpsgAAA
  (dictionary objn - 86402, file - 4, block - 1481504, slot - 0)
  Session 144: obj - rowid = 0001518B - AAAVGLAAEAAFptoAAA
  (dictionary objn - 86411, file - 4, block - 1481576, slot - 0)
  Session 145: obj - rowid = 00015188 - AAAVGIAAEAAFptQAAA
  (dictionary objn - 86408, file - 4, block - 1481552, slot - 0)




Faulty Quotes 2 – Test Cases

6 12 2009

December 6, 2009 (Updated February 24, 2010)

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

Are test cases valuable or not?  These quotes were found through Internet searches:

http://forums.oracle.com/forums/thread.jspa?threadID=587394

“Oracle is NOT MATH, a single contrary test case does not invalidate any general principle of Oracle performance.”

http://forums.oracle.com/forums/thread.jspa?threadID=963129

“Just because someone show a case where indexes do not need to be rebuilt, that DOES NOT mean that positive cases don’t exist!

I can write a test case to ‘prove’ that virtually any statement about Oracle is un-true. It’s easy.

People who believe that a single negative test case proves something is wrong join the ranks of the ‘deniers’, folks who cite ‘proof’ that the moon landing never happened, and that 911 was a government conspiracy.”

http://forums.oracle.com/forums/thread.jspa?messageID=3195898&tstart=0 (thread was taken offline by OTN, but still shows in search results)

“100% true? You are joking, right? There is NOTHING that is 100% true about database tuning . . . . YOU CANNOT PROVE ANYTHING ABOUT ORACLE PERFORMANCE. EVER. NO EQUATIONS, NO PROOFS, NO WAY, NO HOW. . . . Only fools or charlatans will claim that something about Oracle performance has been ‘proven wrong’. . . .

http://forums.oracle.com/forums/thread.jspa?threadID=963129&start=0

“It’s ridiculous to reverse engineer Oracle with test cases, when we can ask the people who hold the source code.”

“A test case is not the same as software testing!

The problem is that a single-user ‘test case’ on a PC is not a valid test, by any measure. . .

It does not accurately reproduce real-world behavior, especially in performance tuning, where slowdowns are only seen under heavy loads.”

“It baffles me why any practicng DBA would want to write a test case, when they have a real-world test database, full of real data and waiting to be used . . .”

————————-

(Added February 24, 2010):

If you search this site for the phrase test case, I think that it will be clear that test cases, when properly constructed, are extremely helpful for determining how things work – and how things should not work.  Properly constructing a test case is critical to help eliminate false causation and false correlation.  Employing the scientific method (secondary reference) is important when building test cases to help control false positives and false negatives.  Keep in mind that if something is stated as an absolute (for example “the sun rises in the East” or “on Earth the sun always rises from the East“), it only requires a single negative test case to refute the absolute statement.





COLLECTION ITERATOR (PICKLER FETCH) Appearing in DBMS_XPLAN Output

6 12 2009

December 6, 2009

In several discussion threads a DBMS_XPLAN is requested, and the provided output will occasionally look like this:

-----------------------------------------------------
| Id  | Operation                         | Name    |
-----------------------------------------------------
|   1 |  COLLECTION ITERATOR PICKLER FETCH| DISPLAY |
-----------------------------------------------------

One such case appeared in this discussion thread:
http://forums.oracle.com/forums/thread.jspa?threadID=656614&start=0

In short, the problem is caused by either enabling server output, or enabling SQL*Plus’ autotrace.

A quick test setup that I provided in the above thread to reproduce the problem:

CREATE TABLE T1 (C1 NUMBER(12));

INSERT INTO
  T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

COMMIT;

Let’s try an experiment with a simple SQL statement:

SELECT
  C1
FROM
  T1
WHERE
  C1 BETWEEN 90 AND 150;

With the following settings:

OPTIMIZER_MODE=CHOOSE
OPTIMIZER_FEATURES_ENABLE=9.0.1
STATISTICS_LEVEL=ALL  (session level)

 

SQL_ID  84bwrtyfrxhzx, child number 0
--------------------------------------------------------------------------------
SELECT    C1  FROM    T1  WHERE    C1 BETWEEN 90 AND 150
Plan hash value: 3617692013
---------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| T1   |      1 |     61 |00:00:00.01 |      23 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("C1"<=150 AND "C1">=90))
Note
--------------------------------------------------------------------------------
   - rule based optimizer used (consider using cbo)

(That warning surprised me a bit… as this is running on Oracle 11.1.0.6 and the rule based optimizer was deprecated in 10g R1.)

Partial output from a 10046 at level 12 enabled at the same time:

=====================
PARSING IN CURSOR #3 len=56 dep=0 uid=63 oct=3 lid=63 tim=657639757986 hv=2642330621 ad='1af34a2c' sqlid='84bwrtyfrxhzx'
SELECT
  C1
FROM
  T1
WHERE
  C1 BETWEEN 90 AND 150
END OF STMT
PARSE #3:c=0,e=701,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=657639757982
BINDS #3:
EXEC #3:c=0,e=57,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=657639758209
WAIT #3: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657639758249
FETCH #3:c=15600,e=713,p=0,cr=23,cu=0,mis=0,r=61,dep=0,og=4,tim=657639759004
STAT #3 id=1 cnt=61 pid=0 pos=1 obj=72628 op='TABLE ACCESS FULL T1 (cr=23 pr=0 pw=0 time=99 us)'
WAIT #3: nam='SQL*Net message from client' ela= 9637 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657639779290
=====================
...
PARSING IN CURSOR #4 len=80 dep=0 uid=63 oct=3 lid=63 tim=657640013624 hv=266232738 ad='22039234' sqlid='0nfjn6n7xwsx2'
SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'))
END OF STMT
PARSE #4:c=46800,e=234176,p=2,cr=199,cu=0,mis=1,r=0,dep=0,og=4,tim=657640013619
WAIT #4: nam='SQL*Net message to client' ela= 8 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657640013799
WAIT #4: nam='SQL*Net message from client' ela= 570 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657640014444
BINDS #4:
EXEC #4:c=0,e=58,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=657640014557
WAIT #4: nam='SQL*Net message to client' ela= 9 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657640495379
FETCH #4:c=171601,e=480979,p=15,cr=888,cu=0,mis=0,r=21,dep=0,og=4,tim=657640495566
STAT #4 id=1 cnt=21 pid=0 pos=1 obj=0 op='COLLECTION ITERATOR PICKLER FETCH DISPLAY_CURSOR (cr=888 pr=15 pw=15 time=81 us)'

In the above 10046 trace, note the STAT line containing “COLLECTION ITERATOR PICKLER FETCH DISPLAY_CURSOR” on the last line – that is the execution plan for the call to DBMS_XPLAN.DISPLAY_CURSOR, and not the original query that was of interest. That was the plan that Lokesh provided to Jonathan, and not the one that Jonathan requested.

Let’s try again to remove the “rule based optimizer used” warning:

OPTIMIZER_MODE=ALL_ROWS
OPTIMIZER_FEATURES_ENABLE=11.1.0.6
STATISTICS_LEVEL=ALL  (session level)

The DBMS_XPLAN:

SQL_ID  84bwrtyfrxhzx, child number 1
--------------------------------------------------------------------------------
SELECT    C1  FROM    T1  WHERE    C1 BETWEEN 90 AND 150
Plan hash value: 3617692013
------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| T1   |      1 |     61 |     61 |00:00:00.01 |      23 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("C1">=90 AND "C1"<=150))
Note
--------------------------------------------------------------------------------
   - dynamic sampling used for this statement

 
Partial output from a 10046 at level 12 enabled at the same time:

=====================
PARSING IN CURSOR #12 len=56 dep=0 uid=63 oct=3 lid=63 tim=657812448763 hv=2642330621 ad='1af34a2c' sqlid='84bwrtyfrxhzx'
SELECT
  C1
FROM
  T1
WHERE
  C1 BETWEEN 90 AND 150
END OF STMT
PARSE #12:c=46801,e=54061,p=0,cr=24,cu=0,mis=1,r=0,dep=0,og=1,tim=657812448759
BINDS #12:
EXEC #12:c=0,e=51,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=657812449038
WAIT #12: nam='SQL*Net message to client' ela= 8 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657812449085
FETCH #12:c=0,e=768,p=0,cr=23,cu=0,mis=0,r=61,dep=0,og=1,tim=657812449899
STAT #12 id=1 cnt=61 pid=0 pos=1 obj=72628 op='TABLE ACCESS FULL T1 (cr=23 pr=0 pw=0 time=75 us cost=7 size=793 card=61)'
WAIT #12: nam='SQL*Net message from client' ela= 10282 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=657812460271
=====================
...
PARSING IN CURSOR #11 len=80 dep=0 uid=63 oct=3 lid=63 tim=657812468416 hv=266232738 ad='22039234' sqlid='0nfjn6n7xwsx2'
SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'))
END OF STMT
PARSE #11:c=0,e=7984,p=0,cr=111,cu=0,mis=1,r=0,dep=0,og=1,tim=657812468413
FETCH #11:c=62400,e=56650,p=0,cr=274,cu=0,mis=0,r=21,dep=0,og=1,tim=657812526032
STAT #11 id=1 cnt=21 pid=0 pos=1 obj=0 op='COLLECTION ITERATOR PICKLER FETCH DISPLAY_CURSOR (cr=274 pr=0 pw=0 time=86 us)'

I hope that clears things up for you. I tried the following, but was not able to reproduce DBMS_XPLAN returning the plan for DBMS_XPLAN, rather than the last plan executed:

ALTER SESSION SET STATISTICS_LEVEL='ALL';

ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';

SET AUTOTRACE ON STATISTICS

SELECT
  C1
FROM
  T1
WHERE
  C1 BETWEEN 90 AND 150;

 

Statistics
--------------------------------------------------------------------------------
          0  recursive calls
          0  db block gets
         28  consistent gets
          0  physical reads
          0  redo size
       1279  bytes sent via SQL*Net to client
        460  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         61  rows processed

 

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

 

SQL> SET AUTOTRACE ON STATISTICS
SQL> SET SERVEROUTPUT ON
SQL> SELECT /*+ gather_plan_statistics */
  2    C1
  3  FROM
  4    T1
  5  WHERE
  6    C1 BETWEEN 90 AND 150;
        C1
--------------------------------------------------------------------------------
        90
        91
        92
...
       148
       149
       150
61 rows selected.

Statistics
--------------------------------------------------------------------------------
          0  recursive calls
          0  db block gets
         28  consistent gets
          0  physical reads
          0  redo size
       1029  bytes sent via SQL*Net to client
        378  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         61  rows processed

 

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  9babjv8yq8ru3, child number 0
BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;
NOTE: cannot fetch plan for SQL_ID: 9babjv8yq8ru3, CHILD_NUMBER: 0
      Please verify value of SQL_ID and CHILD_NUMBER;
      It could also be that the plan is no longer in cursor cache (check v$sql_plan)
--------------------------------------------------------------------------------




Determining Why a Query Using MIN(column) in the WHERE Clause on an Indexed Column takes a Long Time

6 12 2009

December 6, 2009

Some time ago the following question appeared in the comp.database.oracle.server Usenet group:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/79d0a3d85381dc64

I have a basic event table in my 10g database, primary integer key, and a nonnull timestamp “eventdate” field.  When executing the following command:

select min(eventdate) from events;

It can occasionally take a VERY long time.  There is a standard btree
index on “eventdate”, and roughly 20 other columns, a few of which
also have basic btree indexes.

In my current scenario, the table has less than a million records, and
the query seems to be taking upwards of 10 minutes!!!  Here’s what the
console says the plan is:

Operation       Object  Object Type     Order   Rows    Size (KB)       Cost    Time (sec)      CPU
Cost    I/O Cost
SELECT STATEMENT                        3                               4
 SORT AGGREGATE                         2       1       0.011
  INDEX FULL SCAN (MIN/MAX) EVENTS_EVENTDATE    INDEX   1       736195  7,908.345
4       1       28686   4

I’ve run the ADDM, it updated the stats, but that doesn’t seem to help
(i run it again it has no recommendations).  Am i doing something
silly with this table or is there some better way for me to phrase
this query?

I provided the following test case to demonstrate one possible reason for a MIN(column) query to require a longer time to execute than it should:

CREATE TABLE T1 (EVENTDATE TIMESTAMP NOT NULL);
CREATE INDEX T1_IND1 ON T1(EVENTDATE);
INSERT INTO
  T1
SELECT
  TRUNC(SYSDATE-3000)+ROWNUM/100
FROM
  DUAL
CONNECT BY
  LEVEL<=900000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE);

SELECT
  MIN(EVENTDATE)
FROM
  T1;

The DBMS Xplan:

 
---------------------------------------------------------------------------­--------------------
| Id  | Operation                  | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------­--------------------
|   1 |  SORT AGGREGATE            |         |      1 |      1 |      1 |00:00:00.01 |       3 |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_IND1 |      1 |    904K|      1 |00:00:00.01 |       3 |
---------------------------------------------------------------------------­--------------------

From the 10046 trace file:

PARSE #1:c=0,e=6954,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=5794606730
EXEC #1:c=0,e=107,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5794608342
WAIT #1: nam='SQL*Net message to client' ela= 7 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=5794608829
FETCH #1:c=0,e=91,p=0,cr=3,cu=0,mis=0,r=1,dep=0,og=1,tim=5794609296
WAIT #1: nam='SQL*Net message from client' ela= 10946 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=5794620734
STAT #1 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=3 pr=0 pw=0 time=93 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=32286 op='INDEX FULL SCAN (MIN/MAX) T1_IND1 (cr=3 pr=0 pw=0 time=55 us)'

In the above, Oracle used an INDEX FULL SCAN (MIN/MAX) to find the minimum using an optimization, requiring only 3 consistent reads (cr=3 and Buffers = 3), rather than having to scan all of the blocks in the index.  Oracle required 91/1000000 of a second to fetch the result.
Now, let’s see what happens when other sessions are involved.  In a second session:

DELETE FROM
  T1
WHERE
  EVENTDATE<SYSDATE-1000;

200,053 ROWS DELETED

(no commit performed)
Back to the first session to see how the query is affected:

SELECT
  MIN(EVENTDATE)
FROM
  T1;

The DBMS Xplan:

---------------------------------------------------------------------------­--------------------
| Id  | Operation                  | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------­--------------------
|   1 |  SORT AGGREGATE            |         |      1 |      1 |      1 |00:00:00.01 |       6 |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_IND1 |      1 |    904K|      1 |00:00:00.01 |       6 |
---------------------------------------------------------------------------­--------------------

From the 10046 trace file:

 
PARSE #1:c=0,e=140,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5929305668
EXEC #1:c=0,e=106,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5929307277
WAIT #1: nam='SQL*Net message to client' ela= 6 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=5929307728
FETCH #1:c=0,e=7540,p=0,cr=6,cu=0,mis=0,r=1,dep=0,og=1,tim=5929315743
WAIT #1: nam='SQL*Net message from client' ela= 1728 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=5929318030
STAT #1 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=6 pr=0 pw=0 time=7540 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=32286 op='INDEX FULL SCAN (MIN/MAX) T1_IND1 (cr=6 pr=0 pw=0 time=7501 us)'

Note that there are now 6 consistent reads, rather than 3, and that the elapsed time for the FETCH has increased from 91 to 7540.
In a third session:

DELETE FROM
  T1
WHERE
  EVENTDATE>SYSDATE+400;

559,947 ROWS DELETED

Back to the first session:

SELECT
  MIN(EVENTDATE)
FROM
  T1;

The DBMS Xplan:

---------------------------------------------------------------------------­--------------------
| Id  | Operation                  | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------­--------------------
|   1 |  SORT AGGREGATE            |         |      1 |      1 |      1 |00:00:00.01 |       6 |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_IND1 |      1 |    904K|      1 |00:00:00.01 |       6 |
---------------------------------------------------------------------------­--------------------

From the 10046 trace file:

PARSE #3:c=0,e=89,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6311133836
EXEC #3:c=0,e=167,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6311135482
WAIT #3: nam='SQL*Net message to client' ela= 6 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=6311135920
FETCH #3:c=0,e=565,p=0,cr=6,cu=0,mis=0,r=1,dep=0,og=1,tim=6311136874
WAIT #3: nam='SQL*Net message from client' ela= 1520 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=6311138894
STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=6 pr=0 pw=0 time=568 us)'
STAT #3 id=2 cnt=1 pid=1 pos=1 obj=32286 op='INDEX FULL SCAN (MIN/MAX) T1_IND1 (cr=6 pr=0 pw=0 time=530 us)'

No change, other than the elapsed time decreasing.
In the third session:

ROLLBACK;

INSERT INTO
  T1
SELECT
  TRUNC(SYSDATE-3001)+ROWNUM/100
FROM
  DUAL
CONNECT BY
  LEVEL<=2;

INSERT INTO
  T1
SELECT
  TRUNC(SYSDATE-3030)+ROWNUM/100
FROM
  DUAL
CONNECT BY
  LEVEL<=3000;

DELETE FROM
  T1
WHERE
  EVENTDATE>SYSDATE+4000;

Back to the first session:

SELECT
  MIN(EVENTDATE)
FROM
  T1;

The DBMS Xplan:

---------------------------------------------------------------------------­--------------------
| Id  | Operation                  | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------­--------------------
|   1 |  SORT AGGREGATE            |         |      1 |      1 |      1 |00:00:00.04 |     365 |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_IND1 |      1 |    904K|      1 |00:00:00.04 |     365 |
---------------------------------------------------------------------------­--------------------

From the 10046 trace file:

PARSE #7:c=0,e=54,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6738688175
EXEC #7:c=0,e=103,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6738689672
WAIT #7: nam='SQL*Net message to client' ela= 6 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=6738690122
FETCH #7:c=0,e=44496,p=0,cr=365,cu=0,mis=0,r=1,dep=0,og=1,tim=6738734984
WAIT #7: nam='SQL*Net message from client' ela= 1605 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=6738737134
STAT #7 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=365 pr=0 pw=0 time=44494 us)'
STAT #7 id=2 cnt=1 pid=1 pos=1 obj=32286 op='INDEX FULL SCAN (MIN/MAX) T1_IND1 (cr=365 pr=0 pw=0 time=44453 us)'

Note now that Oracle performed 365 consistent reads, and that the elapsed time has increased from 565 to 44496.  Actual time in the DBMS Xplan reports an increase from 0.01 seconds to 0.04 seconds.
You might take a look at the output of the following to help determine if Oracle is having to rollback uncommitted data to provide a consistent view of the data due to an in-process transaction:

SELECT
  *
FROM
  V$TRANSACTION;

SELECT
  *
FROM
  V$LOCK;

The original poster provided the following and sent to me a trace file for the query execution:

I tried your experiment, and achieved similar responses – i’m a complete novice and don’t know exactly how you got the 10046 trace output, but i did do the explain stuff:

---------------------------------------------------------------------------­----------
| Id  | Operation                  | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------­----------
|   0 | SELECT STATEMENT           |         |     1 |    11 |       3 (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |         |     1 |    11 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_IND1 |   901K|  9685K|       3 (0)| 00:00:01 |
---------------------------------------------------------------------------­----------

I think this is comparable, but please advise if not.
2nd question – my oracle web console tells me i’m running 10.2.0.1.0. I attempted to enable the plustrace role, and do as suggested, here’s what i got:
SQL> select min(eventdate) from events;

 
MIN(EVENTDATE)
---------------------------------------------------------------------------
21-JAN-08 04.51.45.525000 PM

Execution Plan
----------------------------------------------------------
Plan hash value: 116994577
---------------------------------------------------------------------------­---------------------
| Id  | Operation                  | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------­---------------------
|   0 | SELECT STATEMENT           |                    |     1 |    11 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |                    |     1 |    11 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| EVENTS_EVENTDATE   |   736K|  7908K|     4   (0)| 00:00:01 |
---------------------------------------------------------------------------­---------------------
Statistics
----------------------------------------------------------
        901  recursive calls
          0  db block gets
     118525  consistent gets
     118248  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         18  sorts (memory)
          0  sorts (disk)
          1  rows processed

Does this explain anything?  Here’s the same output using the “new” table, which for whatever reason is much better:

SQL> select min(eventdate) from t1;

MIN(EVENTDATE)
---------------------------------------------------------------------------
11-NOV-99 12.14.24.000000 AM

Execution Plan

----------------------------------------------------------
Plan hash value: 3630964933
---------------------------------------------------------------------------­----------
| Id  | Operation                  | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------­----------
|   0 | SELECT STATEMENT           |         |     1 |    11 |       3 (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |         |     1 |    11 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_IND1 |   901K|  9685K|       3 (0)| 00:00:01 |
---------------------------------------------------------------------------­----------
Statistics
----------------------------------------------------------
        676  recursive calls
          0  db block gets
        132  consistent gets
          6  physical reads
          0  redo size
        422  bytes sent via SQL*Net to client
        381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         22  sorts (memory)
          0  sorts (disk)
          1  rows processed

 In the above, notice the 118,525 consistent gets and the 118,248 for the original table compared to the 132 consistent gets and 6 physical reads for the test table that the OP created, which indicates that Oracle was having trouble immediately jumping to the index block containing the minimum value when performing the INDEX FULL SCAN (MIN/MAX) operation.

The OP provided the following follow up:

One followup question, that you’ll all likely laugh at.  I got into this “mess” by having a table that we frequently perform delete from TABLE where DATE < ? – this situation is going to continue to arise. Should i simply schedule frequent index rebuilds?  I understand partitioning the data is probably the way to go, but what is frequent rebuilds the simplest solution here (by simple, i mean least knowledge/testing/sql involved)?

Jonathan Lewis provided an excellent response to the OP.





Faulty Quotes 1 – OPTIMIZER_INDEX_COST_ADJ

6 12 2009

December 5, 2009

(Forward to the Next Post in the Series)

There are several initialization parameters, some of which are hidden (all hidden parameters begin with an underscore character and should not be changed without authorization from Oracle support), which control memory utilization, execution plan costing, latch spin behavior, maximum I/O size for multiblock reads from disk, etc. It is easy to fall into a pattern of blindly changing the initialization parameter without identifying the actual source of the problem for which the initialization parameter change is expected to magically correct. The approach of blindly changing the initialization parameters without understanding the scope/purpose of the parameters nor the source of the problem is apparently encouraged by various Oracle books, official looking web pages, and web discussion threads.

One such parameter that is frequently abused is OPTIMIZER_INDEX_COST_ADJ.  This parameter specifies the percentage of the calculated index cost to retain.  A value of 1 for OPTIMIZER_INDEX_COST_ADJ is a bad idea, as it not only makes index access paths appear to be 1/100 times as expensive (1% of the original cost) which will drive the use of indexes, but also potentially causes the wrong index to be used if two or more indexes have the same (rounded) calculated cost.

Examples quotes recommending low values for this parameter, even in 10g R1 and above:

praetoriate.com/t_op_sql_index_behavior.htm and
http://books.google.com/books?id=4xEozkbvxboC&pg=RA1-PA359#v=onepage&q=&f=false

“The optimizer_index_cost_adj parameter is an initialization parameter that can be very useful for SQL tuning. It is a numeric parameter with values from zero to 1,000 and a default value of 1,000.”

praetoriate.com/t_op_sql_index_behavior.htm

“If you are having slow performance because the CBO first_rows mode is favoring too many full-table scans, you can reset the optimizer_index_cost_adj parameter to immediately tune all of the SQL in your database to favor index scans over full-table scans.”

http://books.google.com/books?id=TuzniwcsNtUC&pg=PT771#v=onepage&q=&f=false

“Unfortunately, the optimizer might do more full table scans as a result of setting DB_FILE_MULTIBLOCK_READ_COUNT (you don’t want this behavior), so you may also need to set the OPTIMIZER_INDEX_COST_ADJ to a number, such as 10, to eliminate this problem and drive the use of indexes.”

http://books.google.com/books?id=omq9aRx8s0EC&pg=PT171#v=onepage&q=&f=false

“If the use of DB_FILE_MULTIBLOCK_READ_COUNT starts to cause too many full table scans (since the optimizer now decides it can perform full table scans much faster and decides to do more of them) then set OPTIMIZER_INDEX_COST_ADJ between 1 and 10 (I usually use 10) to force index use more frequently.”

oracle-training.cc/oracle_tips_opq.htm

“The default value for optimizer_index_cost_adj is 1,000, and any value less than 1,000 makes the CBO view indexes less expensive. If you do not like the propensity of the CBO first_rows mode to favor full-table scans, you can lower the value of optimizer_index_cost_adj to 10, thereby telling the CBO to always favor index scans over full-table scans.”

http://books.google.com/books?id=bxHDtttb0ZAC&pg=PA566#v=onepage&q=&f=false

“The most important parameter is the optimizer_index_cost_adj, and the default setting of 100 is incorrect for most Oracle systems.  For OLTP systems, resetting the parameter to a smaller value (between 10 and 30) may result in huge performance gains as SQL statements change from large-table full-table scans to index range scans.”

http://willgreene.net/siebel/Siebel%20Knowledgebase/Performance%20Tuning%20Guide%20for%20Siebel%20on%20Oracle.pdf

“OPTIMIZER_INDEX_COST_ADJ – Controls the access path selection to be more or less index friendly.  Recommended Value = 1

For Oracle 9i CBO setting OPTIMIZER_INDEX_COST_ADJ = 1 is imperative. This will set the optimizer goal for best response time (versus best throughput). Incorrect setting may cause the optimizer to favor full-table scans instead of index access.
• For Oracle 10g/11g CBO setting OPTIMIZER_INDEX_COST_ADJ = 1 is also recommended, although the default setting of 100 will deliver good results in most cases. It is important to understand that in-house tuning of Siebel CRM application was performed with OPTIMIZER_INDEX_COST_ADJ = 1 setting. This means that customers who want to implement OPTIMIZER_INDEX_COST_ADJ = 100 on Oracle 10g/11g will need to allocate extra development time for additional tuning that may be necessary.
Under no circumstances OPTIMIZER_INDEX_COST_ADJ parameter should be set to values other than 1 (Siebel recommended) or 100 (Oracle default on 10g/11g).”

 Below is a link to a test case, which shows that even when retrieving 0.06% of the rows from a 100 million row table it is potentially faster when a full table scan is used, rather than an index range scan (the index clustering factor was very high):
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/6c436cee329326ec

Before experimenting with this parameter take a look at the following:
http://books.google.com/books?id=b3DIkYO2gBQC&pg=PA185#v=onepage&q=&f=false

http://richardfoote.wordpress.com/2009/07/08/the-cbo-and-indexes-optimizer_index_cost_adj-part-i/

http://richardfoote.wordpress.com/2009/07/22/the-cbo-and-indexes-optimizer_index_cost_adj-part-ii/

http://richardfoote.wordpress.com/2009/08/20/the-cbo-and-indexes-optimizer_index_cost_adj-part-iii/

http://forums.oracle.com/forums/thread.jspa?messageID=3917291 (Joze Senegacnik)

“The parameter optimizer_index_caching is considered by CBO for IN list and NESTED LOOP operations and should be set. The optimizer_index_cost_adj could be very dangerous unless you really know what you are doing, especially when system statistics is used. It was introduced in 8i, but in later releases one should use system statistics to tell CBO what is the timing difference between single block and multiple block I/O. Jonathan Lewis has written about this on his site.

What I really hate is that someone says: set this parameter to a certain value without considering what kind of system is that and what is even more important: without considering what are the consequences of such setting. It is like taking a wrong medicine.

My personal opinion regarding optimizer_index_cost_adj is that one should leave it at 100 (default) and if you really understand the mechanism behind then you may experiment with it. Personally I would use it only at statement level by using OPT_PARAM hint if this would be really necessary. This way you don’t make a system wide change.”








Follow

Get every new post delivered to your Inbox.

Join 142 other followers