January 4, 2012
(Back to the Previous Post in the Series)
I thought that I would begin this blog article with a bit of a test case that demonstrates a very simple deadlock between two sessions.
In Session 1 (create the test table with a primary key and insert a single row):
CREATE TABLE T1 ( C1 NUMBER PRIMARY KEY, C2 VARCHAR2(10)); INSERT INTO T1 VALUES (1,'1');
In Session 2 (insert a row that does not conflict with session 1 and then insert a row with the same primary key value as the uncommitted row in Session 1):
INSERT INTO T1 VALUES (2,'2'); INSERT INTO T1 VALUES (1,'3');
(Session 2 hangs while waiting for Session 1 to COMMIT or ROLLBACK)
In Session 1 (insert a row with the same primary key value as the uncommitted row in Session 2):
INSERT INTO T1 VALUES (2,'2');
(Session 1 hangs while waiting for Session 2 to COMMIT or ROLLBACK)
In Session 2 (an ORA-00060 is returned, deadlock between the two sessions ends):
INSERT INTO T1 VALUES (1,'3') * ERROR at line 1: ORA-00060: deadlock detected while waiting for resource
(Session 1 still hangs while waiting for Session 2 to COMMIT or ROLLBACK the first row that was inserted)
In Session 2 (confirming that the session was not killed/terminated and that the first row inserted still exists – indicates that the current TRANSACTION was not rolled back, just the statement that was involved in the deadlock):
SELECT * FROM T1; C1 C2 ---------- ---------- 2 2
While recently reading an OTN thread I was reminded of a bit of information that several sources simply miss-state about Oracle deadlocks. The incorrect information is repeated, seemingly quite frequently, in various books and forum threads. As luck would have it, several of the books that I have read about Oracle Database also repeat this incorrect information. For instance, a quote from page 352 of the book “Expert Oracle Database 11g Administration” (a book that I reviewed and gave 4 out of 5 stars, maybe I need to re-evaluate that rating?):
“This is a catch-22 situation, because the stalemate can’t be broken by either session unilaterally. In such circumstances, Oracle steps in, terminates one of the sessions, and rolls back its statement. Oracle quickly recognizes that two sessions are deadlocked and terminates the session that issued the most recent request for a lock. This will release the object locks that the other session is waiting for.”
A second book, “Secrets of the Oracle Database” (another book that I reviewed and gave 4 out of 5 stars, actually it was probably worth about 4.4 stars), states the following on page 57:
“Hence, the ORACLE DBMS detects circular chains pertaining to interdependent locks, signals the error ‘ORA-00060: deadlock detected while waiting for resource’, and rolls back one of the sessions involved in the would-be deadlock.”
A third book, “Oracle SQL Recipes” (another book that I reviewed and gave 4 out of 5 stars), states the following on page 217:
“The immediate solution to the problem requires no user intervention: Oracle detects the deadlock and automatically rolls back one of the transactions, releasing one of the sets of row locks so that the other session’s DML will complete.”
A fourth book, “Oracle Database 11g Performance Tuning Recipes” (another book that I reviewed and gave 2 out of 5 stars), states on page 227:
“While the database can automatically resolve deadlocks between sessions (by killing one of the sessions holding a needed lock)…”
The Google book search found that the above four books were not alone in their statements of deadlocks that do not match the test case results at the start of this blog article. One of the books found by Google is one that I have not read yet (and probably never will), “Oracle PL/SQL for Dummies”, which states the following on page 298:
“This is a deadlock. Oracle resolves it by raising a deadlock exception (ORA-00060: deadlock detected while waiting for resource) in user 1’s session. This terminates the procedure and allows user 2’s procedure to successfully complete.”
A second book found by the Google search is a bit old, “Oracle 9i DBA 101”, which on page 194 states:
“A deadlock occurs when two or more users wait for data that is locked by each other. Oracle will kill the first server process that detects the deadlock, so you should always look for and fix code that can cause deadlocks.”
There are certainly many other statements in other books regarding deadlocks, both correct and incorrect. A quick search through the OTN forums found several threads in addition to the one at the start of this blog article that also drifted a bit from an accurate picture of what happens when a deadlock is broken in Oracle Database.
https://forums.oracle.com/forums/thread.jspa?threadID=2196282
“That is the deadlock. Oracle will then immediately choose arbitrarlly a victim session (542 or 548) and kill its process letting the remaining session continuing its work.”
https://forums.oracle.com/forums/thread.jspa?threadID=2152646
“There should be a simple wait, not a deadlock which would killed one of the session.”
https://forums.oracle.com/forums/thread.jspa?threadID=2261367
“If two sessions are actually deadlocked, Oracle should always kill one of the blocked sessions, raising an ORA-00060 error. It sounds like you’re saying this is not the expected behavior.”
https://forums.oracle.com/forums/thread.jspa?threadID=2227420
“The two session are inserting records in the same table with a unique index on it. With the same database version on another server and the same treatment, the deadlocks are correctly handled (one session is killed with ora-00060) and the process can continue.”
https://forums.oracle.com/forums/thread.jspa?threadID=1072665
“In oracle, when session1 and session2 make a deadlock, the oracle will kill one session , but I want know how to let oracle kill session1 and session2 both, is there way to do that?”
https://forums.oracle.com/forums/thread.jspa?threadID=2169563
“NO. Oracle already broke the deadlock by terminating one of these sessions. By the time trace file was closed, the deadlock no longer existed.”
https://forums.oracle.com/forums/thread.jspa?threadID=953308
“Oracle itself resolve the deadlock by killing the session.”
—
While reading the book “Oracle Core Essential Internals for DBAs and Developers” I almost flagged a sentence in a discussion of deadlocks as potentially containing an error. That statement is found on page 82 near the end of the third to the last paragraph on the page. After reading that paragraph about 10 times I decided that the statement is probably specific to TM locks, and probably is correct (the thought of TX locks maintained at the individual block level that contains the table rows kept running through my head the first nine times I read the paragraph). This section of the book is interesting, as it not only avoids the mistakes mentioned in the above books, but it also highlights an inaccurate statement or two found in the Oracle Database documentation.
I wonder if the misconception arises because of the way some client applications handle the error. A badly coded application which crashes on receiving the ORA-00060 usually because there is poor or no error handling could lead to the assumption that oracle has killed the session rather than the client application crashing. We have a wonderful third party application here that on any ORA-* message crashes and dumps a stack trace, as the trace then shows the crash occurred during a database call its the DBA’s fault not the developers.
Chris,
I think that you are right – it is a bit of a mistaken cause and effect relationship, where the application’s reaction to the ORA error is missing element.
Charles,
In the third OTN link you’ve listed, I gave the following answer
”
When deadlock occurs Oracle will not roll back the entire transaction. It will only do a rollback of one of the statements (belonging to one of the sessions) involded in the deadlock. The other session remains blocked if the session for which one of the statement has been rolled back by Oracle will not commit or rollback.
In other words, in the Justin Cave example, Session 1 will remains blocked if Session 2 will not issue a rollback after it has received the deadlook error
So, the OP should verify that his application must forsee a rollback in case of exception (and the deadlock error could be one of those exceptions)
Hope this helps
”
.
Mohamed,
What you quoted above is very well worded – and matches the test case results at the start of this article. Thank you for posting the quote.
There is also a misconception about oracle choosing ARBITRARILY one of the two sessions involved in the deadlocks and rollback one of its statement. The choosen session is not done arbitrarily but it seems that it is the session that start first waiting
http://jonathanlewis.wordpress.com/2011/08/29/deadlock-2/
Another piece of the puzzle. Jonathan mentioned something similar in the Oracle Core book – and his writing style made it seem so obvious that it works as he indicated, that I had a hard time understanding why I did not understand the reasoning earlier. 🙂
Nice. Thank you for sharing.
Hi Chalres, But the “enq: TM contention” did’nt killed the session for me:
session1:
session2:
(session2 hangs here)
session3:
pvh,
It was a good idea to try a deadlock involving a TM enqueue. You can find another interesting example of a TM enqueue here:
https://hoopercharles.wordpress.com/2010/01/07/deadlock-on-oracle-11g-but-not-on-10g/
Please keep in mind that a TM enqueue will not necessarily lead to a deadlock, and that a deadlock involving a TM equeue will not kill the sessions that ends up being the deadlock “victim” (this might not have been clear because of all of the incorrect quotes from the various books).
I had a little bit of trouble when I tried to recreate your test case:
In Session 1:
In Session 2:
I could not create the DEPT table in Session 2 because Session 1 was already a TX lock on the EMP table. We need to change the order of the commands a little.
Fixing up the order of the commands and modifying the script to actually trigger a deadlock:
In Session 1:
In Session 2:
In Session 1:
In Session 2:
(Session 2 is hung)
In Session 1:
Session 2 Shows:
(Session 1 is still hung)
In Session 2:
The above shows that Session 2 was still connected to the database, and that the first row inserted by Session 2 was not rolled back by an automatic transaction rollback (there was an automatic statement-level rollback).
This is the deadlock graph that was written to the trace file:
Thanks for the inputs Charles, I will check that out.