Log File Sync – What is Wrong with this Quote?

16 06 2010

June 16, 2010

While reading the June 2010 printing of the “Oracle Performance Firefighting” book I found an interesting paragraph that left me wondering if something is missing.  This quote is from page 61 (this is a long quote, required to not lose the context of the discussion topic – please excuse any typos):

“As another example, suppose that you receive a call about long commit times. The users are saying ‘submits’ or ‘saves’ are taking longer today than yesterday. Upon an examination of the average log file sync wait times (which indicate commit times from an Oracle perspective), you discover there is no change from the previous day. Therefore, you know the performance issue is not because of Oracle’s commit mechanism and most likely not related to the database.”  [This period marks the end of Part 3, "IO Wait Time Summary with Event Details"]

What, if anything, is wrong with the above quote from the book? 

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true.  It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past).  If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.


Actions

Information

9 responses

16 06 2010
Frits Hoogland

Averages hide outliers. This means that if there are long taking commits, but about the same amount of commits which take much shorter time, the average can be the same (of a previous day with satisfactory commit times). It’s like the analogy of one foot in a bath with boiling water, and the other one in a bath with ice. The average temperature of the two is convenient, so it must be convenient.

The text also seems to steer away from measuring, whilst that will be the only way to determine if something is the matter.

16 06 2010
Centinul

The author may be correct in that it may not be a problem with the “commit mechanism.” What would stop this from being a problem with any other Oracle mechanism? Maybe the execution plan changed between the time when the system was performing acceptably and when it wasn’t. I would suspect that there are a number of possible causes that would ultimately be perceived in the same way by the users.

This particular example is a perfect reason why end-to-end application tracing is necessary. How do we even know that the problem is database related or not? Does the time spent in the database (“DB Time”) account for a significant portion of the overall time spent? If so, more investigation by the DBA team is necessary.

16 06 2010
Charles Hooper

Great comments so far.

A parallel question, does:
User reported “Save” = COMMIT = log file sync wait?

I think that Centinul’s response is on the verge of answering the above parallel question.

16 06 2010
Chris_c

How likely is it the save button on the Application is just performing a commit? most likely its taking the changes the user made inserting/updating/deleting rows then commiting all tied to a single button press the commit here may be the smallest part of the operation itself.

16 06 2010
John Hallas

Apart from a techical discussion around the commit mechanism I would be very worried about any statement that start off “You know ” and ends with “… not related to the database”.
I would also want to understand the metrics and evidence behind ‘long commit times’. Is this a user perception or is committ taking longer than yesterday.

Presumably you have given something away with the title of the blog being ‘Log file sync’ and the evience of problems would this a leading wait event.

John

16 06 2010
kevinclosson

I agree with Chris_c. The nebulous terms “submit” and “save” may in fact relate to more database work than one would presume in the servicing of EXEC SQL COMMIT;

“Oracle’s commit mechanism” is a bit of a nickname nickname unless that has been defined elsewhere in the book. The schematic of a COMMIT is pretty complex and includes a lot of IPC and latching.

We are probably splitting hairs on this one.

17 06 2010
Charles Hooper

I thought that I would share the thoughts that I had about the quoted paragraph at the time I read it in the book. The comments so far in this blog article are very much on target, and in many cases are either more detailed or better thought through than what immediately crossed my mind when first seeing the paragraph.

The first thought was that the author was tired at the point when this paragraph was written and was looking forward to moving on to the next section, with the intention of looping back to this section of the book to fill in the missing details (I used this approach when working on the “Expert Oracle Practices” book). As it stands, the paragraph reads almost as if it is a DBA washing his hands of a problem and pushing the problem back into the corner of the network administrator or some other person in the company – because the DBA is claiming that because the COMMITs are fast it cannot possibly be a problem with the database server.

This brings me to the parallel question that I posed: is a user reported save (or submit) equal to an application COMMIT, and is a COMMIT the only thing that causes a ‘log file sync’ wait? A quick test to see if COMMITs are the only cause of ‘log file sync’ waits. The test script, after connecting to the database, drops an existing test table, creates a new test table, inserts 100 rows into the test table, and then issues a ROLLBACK:

DROP TABLE T1;
 
CREATE TABLE T1 AS
SELECT
  ROWNUM R
FROM
  DUAL;

SELECT
  SUBSTR(EVENT,1,15) EVENT,
  TOTAL_WAITS
FROM
  V$SESSION_EVENT
WHERE
  EVENT='log file sync'
  AND SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1);
  
EVENT           TOTAL_WAITS
--------------- -----------
log file sync             2

INSERT INTO
  T1
SELECT
  ROWNUM+1 R
FROM
  DUAL
CONNECT BY
  LEVEL<=100;
 
SELECT
  SUBSTR(EVENT,1,15) EVENT,
  TOTAL_WAITS
FROM
  V$SESSION_EVENT
WHERE
  EVENT='log file sync'
  AND SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1);
  
EVENT           TOTAL_WAITS
--------------- -----------
log file sync             2
 
ROLLBACK;
 
SELECT
  SUBSTR(EVENT,1,15) EVENT,
  TOTAL_WAITS
FROM
  V$SESSION_EVENT
WHERE
  EVENT='log file sync'
  AND SID=(SELECT SID FROM V$MYSTAT WHERE ROWNUM=1);
  
EVENT           TOTAL_WAITS
--------------- -----------
log file sync             3

The test shows that the implicit COMMIT for DDL, when dropping the table and creating the table, caused a ‘log file sync’ wait (impressive, but you knew that). The ROLLBACK also caused the log file sync wait (I wonder if the author considered this?).

Is a user reported ‘Save’ the same thing as a COMMIT? Two things crossed my mind when I thought about that:
1. I have written several programs that interact with Oracle databases, and the programs often have something similar to a ‘Save’ option that a user might select. What frequently happens when the Save command is executed? A transaction in the database is explicitly started (the programming environment would otherwise perform an implicit COMMIT after all DDL commands); data to be ‘Saved’ is validated which might involve selecting from various database tables or performing lookups in text files; rows in tables are deleted, inserted, and/or updated; if no errors/exceptions are reported in previous operations a COMMIT is executed otherwise a ROLLBACK is executed. The COMMIT is a very small portion of the total ‘Save’ operation.
2. I recalled reading in Tom Kyte’s “Expert Oracle Database Architecture” book something about the actual COMMIT operation being very fast compared to all of the steps leading up to the execution of the COMMIT. I think that it was later 2005 or early 2006 when I read that book, so I probably forgot several of the details.

I have pulled my second copy of the “Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions” off the bookshelf (which at the moment looks a lot like a coffee table). From pages 296 and 297:

“So, why is a COMMIT’s response time fairly flat, regardless of the transaction size? Before we even go to COMMIT in the database, we’ve already done the really hard work. We’ve already modified the data in the database, so we’ve already done 99.9 percent of the work…

As you can see, there is very little to do to process a COMMIT. The lengthiest operation is, and always will be, the activity performed by LGWR, as this is physical disk I/O. The amount of time spent by LGWR here will be greatly reduced by the fact that it has already been flushing the contents of the redo log buffer on a recurring basis…”

Much more may be found in Tom’s book related to why the actual COMMIT time is the wrong thing to measure.

Finally, averages hide details. What if the user is not doing exactly what she was doing yesterday? Maybe her session performed a greater percentage of ROLLBACK operations yesterday than COMMIT operations. (Pages 298-302 of Tom’s book describe why that might matter.)

24 06 2010
joel garry

I’m not following your #1 about “A transaction in the database is explicitly started (the programming environment would otherwise perform an implicit COMMIT after all DDL commands).” You’re saying DDL doesn’t do a commit before and after the DDL, regardless?

24 06 2010
Charles Hooper

Joel,

Thank you for catching that typo – DML (data manipulation language) was intended, not DDL (data definition language). Unless a transaction is explicitly started, some programming environments will perform an implicit commit after every DML statement. If someone is performing DDL in a typical ‘Save’ operation something is probably very wrong.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 142 other followers

%d bloggers like this: