June 8, 2011
I have not had published a “What is Wrong with this Quote” blog article for a couple of months, so I thought that I would publish a few of those types of articles.
The first quote is from the Alpha copy of the book “Oracle 11g Performance Tuning Recipes”, recipe 5-7 Resolving Log File Sync Waits. As always, I try to minimize the amount of text that I post for a quote, but a sufficient amount so that the context of the quote is not completely lost. The quote:
“As the log file sync wait event is caused by contention caused by the LGWR process, see if you can use the NOLOGGING option to get rid of these waits. Of course, in a production system, you can’t use the NOLOGGING option when the database is processing user requests, so this option is of limited use in most cases.”
“Although the database automatically calculates the value of the log_io_size parameter, you can specify a value for it, by issuing a command such as the following:
SQL> alter system set "_log_io_size"=1024000 scope=spfile; System altered. "
What, if anything, is wrong with the above quote?
Hint for the first half of the quote: AskTom, Documentation
Hint for the second half of the quote: The scale for the _log_io_size parameter is NOT bytes. This article may also be helpful.
The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.
I see that several people have started working on a response for this article, having visited the AskTom and Documentation link. How about another hint for the second half of the quote, looking at the value of that parameter for Oracle Database 10.2.0.2 and 11.2.0.2:
10.2.0.2:
11.2.0.2:
Without checking your reference, I’d say we have still redo-information. I’d expect all the redo needed for UNDO TS/blocks is not affected by NOLOGGING at all?
For the 2nd part: as you showed this parameter as “redo blocks” (512 or 1024 byte, I guess; depends on OS?) I’d say 1/2 or 1GB is big. I just did not find a reference to the size of _log_io_size in the quote – so why did you ask ?
Now I’m going to read your links 🙂
Martin,
I think that there is still more to say about the second half, and the first half for that matter. Thanks for setting the foundation upon which other people will likely add.
I think that I have about half of a page of typed review notes for recipe 5-7 (this is NOT a good sign) – I hope that this section of the book is still a work in process. Regarding the “redo block” size, yes, although I thought that I heard that the “redo block” size was set to change soon to 4096 bytes. Reference to the book “Oracle Wait Interface: a Practical Guide to Performance Diagnostics & Tuning” regarding this parameter:
http://books.google.com/books?id=14OmJzfCfXMC&pg=PA137#v=onepage&q&f=false
True, the book (in this case) did not tell the reader to modify a hidden initialization parameter without warning – it described a non-existent parameter, and then demonstrated why people should not just modify those parameters on a whim. 🙂
Using just remnants of what I once (2001) learned from ixora.co.au:
> As the log file sync wait event is caused by contention caused by the LGWR process,
Wrong. This event is caused by either a session issuing a commit, or DBWR wanting to checkpoint a block. LGWR just does the work.
> see if you can use the NOLOGGING option to get rid of these waits.
Wrong. See above – the rate of redo generation has nothing to do with ‘log file sync’ waits. Rate of commits does. Place open your database in read-only mode, and you will get rid of these (and many other) events.
Unclear: what does “use the NOLOGGING option” mean? Is this explained in the surrounding text/chapters?
>Of course, in a production system, you can’t use the NOLOGGING option when the database is processing user requests,
Unclear: why not?
>so this option is of limited use in most cases
That might be true, depending on the value of “most” and the definition of “use”. If “use” is “get rid of log file sync waits”, then “most” should be 0 for this to be correct.
And does it say why one might want to get rid of ‘log file sync’ waits? These are pretty harmless waits in most (>>0) cases, I think.
Am I missing the point of the post again? 😉
Cheers,
Flado
Flado,
Well stated, and combined with the comment provided by Martin probably 90% of the thoughts that ran through my mind when I read these two quotes have been covered. The “NOLOGGING option” was not defined in the book, so I made an assumption that the authors intended the “NOLOGGING” keyword in the CREATE TABLE and INSERT INTO … SELECT statement found in this comment:
https://hoopercharles.wordpress.com/2011/06/07/vsessions-row_wait_obj-column-shows-the-data_object_id-verify-the-documentation-challenge/#comment-3420
It suprised me a bit to see the session waiting on “log buffer space” and “log file switch completion” wait events considering that this particular 11.2.0.2 test database is running with:
Why can’t I use that NOLOGGING keyword when other users are adding or updating data in the same database – a database typically has more than one table, right? Finally, is it the database that processes the user requests, or is a database a set of files on disk?
Why would it surprise you that a NOARCHIVELOG database can experience these two waits? Log buffer operations have nothing to do with log file archiving. “log buffer space” only means that there is currently no space in the buffer, so the session has to wait for LGWR to flush it. “log file switch completion” is a variant of “log buffer space” specifying the reason why there’s no space in the buffer (because LGWR is busy switching the log file). In turn, there are two variants of “log file switch completion”, detailing just why it takes so long to switch the file – “log file switch completion (checkpoint incomplete)” and “log file switch completion (archiving needed)”. Of all these, it’s only the last one I would be surprised to see in a NOARCHIVELOG database. In your test, session 1 was generating redo at an alarming rate, so I would consider it normal for it to hit those two events from time to time.
http://www.ixora.com.au/notes/
This is the best explanation of redo internals I have ever seen. Ten years ago (don’t believe the timestamp on that page – The Wayback Machine has crawled it as early as Jan 2002) these notes made me realize that there is no mystery in how Oracle works and it is just a program built using the same building blocks as the programs I was writing at the time. I’d buy Steve Adams a beer were I to ever meet him 🙂
Cheers,
Flado
Flado,
There is always the chance that I am misunderstanding something (or overlooking an important detail).
http://download.oracle.com/docs/cd/E18283_01/server.112/e10578/tdpdw_bandr.htm#BFGIJDAH
So, a NOLOGGING operation should generate minimal redo, and that is why I was surprised to see “log buffer space” and “log file switch completion” waits. Did I by chance incorrectly specify NOLOGGING, this is the SQL statement (did I need an APPEND hint to make the NOLOGGING specication have an effect?):
The AskTom article linked at the start of this blog article includes the following quote:
As I stated, there is always the chance that I am misunderstanding something (or overlooking an important detail).
Yes, I think you need /*+ append */ if you want the absolute minimum of redo generated (that for moving the HWM and associated undo, plus eventual delayed block cleanouts on the dictionary tables). Conventional insert with nologging still generates undo which needs redo.
Or I may have overlooked yet another detail and you did generate just a tiny bit of redo but had the bad luck of filling up the last buffer before the switch
Anyway, if you repeat the test (just Session 1) with autotrace on, it should indeed be easy to see.
Cheers,
Flado
Flado,
Good suggestion. See the script below and the output… “ORA-12838: cannot read/modify an object after modifying it in parallel” triggered by AUTOTRACE. 🙂
Duplicate previous script:
Duplicate previous script with APPEND hint:
The output will follow.
Duplicate previous script output:
Duplicate previous script with APPEND hint output:
Yeah – generating redo at an alarming rate (3.7 GB per execution, if I’m counting correctly on the phone) 🙂 Or am I missing something again?
I vaguely remember that a nologging operation would generate something called “invalidation redo” in a way that a direct load wouldn’t, but that may be a myth or an obsolete state of affairs. Do you know something about that?
Flado,
Yes, roughly 3.7GB, decreased to roughly 2MB with the APPEND hint.
I do not know enough about “invalidation redo” to answer yes or no, so I did a bit of a Google search.
A quick check of the test table and another table in the database:
One of the first threads that mentioned “invalidation redo”:
http://www.freelists.org/post/oracle-l/Tracking-down-NOLOGGING-objects,9
Riyaj Shamsudeen
How to find objects creating nologging changes?
http://orainternals.wordpress.com/2008/09/11/how-to-find-objects-creating-nologging-changes/
Riyaj Shamsudeen
REDO INTERNALS AND TUNING BY REDO REDUCTION
Click to access riyaj_redo_internals_and_tuning_by_redo_reduction_doc.pdf
Riyaj Shamsudeen
Debunking the myths about redo, undo, commit & rollback
Click to access riyaj_debunking_myths_about_redo_ppt.pdf
http://download.oracle.com/docs/cd/E11882_01/server.112/e16541/parallel007.htm