Optimizer Costing 2 – What is Wrong with this Quote?

6 12 2010

December 6, 2010

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

I recently reviewed the book “Oracle Tuning the Definitive Reference Second Edition”, and did not provide an in-depth technical review of the entire book.  This blog article series will dig into some of the pages that were not specifically included in the review.  What is wrong with the following quote from page 527 of the book (this is an atypically long quote – the context would be lost if either of the two paragraphs were removed):

“In general, the Oracle CBO is unaware of buffer details, except when the optimizer_index_caching parameter is set where using multiple data buffers will not impact SQL execution plans.  When data using the new cpu_cost reported metric, the Oracle SQL optimizer builds the SQL plan decision tree based on the execution plan that will have the lowest estimated CPU cost.  For example, if a 32k data buffer is implemented for the index tablespace, the DBA can ensure that the indexes are cached for optimal performance and minimal logical I/O in range scans.

For example, if a database has 50 gigabytes of index space, a 60-gigabyte db_32k_cache_size can be defined and then the optimizer_index_caching parameter can be set to 100, telling the SQL optimizer that all of the Oracle indexes reside in RAM.  When Oracle makes the index versus table scan decision, knowing that the index nodes are in RAM will greatly influence the optimizer because the CBO knows that a logical I/O is often 100 times faster than a physical read.”

What, if anything, is wrong with the above quote from the book?

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true.  It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past).  If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.

As I stated in a comment in the earlier blog article, I would like to clarify that my review on the Amazon site is not intended to suggest that people should not buy the book. As the publisher’s website states that the book is written for senior Oracle DBAs, I suggest that senior DBAs, Oracle Certified Masters, and other people who are able to research the book’s contents *should* buy the book and post reviews of the book that highlight specific content of the book indicating whether or not that content is correctly stated. Such feedback will be a benefit to the Oracle community, and could help to improve Oracle Database books that are published in the future. I think that we need to keep a positive outlook in order to see things improve.

A Google book search indicates that a slightly modified version of the statement appeared in a couple of  other books:

Other pages found during a Google search of the phrase:

  • dba-oracle.com/oracle_tips_multiple_blocksizes.htm
  • rampant-books.com/t_oracle_improve_sql_execution_speed.htm

Related Oracle Database documentation:


Actions

Information

4 responses

6 12 2010
Martin Berger

Well, where to start?
* First it’s really hard to say for sure the CBO is unaware of buffer details. Even I’m quite sure this is the true (all my observations reflected this) I have never read the corresponding code or got a statement by an Oracle representative who is aware of CBO internals.
If we consider the CBO IS unaware of buffer details, it is ALLWAYS unaware, regardless the setting of optimizer_index_caching.
* data does not use the cpu_cost metric (my data does not do anything it is passive all the time). But even if the CPU cost model is used (which is default in 10g+) the CBO does not search for the lowest estimated CPU cost. The CPU cost model takes the performance of the IO subsystem into acount also. [1]
* (afaik) there is NO way to tell the optimizer anything about RAM or the cache! Even excessive use of db_32k_cache_size will only make the index blocks be in the cache more likely (no word about the initial ‘load_it_into_the_cache’ run?) and optimizer_index_caching tells the optimizer to calculate the index extremely cheap. No bit of info about RAM or cache.
* in the first sentence it was claimed the CBO does not know anything about buffers, so how should the CBO know anything about the difference of physical and logical I/O as clamed in the last sentence?
I hope I did not introduce to many errors here.

[1] Troubleshooting Oracle Performance – p111

6 12 2010
Charles Hooper

Martin,

A very thorough answer, and you included a couple of items that I had not considered (or had not worked out in the same level of detail).

I believe that there are still a few remaining problems with the quote.

6 12 2010
Magnus Fagertun

From the documentation: OPTIMIZER_INDEX_CACHING lets you adjust the behavior of cost-based optimization to favor nested loops joins and IN-list iterators.
I see two main problems with this:

I see two main problems with this

1) Yes, the indexes will be cached, but if you are doing index-lookup and then table access on disk the nested loops will probably go slow
2) If there ever was, now there are no true OLTP systems left. You are either doing reporting based on live data, or you are moving the data into a data warehouse. For this you need large efficient scans of tables/indexes, you will not get this by nested loop joins.

I also wonder what this setting will do if combined with system statistics… Why not leave it at default? And why do you want to fragment the buffer cache by setting several block sizes?

Magnus

8 12 2010
Charles Hooper

I tried to be very careful when quoting the book. One of the problems with the quoted section is the English syntax used in the quoted section of the book. It almost seems that the author is suggesting to only modify the OPTIMIZER_INDEX_CACHING parameter when the DBA has failed to affect the execution plans by implementing multiple block sizes in the database – but I am not sure exactly what is intended by the first sentence.

Another potential problem is the author’s apparent suggestion that ALL index blocks should be fully cached in memory, and located in a 32KB block size tablespace. Sure, just throw 60GB of memory to cache 50GB worth of indexes. While we are at it, let’s thrown another 200GB at the buffer cache that will be used by the table blocks and the indexes in the SYS schema (or were we supposed to move those also?). If we scale this up to a much larger database, should we do the same if we have 2TB of space occupied by the indexes?

Leave a reply to Charles Hooper Cancel reply