Statspack/AWR Report Resources

December 14, 2009

On July 15, 2009, the following request appeared on the OTN forums:
http://forums.oracle.com/forums/thread.jspa?threadID=929170

Please provide me with a good link for understanding the AWR report.

One of the responders in the thread provided a link to the following document:
http://mhprofessional.com/downloads/products/0072263059/0072263059_ch14.pdf

– –

I spent about 10 minutes (OK, maybe 30-45 minutes) skim-reading the 71 pages in that chapter. I found several items in the chapter that made me ask: is that correct? The dents in my desk from my forehead indicate that the author probably should have had someone else proofread the chapter. As the book was published in 2007, it should have been written to cover Oracle 10g R2 (10.2.0.x) – it did mention Oracle 10g R2. While skim-reading I found the following interesting quotes, which I posted in that OTN thread without explaining why I selected them (I thought someone would ask):

#1 Page 23:

“Some DBAs (usually those trying to sell you a tuning product) minimize the importance of hit ratios (proactive tuning) and focus completely on waits (reactive tuning), since focusing on waits is a great way to quickly solve the current burning problems. By monitoring the Instance Efficiency section (and using all of STATSPACK and Enterprise Manager), the DBA will combine reactive and proactive tuning and will find some problems before the users scream or wait events hit the top 5 list. Hit ratios are one important piece of the puzzle (so are waits).”

#2 Page 24:

“Hit ratios are a great barometer of the health of your system. A large increase or drop from day to day is an indicator of a major change that needs to be investigated.”

#3 Page 27:

“Try to cache small tables to avoid reading them into memory over and over again. Locate the data on disk systems that have either more disk caching or are buffered by the OS file system cache. DB_FILE_MULTIBLOCK_READ_COUNT can make full scans faster (but it could also influence Oracle to do more of them).”

#4 Page 42:

“Consistent gets: The number of blocks read from the buffer cache for queries without the SELECT FOR UPDATE clause. The value for this statistic plus the value of the “db block gets” statistic constitute what is referred to as logical reads (all reads cached in memory). These are usually the CURRENT version of the block, but it can also be a Consistent Read (CR) version.”

#5 Page 42:

“Hit Ratio = (Logical Reads – Physical Reads) / Logical Reads” “The buffer hit ratio should be above 95 percent. If it is less than 95 percent, you should consider increasing the size of the data cache by increasing the DB_CACHE_SIZE initialization parameter (given that physical memory is available to do this).”

#6 Page 43:

“If chained rows are indicated, the problem needs to be fixed as soon as possible. Chained rows can cause severe degradation of performance if a large number of rows are chained.”

#7 Page 44:

“The parameter that can be set in the init.ora to help improve the read time is the DB_FILE_MULTIBLOCK_READ_COUNT parameter, which controls the number of blocks that can be read in one I/O when a full table scan is being performed. This can reduce the number of I/Os needed to scan a table, thus improving the performance of the full table scan. Unfortunately, the optimizer might do more full table scans as a result of setting DB_FILE_MULTIBLOCK_READ_COUNT (you don’t want this behavior), so you may also need to set the OPTIMIZER_INDEX_COST_ADJ to a number, such as 10, to eliminate this problem and drive the use of indexes.”


I did not really look at the screenshots, nor try to verify whether the majority of the information in the chapter was correct. But now, looking at the first picture on page 20, I see that the author decided to create an AWR/Statspack report for a 23.5-hour time period – a very bad idea, as short-term, intense problems are completely lost in the averages of a long-duration report. The author apparently never mentioned that this was a bad idea.
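
Rather than spanning nearly a full day with a single report, it usually works better to pick two snapshots that closely bracket the time period of interest. A minimal sketch of doing this manually (the DBID, instance number, and snapshot IDs in the last query are placeholders; substitute the values returned by the first three queries):

SELECT dbid FROM v$database;
SELECT instance_number FROM v$instance;
SELECT snap_id, begin_interval_time, end_interval_time
  FROM dba_hist_snapshot
 ORDER BY snap_id;

-- Text AWR report for the two snapshots that bracket the problem period
SELECT output
  FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(1234567890, 1, 2345, 2346));

The awrrpt.sql script (or spreport.sql for Statspack) prompts for the same begin and end snapshot IDs, so the same principle applies there.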

Why did I select the quotes?

#1 Hit ratios are mostly meaningless. If the buffer cache hit ratio is very close to 100% (or any other number, for that matter), is that good, bad, or unknown? I have examples showing a 100% buffer cache hit ratio while performance was very, very bad. See these threads for discussions of the buffer cache hit ratio (a sketch of a wait-based alternative follows the links):
http://forums.oracle.com/forums/thread.jspa?threadID=582929&start=0&tstart=0
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/1a946dbe8dcfa71e
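
If the goal is to find out where the database time is actually being spent (the point made in the linked threads), a quick look at the accumulated system-wide wait events is usually far more informative than any ratio. A minimal sketch, assuming Oracle 10g or later for the WAIT_CLASS column (TIME_WAITED and AVERAGE_WAIT are in centiseconds, and the values accumulate from instance startup, while Statspack/AWR reports the delta between two snapshots):

SELECT event, total_waits, time_waited, average_wait
  FROM v$system_event
 WHERE wait_class <> 'Idle'
 ORDER BY time_waited DESC;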
 

#2 If the buffer cache hit ratio gets closer to 100%, is that good? If it falls closer to 80%, is that bad? If it remains the same, is that good or bad? Maybe the value predicts the weather?
 

#3 Small tables repeatedly read by full tablescans are more likely to remain in the Oracle buffer cache by default. Moving the tables to different disks promotes “hot” disks and “cold” disks in the system, rather than spreading the read and write access evenly across all disks, as is the principle of the Oracle SAME configuration. As of Oracle 10g R2, the DB_FILE_MULTIBLOCK_READ_COUNT parameter should not be set – instead, Oracle will automatically set it to a value that attempts to maximize the read size based on the operating system, typically 1MB, or a value of 128 with an 8KB block size. As of Oracle 10g R1 (10.1.x), CPU costing (NOWORKLOAD) is enabled by default. With WORKLOAD CPU costing, the MBRC value from SYS.AUX_STATS$ determines the calculated cost of a tablescan, not DB_FILE_MULTIBLOCK_READ_COUNT. DB_FILE_MULTIBLOCK_READ_COUNT may still have an effect if the DBA has not yet collected WORKLOAD CPU statistics with a command like the following:

EXEC DBMS_STATS.GATHER_SYSTEM_STATS('interval',interval=>60) 
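
To see whether WORKLOAD statistics (and therefore an MBRC value) are actually in place, the recorded system statistics may be queried directly. A minimal sketch (querying SYS.AUX_STATS$ requires access to the SYS-owned table):

-- NULL MBRC, MREADTIM, SREADTIM, and CPUSPEED values indicate that only
-- NOWORKLOAD statistics are in place
SELECT pname, pval1
  FROM sys.aux_stats$
 WHERE sname = 'SYSSTATS_MAIN';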

#4 An absolutely confusing definition of consistent gets, which are different from current mode gets. A consistent get, while coming from memory, may have involved one or more physical disk reads. The “db block gets” statistic is actually the same thing as a current mode get. A consistent get attempts to build a version of a block as of a specific time, or more accurately, as of a specific SCN, through the application of zero, one, two, or more undo blocks. The specific SCN to which the block is rolled back depends on the isolation level specified for the session. A current mode get will contain both committed and uncommitted data, since it is the version of the block as it exists at that instant. See these blog posts (a query for watching the statistics in your own session follows the links):
http://jonathanlewis.wordpress.com/2009/06/09/quiz-night/
http://jonathanlewis.wordpress.com/2009/06/12/consistent-gets-2/
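
One way to see the difference for yourself is to watch the two statistics for your own session before and after running a query of interest. A minimal sketch:

SELECT sn.name, ms.value
  FROM v$mystat ms,
       v$statname sn
 WHERE ms.statistic# = sn.statistic#
   AND sn.name IN ('consistent gets', 'db block gets', 'physical reads');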

#5 Even if the buffer cache hit ratio were useful, that is not the correct formula to calculate the value. Per the Oracle 10g R2 Performance Tuning Guide (and the 11g R2 Performance Tuning Guide), the formula is:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/memory.htm

1 - (('physical reads cache') / ('consistent gets from cache' + 'db block gets from cache'))
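
For what it is worth, the documented formula can be calculated directly from V$SYSSTAT. A minimal sketch (this produces a since-instance-startup value, while a Statspack/AWR report shows the delta between two snapshots):

SELECT ROUND(1 - (MAX(DECODE(name, 'physical reads cache', value)) /
       (MAX(DECODE(name, 'consistent gets from cache', value)) +
        MAX(DECODE(name, 'db block gets from cache', value)))), 4) AS buffer_cache_hit_ratio
  FROM v$sysstat
 WHERE name IN ('physical reads cache', 'consistent gets from cache', 'db block gets from cache');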

#6 Chained rows happen when a row is too large to fit in a single database block. Addressing that condition would require recreating the database with a larger block size, or adding a new tablespace with a larger block size (which means changing the initialization parameters to set aside memory for the non-default block size) and then moving the objects into the new tablespace – this is not a recommended approach (a quick check for chained row activity follows the documentation links):
http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/logical.htm#i15914 
http://download.oracle.com/docs/cd/E11882_01/server.112/e16638/iodesign.htm 
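
A quick check for chained row activity (T1 is a placeholder table name, and the CHAINED_ROWS table is created by the utlchain.sql script in the rdbms/admin directory):

-- Instance-wide count of fetches that had to follow a chained or migrated row piece
SELECT name, value
  FROM v$sysstat
 WHERE name = 'table fetch continued row';

-- List the affected rows in a specific table
ANALYZE TABLE t1 LIST CHAINED ROWS INTO chained_rows;

SELECT head_rowid
  FROM chained_rows
 WHERE table_name = 'T1';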

#7 Init.ora files started falling out of favor with Oracle 9i R1 (9.0.1.x), replaced by spfiles, which allow initialization parameters to be modified persistently with ALTER SYSTEM commands. In addition to controlling the maximum number of blocks read during a full table scan, the DB_FILE_MULTIBLOCK_READ_COUNT parameter also controls the maximum number of blocks read during a fast full index scan. Full tablescans, especially if they complete faster, are not necessarily evil. The need to adjust the OPTIMIZER_INDEX_COST_ADJ parameter decreased significantly with the introduction of CPU costing (introduced with Oracle 9i, enabled by default with 10g). Setting OPTIMIZER_INDEX_COST_ADJ to a very low value, such as 10, decreases the calculated costs of index accesses to 10% of their original value – this means that two different index access methods could then have the same rounded calculated costs, yet have very different execution speeds.
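
To illustrate the rounding problem mentioned above (a hypothetical example, using the rounding described in the paragraph rather than the optimizer's internal arithmetic for any specific release): with OPTIMIZER_INDEX_COST_ADJ set to 10, original index access costs of 35 and 44 both scale and round to the same value.

-- Both columns return 4, so two access paths whose original costs differed by
-- roughly 25 percent now tie on cost
SELECT ROUND(35 * 10/100) AS adjusted_cost_a,
       ROUND(44 * 10/100) AS adjusted_cost_b
  FROM dual;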

After I posted the above quotes from the link into the thread, I remembered that someone else had provided a critical review of this book. Reading that review, I see that I missed a lot of the less obvious problems with the book chapter (I admit I was looking for the low-hanging fruit):
http://www.amazon.co.uk/Oracle-Database-Performance-Techniques-Osborne/dp/0072263059

In case you are left wondering where to look for Statspack/AWR information, I suggest starting with the following links when trying to learn how to read AWR/Statspack reports:
http://jonathanlewis.wordpress.com/2006/11/29/analysing-statspack-pt1/ 
http://www.oracle.com/technology/deploy/performance/pdf/statspack_opm4.pdf 
https://hoopercharles.wordpress.com/2009/11/30/11-2-0-1-statspack-report-contents/

Other Resources:
Metalink Doc ID 228913.1 “Systemwide Tuning using STATSPACK Reports”
Metalink Doc ID 94224.1 “FAQ- Statspack Complete Reference”

Other Topics that Might be of Interest:
Metalink Doc ID 390374.1 “Oracle Performance Diagnostic Guide”
Metalink Doc ID 438452.1 “Performance Tools Quick Reference Guide”