November 30, 2009 (Follow-up Jan 1, 3, 4, 6, 8, 9, 12, 15, 20, 23, 28, Feb 12, 14, 16, Mar 21, May 7, Dec 28 2010)
I had the great pleasure of co-writing with Randolf Geist (http://oracle-randolf.blogspot.com/) two (very long) chapters for the “Expert Oracle Practices: Oracle Database Administration from the Oak Table” book. The chapter titles are “Understanding Performance Optimization Methods” and “Choosing a Performance Optimization Method”. The two chapters explore the vast number of ways to identify Oracle database performance problems, and demonstrate methods of attacking the problems once identified. Some of the techniques demonstrated in the chapters have not previously appeared in book form. The chapters also attempt to address some of the bad advice found when performing web searches on Oracle keywords. This blog, and that of Randolf Geist, provide a good idea of what our two chapters in the book cover – essentially everything from “guessing” based on what one would find through a Google search, to trying to use the buffer cache hit ratio, all the way to reading process stack dumps (and quite a number of other things in between). Several reproducible test cases are included in the chapters, with example output from Oracle 10.2.0.4, 188.8.131.52, 184.108.40.206, and 220.127.116.11.
The book description on the Apress website as well as Amazon seems to be a bit limited compared to the book’s table of contents, although the authors’ descriptions are thorough. Below is the table of contents for the book:
I DBA Fundamentals
1 Battle Against Any Guess – Alex Gorbachev
2 A Partly-cloudy Future – Jeremiah Wilton
3 Developing A Performance Optimization Method – Connie Green, Graham Wood, Uri Shaft
4 The DBA as Designer – Melanie Caffrey
II Network and Operating Systems
5 Running Oracle on Windows – Niall Litchfield
III SQL and PL/SQL
6 Managing SQL Performance – Karen Morton
7 PL/SQL and the CBO – Joze Senegacnik
IV Performance Optimization Methods
8 Understanding Performance Optimization Methods – Charles Hooper & Randolf Geist
9 Choosing a Performance Optimization Method – Randolf Geist & Charles Hooper
IV Operational Concerns
10 Managing the Very Large Database – Tim Gorman
11 Statistics Collection and Creation – Jonathan Lewis
12 Troubleshooting Latch Contention – Riyaj Shamsudeen
13 Measuring for Robust Performance – Robyn Sands
14 Securing Users – Pete Finnigan
15 Securing Data – Pete Finnigan
May 7, 2010:
An independent review of the Expert Oracle Practices book appeared in the May 2010 edition of the NoCOUG Journal, written by Dave Abercrombie (the same reviewer posted a short review on Amazon’s website). The review is thorough, and covers items that were missed in my brief chapter summaries that are below. Most of chapter 1, as well as an interview with the author of chapter 1, also appeared in the NoCOUG Journal.
Late follow-up (January 1, 2010):
I ordered a copy of the “Expert Oracle Practices: Oracle Database Administration from the Oak Table” book from Amazon in the middle of November. Today I bought an “Alpha” copy of the book from the Apress site so that I would have an opportunity to read the other chapters in the book (and so that I could have an electronic copy of the completed book). The “Alpha” copy of the book is a formatting mine-field – the victim of Microsoft Word “auto-correcting” document styles, missing mono-spaced fonts in the code sections, and yellow highlighter all over a couple of the chapters (there is a reason for the yellow highlighter all over our two chapters – the cyan highlighter was too hard on the eyes). The “Alpha” copy, at least for the two chapters that Randolf and I wrote, was captured just before we had an opportunity to address the concerns of the second editor (correction – the technical reviewer had made comments previously, the “Alpha” copy was captured before we had an opportunity to address the concerns of the first editor of the chapters – portions of the chapters were modified heavily based on his feedback) – there were three more rounds of technical edits after those concerns were addressed. Don’t let the rough edges of the “Alpha” copy color your opinion of the finished product.
Late follow-up (Jan 3-28, Feb 12, 14, 16, Mar 21 2010):
Below are descriptions of each chapter of the book.
Chapter 1 (Final Version) – Alex Gorbachev, the founder of Battle Against Any Guess (BAAG Party), identifies examples of attempted problem solving through guess work, throwing the book at the problem, and shooting the problem with silver bullets – as the attempted guess work is suggested in various online forums. He seems to have a strong bias against guessing and a bias for understanding the actual source of the problem. In the chapter he attempts to answer the question of why the database is slow every Monday when it rains, but does not address the issue of why another database runs slowly when someone yells at the SAN (there was a serious YouTube video floating around a couple of months ago that demonstrated verbally abusing the SAN reduced its performance). An interesting issue is raised in the chapter – is not (blindly) implementing a best practice actually a best practice – and is that advice then a circular reference that should be followed, or not followed?
Chapter 2 (Final Version) – Jeremiah Wilton, Amazon.com’s first DBA, provides insight into the meaning of cloud computing, describes the Software as a Service, Platform as a Service, and Infrastructure as a Service variants. Jeremiah also describes the capabilities of several cloud computing providers, and describes in detail how to set up Amazon.com’s EC2.
Chapter 3 (Final Version) – Connie Green, Graham Wood, and Uri Shaft (you will find two of their names in the spdoc.txt file in the rdbms/admin directory) – some of the developers of the performance tuning and monitoring features in Oracle Database, describe the philosophy behind the development of a sound performance optimization method, debunk mysteries and black magic, and clean up after silver bullets miss their target. The chapter lists five steps to developing a time-based optimization method. Describes when to use Enterprise Manager, Automatic Database Diagnostic Monitor (extended description), Active Session History (extended description), Automatic Workload Repository/Statspack reports (extended description), and SQL trace. The final version of the chapter extends beyond the philosophy stage – if the performance is not the same as it was yesterday, something must have changed. What is the scope of the problem: specific SQL statement, instance wide, or something else? The section covering the usage of the ASH (Active Session History) data to pinpoint the starting point of performance problems seems to have been extended. The chapter also describes what the DB time statistic really indicates, and how may it be used for troubleshooting.
Chapter 4 (Final Version) – Melanie Caffrey, Senior Development Manager in the Enterprise Linux group at Oracle, describes the philosophy of being an all-around fantastic, energetic, proactive DBA with the ultimate goal of having your evenings and weekends rarely interrupted by various database problems. This chapter is not a simple bulleted list of what to do and what not to do – the topics include:
- How a DBA should develop a partnership with the developers to reduce the chances of performance problems when new applications are released for production use.
- Difference between Design Upfront/Develop Later/Waterfall approach, the Agile approach, and the Cowboy approach to development.
- Choosing the correct data type with an example of what could happen when defining character columns as VARCHAR2(4000), potential problems when comparing CHAR columns with VARCHAR2 columns, and cardinality calculation problems when numbers are stored in VARCHAR2 columns.
- Points against database independent design.
- Building integrity and other logic into the database, rather than at the application level.
- Preventing the Oracle database from appearing as a black box for dumping data.
- Recommended Oracle documentation to read.
- Learning from your mistakes.
- Leveraging the Oracle features that are part of the database package – getting your money’s worth from this expensive program while improving performance.
Chapter 5 (Final Version) – Niall Litchfield answers the question: Is Windows a serious platform for Oracle databases? And if so, what does a Unix DBA/administrator need to know about running Oracle on the Windows platform. Architecture differences on Windows and Unix-like operating systems (CPU, Memory, Storage, Network) are described. Using SysInternal’s (Microsoft’s) Process Monitor to drill into Oracle activity for a single session, an overview of the Windows registry, and the Windows Scripting Host are also described.
Chapter 6 (Final Version) – Karen Morton states quite simply to begin at the beginning when trying to control SQL performance. The author of the SQL in the application must repeat: “I am responsible for the performance of the code I write or maintain.” The chapter explores some of the differences between EXPLAIN PLAN output and that of DBMS_XPLAN.DISPLAY_CURSOR. The chapter also provides an introduction to extended SQL traces (event 10046), and why extended SQL traces are better than execution plans generated by DBMS_XPLAN.DISPLAY_CURSOR. The chapter includes the investigation of a couple SQL performance problems – the case studies show how to detect that a problem is present and provide ideas for rectifying the problem: lack of a good index, skewed data, and SQL that should be rewritten because it is projected to perform poorly as the data volume increases. (The Final Version of the chapter provides an additional case study that demonstrates the effects of calling a PL/SQL function from SQL, shows that the timings in DBMS_XPLANs may be deceptive when such functions are called, and shows how to determine the actual impact of context switches caused by combining SQL with PL/SQL.)
Chapter 7 (Final Version) – Joze Senegacnik explains the stages of a SQL statement hard parse; defines selectivity, cardinality, and cost; then unravels the mysteries of bad execution plans caused by PL/SQL functions that are present in the WHERE clause – and as a bonus explains how to correct such problems.
Chapter 8 (Final Version) – Charles Hooper and Randolf Geist explain that there is so much instrumentation (to help optimization) in an Oracle RDBMS that it is difficult to determine which instrumentation method is most appropriate, especially if the available instrumentation methods are not well understood. This chapter attempts to cover a large number of the instrumentation and optimization methods including:
- Blindly changing system parameters based solely on information found on the Internet or books.
- Monitoring and reacting to the buffer cache hit ratio, or another similar ratio.
- Monitoring the delta (change) values of system and session statistics (V$SYSSTAT, V$SESSTAT, etc.).
- Monitoring file related activity (V$FILESTAT, V$TEMPSTAT, V$SEGSTAT, various system and session statistics, etc.).
- Monitoring the delta values of system and session wait events (V$SYSTEM_EVENT, V$SESSION_EVENT, V$SESSION_WAIT, etc.).
- Monitoring the CPU utilization (CPU used by this session system and session statistics, statistics from V$OSSTAT, statistics from V$SYS_TIME_MODEL and V$SESS_TIME_MODEL, etc.).
- Capturing Statspack snapshots and creating Statspack or AWR reports.
- Monitoring delta values of the statistics associated with each SQL statement in the library cache (CPU_TIME, ELAPSED_TIME, FETCHES, EXECUTIONS, BUFFER_GETS, DISK_READS, etc. in V$SQL).
- Examining plan statistics and changes in execution plans for SQL statements (V$SQL_PLAN_STATISTICS_ALL, V$SQL_WORKAREA_ACTIVE, V$SQL_SHARED_CURSOR, V$SQL_PLAN, DBMS_XPLAN.DISPLAY_CURSOR).
- Examining system and session level optimizer parameters affecting calculations performed while generating execution plans with the lowest calculated cost (V$SYS_OPTIMIZER_ENV, V$SES_OPTIMIZER_ENV).
- Generating 10053 cost based optimizer traces during hard parses in order to determine the calculated reason for an execution plan.
- Generating 10046 extended SQL traces at levels 4, 8, or 12 during the execution of individual SQL statements as well of normal application processes in order to determine specific causes of performance problems within critical application procedures.
- Examining Oracle database server stack traces.
- Reviewing the ADDM findings (with appropriate Enterprise Edition license and extra cost license options).
- Examining network packets and associated statistics (tcpdump, Wireshark/Ethereal, etc.).
- Examining client-side traces (sqlnet, Process Monitor, Spy++, etc.)
- Investigating enqueue type waits
- (And probably a couple other things along the way.)
Additionally, a couple of interesting test cases are introduced:
- Selecting 0.06% of the rows from a table (containing 100,000,000 rows) completes faster using a full table scan instead of an index range scan.
- Generating a deadlock on Oracle 18.104.22.168 through 22.214.171.124, while the exact same sequence of events fails to trigger a deadlock on Oracle 10.2.0.4.
- Executing a query using bind variables in a single session causes more than 1,024 child cursors to be created, resulting in multiple children with the same SQL_ID and CHILD_NUMBER to be created when the SQL statement is executed with 10,000 different sets of bind variable values.
- Head to head comparison of direct I/O compared to operating system buffered I/O with Oracle 126.96.36.199 during one of the test cases (and some of the potential pitfalls of each approach).
- What happens when a CPU load is applied inside and outside the database instance.
- TKPROF in 188.8.131.52 and above potentially shows the wrong Row Source Execution Plan when the same SQL statement is executed multiple times while tracing is enabled.
(Note that through the final proofing stages of the chapters, Chapter 8 was actually Chapter 9, and Chapter 9 was actually Chapter 10. The opening section for Chapter 8 in the final version of the book states the following: “Chapter 10, which follows next, provides a decision-making process that we believe will help you apply the right method to the right problem.” The same problem is present in Chapter 9 when it refers back to the concepts introduced in the previous chapter – Chapter 9 states to look at Chapter 9. If you read the words “Chapter 10”, please interpret that as “Chapter 9”. If you read the words “Chapter 9”, please interpret that as “Chapter 8”. A similar problem appears in Chapter 6, which refers to Listing 7-1, Listing 7-2, etc.)
(I just caught a typo in the script at the top of page 186 – I am sure that this typo was corrected in one of the 25-30 edits of our chapters, but that change was somehow lost and not noticed in the final series of edits. The script shows this “AND SS.OBJ#=DO.DATA_OBJECT_ID AND SS.DATAOBJ#=DO.DATA_OBJECT_ID” – it was supposed to show this “AND SS.OBJ#=DO.OBJECT_ID AND SS.DATAOBJ#=DO.DATA_OBJECT_ID”. Another typo is found on line 8 in the script on page 218 – change the second TE.DISK_READS to TS.DISK_READS.)
(Mar 21, 2010: The SQL statement on page 188 should have included the following in the WHERE clause to reduce the chances of repeated rows appearing in the output, see the March 22, 2010 blog article for an explanation: AND S.SQL_CHILD_NUMBER=SQL.CHILD_NUMBER – ideally, the query would return the S.SQL_CHILD_NUMBER column also.)
Chapter 9 (Final Version) – Randolf Geist and Charles Hooper use real-world examples to demonstrate when to use the various monitoring methods in pursuit of improved performance, switching between monitoring methods as needed. The chapter wraps up with various techniques to improve performance based on the data collected during database monitoring.
Chapter 10 (Final Version) – Tim Gorman describes what needs to be considered when managing very large databases. His chapter is about those massive databases that don’t quite fit onto the 12 drive RAID 10 floppy drive array. The databases of this variety are huge, growing at a rate of 25TB to 50TB per day. He describes the raw throughput required to hit this level of growth – it is not a write rate of 600MB/s that is required, but instead one of roughly 10GB/s (OK, how many EXADATA V2 units were required to maintain this write level?). The chapter opens with a story about Tim accidentally dropping a 15TB index on a 45TB table, and learning from one’s mistakes. And we later learn that “the fastest mass update or delete is actually an insert.” He describes some of the advantages of using the extra cost partitioning option, and ASM, and also describes some of the challenges of backing up massive databases – at a backup rate of 1,000GB per hour, that 5PB database will be backed up in no time (no time soon – 200 days by Tim’s calculations). After reading the chapter, I just have one question – is all of that data being generated just to prove that Oracle Databases are able to handle it, or is someone actually reading/using all of the data?
Chapter 11 (Final Version) – Jonathan Lewis attacks the subject of statistics. This is not a simple chapter that shows how to collect table, index, and system statistics. It is instead a chapter that exposes some of the problems that cause the cost-based optimizer to generate seemingly silly/stupid execution plans. Is it caused by a bug, or is there a reason for the execution plan – and if there is a reason, is there a way to identify the source and correct the problem without adding multiple hints to every SQL statement? In the chapter Jonathan offers a couple of clever, seemingly simple questions, for which most people will initially answer incorrectly – this is done as a demonstration of what potentially trips up the cost-based optimizer. He also demonstrates how the optimizer could become flustered by the way the index clustering factor is calculated, how both dynamic sampling and 11g’s extended column statistics may be ineffective in certain cases, and how the nightly automatic statistics gathering job could cause problems for partitioned tables. Also discussed is how a histogram’s impact on cardinality estimates changes when upgrading from Oracle 10.2.0.3 to 10.2.0.4 or 11g. After reading the chapter you will be aware that to the optimizer, all possible outcomes of an inequality totals 11% of all rows in a row source, where is that proof that 1=2 when you need it?
Chapter 12 (Final Version) – Riyaj Shamsudeen drives into the details of latch contention – this is not a chapter that should be read unless you are fully awake – I think that I might need to re-read the chapter a couple of more times to fully understand what was presented. The chapter begins with a brief description of latches and how latches differ from enqueues. Identifying and finding the cause for various types of latch contention is demonstrated through numerous SQL statements that hit Oracle’s various fixed tables and performance views, and he even shows a couple of examples of using TKPROF’s ORADEBUG to dump memory contents for analysis. The 0 key on the keyword might have been stuck when Riyaj described the SESSION_CACHED_CURSORS parameter.
Chapter 13 (Final Version) – Robyn Sands describes why it is important to measure twice, cut once, and then follow up with a re-measure every so often. Whether it is a needle in the haystack, a silver rock mixed in with a bucket of sand, or an occasional database reporting problem hiding among normal activity, it is hard to know if you found a needle, a silver rock, or a database performance problem unless you periodically capture the measurements of the items in front of you. After a brief introduction to what the expectations are of performance analysis, Robyn introduced topics that induced flashbacks of university-level linear algebra, statistics, and even political science while ripping through captured runtime performance statistics in search of the silver rocks (OK, I admit that they were red rocks) that indicate unexpected performance characteristics.
Chapter 14 (Final Version) – Pete Finnigan provides general information about securing database user accounts, stressing the need to eliminate those database user accounts that are installed by default but not used, with the goal of reducing the attack surface. Pete also introduces several of the scripts that he hosts on his website: use.sql (assess built-in and customer installed user account permissions), find_all_privs.sql (determine the assigned privileges of a user account), cracker-v2.0 (you better hope that access to SYS.USER$ is restricted), and profiles.sql (retrieve the security settings for the user profiles configured in the database). The author also provides several SQL statements throughout the chapter that may be used for analyzing various security settings. The detailed description of Oracle’s one way password hashing algorithm is interesting.
Chapter 15 (Final Version) – Pete Finnigan advises “learn to think like a hacker” in this chapter that describes the difficulties of securing data in the database. Pete introduces a couple more scripts from his website while peeling away the layers of security permissions that permit access to credit card (or other similar data) stored within the database. He introduces several sources for additional security permissions for seeing the “secured” data – you better hope that bind variables are in use, and that people do not know how to view the bind variables submitted by other sessions. The author mentions that database security extends to the whole server – don’t leave files containing unencrypted versions of the data stored in files in the filesystem of the server (and if you read chapter 14, you probably have already made up your mind to take down the post-it notes with the DBA passwords). Auditing is briefly mentioned.
January 4, 2010:
The Final Version of the Electronic Book may be downloaded from Apress Now – if you bought the “Alpha” copy of the book, you should be able to download the final version for free. The PDF is 593 pages including the front cover and index.
January 20, 2010:
I received notification that the printed version of the book shipped from Amazon on January 17, and should arrive by postal mail today.
January 23, 2010:
It appears that the first 100 pages of the book may be viewed on the Google Books site.
February 12, 2010:
Having now read through all of the chapters in the book, at the end of January the book authors discussed the description of the book, as currently listed on Amazon and the Apress site, and determined that it is inaccurate. First, the page count in not 400, but actually about 532 pages when excluding the index at the back of the book. Additionally, the bullet points under the heading “What you’ll learn” do not best describe what is in the printed book – those bullets are currently listed as follows:
- Eliminate guesswork by attacking database problems in a systematic manner.
- Manage transactional performance in SQL and PL/SQL.
- Secure your database from outside attack.
- Effectively deal with operating system platforms such as Windows and Linux.
- Diagnose database crashes.
- Choose the right block size to optimize I/O throughput.
I received notification today that Apress will be updating the description of the book contents based on feedback from the book authors. Expect to see the bullet points changed to something similar to the following, hopefully by Monday:
- Adopt a rational approach to database management
- Add value to your organization as a database professional
- Utilize effective techniques for performance management and optimization
- Effectively exploit different platform technologies
- Effectively secure your organization’s data
- Gain a deeper understanding of database internal features and structures
May 7, 2010
Review of the book appeared in the May 2010 edition of the NoCOUG Journal along with most of chapter 1 of the book.
December 28, 2010
It appears that the scripts for chapter 8 of the book are well hidden. The scripts may be downloaded from the Apress website. Inside the Expert_Oracle_Practices_Oracle_Database_Administration_from_the_Oak_Table-4524.zip download file is a folder titled ch08n09_UnderstandingOptMethods, and inside that folder is another folder titled ch09Scripts – this folder contains all of the scripts for chapters 8 and 9 (those scripts are listed at the end of chapter 8 in the book).
November 17, 2011
It appears that the script library file download for this book on the Apress website is corrupt. You can download the script library from here: Expert_Oracle_Practices_Scripts.zip (save with a .zip extension, and not a .doc extension).