March 3, 2012
Oracle Database Performance Tuning Test Cases without Many “Why”, “When”, and “How Much” Filler Details
I ordered the “Oracle Database 11gR2 Performance Cookbook” book shortly after it became available for purchase. I was very curious to see how the book compared with the similarly titled “Oracle Database 11g Performance Tuning Recipes” book, as well as some of the other Oracle Database performance books that are on the market. Packt is a fairly new book publisher, and this book marks the first Packt book in my collection.
The author of this book does not appear to be widely known in the international Oracle Database community, although it does appear that the author is an active reviewer of SQL Server and programming books on an Italian programming focused website. The author’s LinkedIn page indicates that he obtained OCA and OCP certification in 2002 and 2003, respectively, has a variety of programming experience, and currently is an IT Manager.
One important characteristic of this book that is missing from some of the other Oracle Database performance focused books on the market is the extensive use of test case scripts throughout most of the book, which allow the reader to reproduce the performance changes mentioned in the book in the reader’s own Oracle Database environments. The test case scripts, related screen captures, and author’s explanations of the results are both a blessing and a curse for this book. It appears that the author used a single Amazon Elastic Compute Cloud hosted database instance, with only one set of instance parameters and system statistics, to produce the various test case results and the author’s descriptions of the expected outcome when the inputs in the test case script are provided. Had the author re-executed the test case scripts in another Oracle Database environment, the author probably would have written the explanations that follow the test case scripts quite differently. It is not uncommon for 80% of some of the book pages to be consumed by one or two SQL*Plus screen captures; combined with the slightly larger font sizes, double-spacing between paragraphs, and apparent one and a half spacing between lines in code sections, the technical content in the book is a bit more limited than the page count might suggest.
So, how well did the book’s contents meet the level of expectations provided by the book’s front cover and the publisher’s description of the book? One of the bullet pointed descriptions of the book reads, “Avoid common myths and pitfalls that slow down the database.” Unfortunately, the book reintroduces several myths and inaccurate conclusions about Oracle Database that have diminished in frequency during the last 10+ years. Some of the information in the book is of good quality. However, the significant number of inaccurate, vague, misleading, and/or over-generalized facts in this book suggests that the author of this book may not have received sufficient guidance from Packt and the four technical reviewers of the book. The book publisher’s site currently lists no errata for the book, even though I personally submitted 21 errata items to the publisher’s errata reporting site.
The author’s native language is obviously not English, so it is probably to be expected that some of the sentences in the book are incomprehensible. Yet, there are also sentences in the book that use completely different phrasing, close to that of a person who double-majored in English and computer science with a focus on Oracle Database. The consistent usage of the term “fields” in some sections of the book, with the consistent usage of the term “columns” in other sections of the book is but one example of the style shift that is present in the book. Some of the sentences found in the book are oddly familiar, and although I was not able to identify the original sources of all of the oddly familiar sentences, I did manage to locate a few. What constitutes plagiarism in an Oracle Database book, and how much change is required to the original material to avoid the plagiarism label? Would slightly reformatting a section of text to replace dashes with colons be sufficient to avoid the label? Would changing the order of some sentences and eliminating other sentences be sufficient to avoid the label? Would performing simple word substitutions here and there, or shortening sentences be sufficient to avoid the label? I am not suggesting that there is rampant plagiarism in the book, but one does need to question when that plateau is reached in a book about Oracle Database.
While in some respects this book is more useful to the reader than the “Oracle Database 11g Performance Tuning Recipes” book due to the inclusion of test cases, both books seem to omit the reasoning behind why and when someone might consider performing the 80 or so tasks/recipes mentioned in the books. Vague, inaccurate, over-generalized, and out of date descriptions of Oracle Database behavior are limiting factors of both books. This review is quite long, and likely will not appear in full on Amazon – see my blog for the full review.
Data Dictionary Views:
- DBA_VIEWS (page 20)
- V$FIXED_TABLE (page 21)
- V$LIBRARYCACHE (page 52)
- V$STATNAME, V$MYSTAT (page 53)
- SYS.SEQ$ (page 65)
- DBA_MVIEWS, USER_MVIEWS, ALL_MVIEWS (page 69)
- INDEX_STATS (pages 127, 128)
- V$SYSSTAT (page 160)
- V$SESSION (page 205)
Parameters:
- CURSOR_SHARING (pages 9, 38)
- TIMED_STATISTICS (pages 20, 201)
- LOG_CHECKPOINTS_TO_ALERT, BACKGROUND_DUMP_DEST (page 28)
- STATISTICS_LEVEL (pages 29, 32)
- CONTROL_MANAGEMENT_PACK_ACCESS (page 32)
- QUERY_REWRITE_ENABLED, QUERY_REWRITE_INTEGRITY (page 70)
- DB_16K_CACHE_SIZE (page 84)
- MAX_DUMP_FILE_SIZE, TRACEFILE_IDENTIFIER (page 201)
- SQL_TRACE (page 202)
Hints:
- APPEND (page 72)
- INDEX (page 121)
Comments, Corrections, and Problems:
- The book states, “The first rule in writing applications which connect to an Oracle Database is to always use bind variables, which means not to include parameters in SQL statements as literals.” The statement should be clarified that this is a general recommendation. There are times when literals should be used rather than bind variables. For instance, if a column contains both very popular and very unpopular values, it might be wise to prevent the sharing of execution plans when a very popular or very unpopular value is used in the WHERE clause. A correction/clarification is provided on page 51 (page 8).
- Steps for creating a database with the Oracle Database Configuration Assistant seem to be out of place in a performance tuning book (pages 17-19)
- Uses the term “fields” where the term “columns” should be used (page 21).
- The book demonstrates the use of ANALYZE TABLE … COMPUTE STATISTICS, and DBMS_UTILITY.ANALYZE_SCHEMA to collect object statistics. The book states that ANALYZE is retained for backward compatibility, but the book provides no warning that using ANALYZE to collect statistics could be problematic since the release of Oracle Database 8.1 (reference page 21).
- The book uses the word “elaborate” rather than “create” or “generate” (pages 24, 26, 27, 31, 37)
- The book demonstrates the use of AWR without first mentioning the licensing requirements of that feature (pages 30-31).
- Word substitution error: “… and we experiment a lack of performance in another period, we can elaborate two reports…” (page 31)
- The book demonstrates the use of ADDM without first mentioning the licensing requirements of that feature. The book also states, “ADDM is enabled by default in Oracle Database 11g; it depends on two configuration parameters…” Unlike with Oracle Database 10.1 and 10.2, ADDM is not enabled by default in the Standard Edition of Oracle Database 11.1 or 11.2, nor can it be legally enabled on the Standard Edition. While ADDM is enabled by default in the Enterprise Edition 11.1 and 11.2, it cannot be legally used without a Diagnostic Pack license (pages 32-35).
- The book suggests the system-wide use of the deprecated SIMILAR value for the CURSOR_SHARING parameter as one of two solutions to address a hard parsing problem in a test case script (page 38).
- The book states, “Now the Soft Parse is 97.84 percent.” The output shown in the book actually indicates a Soft Parse percent of 99.20. The instance efficiency numbers in the output are identical to those found on page 40, so this might be an indication of a copy-paste error (page 39).
- The book states, “If the PreparedStatement is not closed, it can be executed multiple times – changing the value assigned to bind variables – and only a ‘light’ soft-parse will occur, with no syntax and semantic check.” If the SQL statement is held open – there will NOT be a “light” soft-parse (session cached cursors are not discussed in this section of the book, which would allow a “light” soft-parse if the cursor is NOT held open) (page 52).
- The elapsed time comparison between the directly executed SELECT statement, and the REFCURSOR that is returned by the SH.SALES_BY_PRODUCT procedure is not valid for several reasons: 1) The script is executed by the internal user rather than a normal user, which can lead to unexpected performance differences; 2) The SELECT statement method displays its rows to the screen, so it is subject to delays caused by formatting the output for the SQL*Plus window (SET AUTOTRACE TRACEONLY STATISTICS may be used to reduce the impact of the formatting delays, but that change had little effect); 3) The REFCURSOR method, because it involves PL/SQL, will be subject to a context switch while the normal SELECT will not be subject to the context switch – the associated delay is operating system dependent and the timing should suggest that something is wrong with the test result; 4) While the normal SELECT statement test actually fetches the rows, the REFCURSOR method does not, as can be seen within an enabled 10046 trace (the normal SELECT will show a FETCH line that is preceded by WAIT lines, while the REFCURSOR method will not show a FETCH line in the trace file) (pages 54-55).
- The output of the Java version of the SQL*Plus test script found on pages 54-55 conflicts with the author’s intended result. Directly executing the SQL statement required 1.438 seconds, while using the REFCURSOR in the Java code required 1.722 seconds. The performance difference may be more significant than shown, because the direct execution of the SQL statement test was performed first, and the timing results include the time to flush the shared pool and the buffer cache (the first call will almost certainly take longer than the second call) (pages 56-58).
- The book uses a test case script to demonstrate the negative effects of using a “COUNTER” table rather than using a sequence to provide the same counter value. The test case script uses a trigger on the table to populate the counter column in the table, and the test case script does show that performance improves with the use of the Oracle sequence. The test case script, however, should have also included a test that completely eliminates the trigger on the table, populating the TRAVELID column by including TRAVEL_SEQ.NEXTVAL directly in the SQL statement that populates the table. My timing results show that the counter trigger-table method completes in 0.45 seconds, the trigger-sequence method completes in 0.14 seconds, and the select-sequence method completes in 0.03 seconds (reference pages 60-62).
- Accidental word substitution, “… and if the high watermark is reached, it caches other X numbers in the same manner.” “other” should be “another” (page 65).
- The author incorrectly read the AUTOTRACE generated execution plan. The book states “We can see that in the execution plan, there is full table access to the SALES table examining 918K rows and reading 8075 KB.” An AUTOTRACE generated execution plan shows an estimated execution plan that may differ from the actual execution plan in some situations, such as cases where bind variables are involved. Additionally, an AUTOTRACE generated execution plan shows the predicted number of rows that will be returned (not examined), and the predicted volume of data that will be returned (not read) based on the existing statistics for the objects (page 67).
- The book states, “However, from the execution plan, the number of rows processed is 72, and each row is 648 bytes long.” Once again it is important to stress that the execution plan is a predicted execution plan generated by AUTOTRACE. The estimated 72 rows returned by the operation in the execution plan does agree with the “72 rows processed” displayed in the actual statistics for the execution, but that will not always be the case for an AUTOTRACE generated execution plan (it happens to be the case because statistics were collected for the materialized view with a 100% sample rate). The statement that each row is 648 bytes long appears to be the result of misreading the previous execution plan, which estimated that 72 rows consuming 648 bytes total would be returned from operation 0 in the execution plan. The AUTOTRACE generated execution plan for the materialized view predicts that 72 rows consuming 1872 bytes will be returned from operation 0 in the execution plan, which shows a predicted row length of 1872/72 = 26 bytes per row (pages 67-68).
- The book states, “In the latter case [after flushing the buffer cache], we have 4047 consistent gets and 240 physical reads…” There are a couple of issues with this test case, found in the source code library file 2602_02_Materialized Views.sql. First, the script in the source code library uses “ANALYZE TABLE SH.MV_SALES_BY_PRODUCT COMPUTE STATISTICS” to collect the statistics on the materialized view, while the book shows the use of “EXEC DBMS_STATS.GATHER_TABLE_STATS” to collect the statistics – the collected statistics from the ANALYZE table command could very easily be different from the collected statistics from the DBMS_STATS.GATHER_TABLE_STATS command. The screen capture shown after flushing the buffer cache and re-executing the select from the materialized view does show 4,047 consistent gets and 240 physical block reads, as stated in the book, but it also shows 20,544 recursive calls where 0 recursive calls were shown prior to flushing the buffer cache – this recursive call count figure indicates that something else happened beyond the author flushing the buffer cache. My test results with just flushing the buffer cache show 8 consistent gets, 6 physical reads, and 0 recursive calls. The author also apparently flushed the shared pool, which triggered the recursive calls and the majority of the consistent gets and physical block reads (15,296, 2,978, and 177 respectively). The author probably should mention that the test case and advice will not work in a Standard Edition database, and should also state that the decision whether or not the materialized view is used is a cost-based optimizer decision (page 68).
- The book lists “QUERY REWRITE” as a required privilege to create materialized views. The Oracle Database 11.2 (and 10.1) documentation states that the QUERY REWRITE privilege is deprecated, and thus not needed (reference page 69).
- The book states, “The same parameters [QUERY_REWRITE_ENABLED, and QUERY_REWRITE_INTEGRITY] have to be enabled to use another functionality, function-based indexes.” QUERY_REWRITE_ENABLED must be set to TRUE in Oracle Database 9.2 to use function-based indexes, but that requirement disappeared in Oracle Database 10.1 (page 70).
- The book states, “We encounter row chaining when the size of the row data is larger than the size of the database block used to store it.” While this statement is correct, the book omits a secondary cause of chained rows – Oracle database supports a maximum of 255 columns in a row piece, so tables with more than 255 columns will necessarily have chained rows (page 84).
- The book casually demonstrates setting up a 16KB block size tablespace in a database that has a default 8KB block size. The book provides a list of several advantages for including smaller or larger than default block sizes in a single database including, “Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.” This justification is incorrect for several reasons, including the fact that the DB_FILE_MULTIBLOCK_READ_COUNT parameter is scaled up for tablespaces that use a smaller than database default block size, and scaled down for tablespaces that use a larger than database default block size. All of the justifications found on page 88 appear to be copied verbatim from a commercial website page. The book does not discuss the bugs and unexpected optimizer cost changes that might result from using multiple block sizes in a single database (reference reference2 pages 84-88).
- Step 5 contains two typos: using angle brackets (less than and greater than signs) rather than single quotes, and a spurious 3 after the semicolon (page 89).
- Steps 7 and 9 contain typos: using angle brackets (less than and greater than signs) rather than single quotes (page 90).
- Steps 4 and 5 contain typos: using angle brackets (less than and greater than signs) rather than single quotes (page 97).
- Step 14 contains a corrupted SQL statement: “CREATE.5* FROM HR.BIG_ROWS WHERE 1=0;”. Steps 15, 16, and 19 contain typos: using angle brackets (less than and greater than signs) rather than single quotes. The author should have mentioned at least one of the possible problems with this approach, which might include triggers on the table, foreign keys that point to the table, and the potential statistics problems caused by the use of the ANALYZE TABLE command (page 92).
- The book states about the DBMS_SPACE.CREATE_TABLE_COST example, “In this procedure we have set the tablespace to use the average row size and the row count…” The purpose of this function is to estimate space usage, not to make changes to a tablespace (page 95).
- Step 1 contains an extraneous “.5” in the command.
- Pages 96-112 are present in the book, but omitted from this review.
- Steps 11 and 13 use angle brackets (less than and greater than signs) rather than single quotes (pages 116-117)
- The book states, “We can also create a function-based descending index.” This is a strange statement – all descending indexes in Oracle Database are function-based indexes (page 119).
- The book states, “… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query. We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.” In this case, the author is incorrectly attempting to generalize a special case into a general rule. Firstly, there is no myth to dispel – Oracle’s query optimizer has had the ability to use INDEX SKIP SCAN operations when the leading column of an index is not specified in the WHERE clause, since the release of Oracle Database 9.0.1 a decade ago – but that access path is usually only advisable when there are few distinct values in the leading column of the index. The author’s test case is a special case because all of the columns selected from the table are present in the index structure (page 119).
- The book states, “If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this.” The author then shows a SQL statement with the first column in the ORDER BY clause sorted in descending order and the second column in the ORDER BY clause sorted in ascending order. At this point in the book, the author has not yet stated that Oracle Database is able to read index entries in an ascending or descending order through a normal (ascending sorted) b*tree index, so this sentence in the book is confusing – almost to say that Oracle Database is not able to sort one column in ascending sequence and a second column in descending sequence – that concept is obviously false. It would have been more accurate for the book to state that, “Oracle Database is unable to _avoid_ a sort operation when accessing the rows through a concatenated index if both of the columns in the index are sorted in ascending sequence, the ORDER BY clause of the SQL statement specifies that one and only one column contained in the index should be ordered in descending sequence, and the second column in the concatenated index is included in the WHERE clause.” (page 120)
- A self-contradicting sentence, “In the first case, we have a full table scan, because we cannot retrieve all of the data from the index, so we have to do a TABLE ACCESS BY ROWID operation for each row, which satisfies the predicate.” Full table scan probably does not belong in that sentence (page 121).
- The book states, “In the next screenshot, we can see that Oracle knows (from the table statistics) that only 43 rows satisfy the where condition.” It is important to stress that the autotrace generated execution plan only shows the estimated number of rows that will be returned by an operation – the author’s query, in fact, retrieves a single row. The index that the author specified in the index hint was created on the columns CUST_LAST_NAME and CUST_YEAR_OF_BIRTH (in descending order), yet the author’s query only included the CUST_FIRST_NAME column in the WHERE clause – it is ridiculous to force the optimizer to use this index with a hint (page 121).
- The index’s clustering factor was not mentioned in the discussion of what determines the point at which it is more efficient to access a table through a full table scan rather than an index access path – only the average row length and the percentage of the rows that need to be retrieved were described as considerations. It could very well be the case that, with a very poor clustering factor, it is more efficient to retrieve less than 1% of the table’s rows through a full table scan, rather than an index lookup (page 122).
- The book should define “intra-block fragmentation” which is the benefit that the book lists as resulting from rebuilding indexes (page 123).
- The two session example of one session rebuilding an index while a second session executes a SELECT and INSERT seems to be pointless. The second session does not use the index that the first session attempts to rebuild, instead a full table scan is performed on the BIG_CUSTOMERS table, followed by an index unique scan of the CUSTOMERS_PK index. An index named IX1_BIG_CUSTOMERS was created in the script, yet the script attempts to rebuild a non-existent index named IX1_MYCUSTOMERS. The test case only shows an example of efficiency gains due to blocks being buffered in the buffer cache. The book should have mentioned that an online rebuild and parallel rebuild are only possible in the Enterprise Edition of Oracle Database (pages 123-125).
- Step 10 uses angle brackets (less than and greater than signs) rather than single quotes (page 126).
- The book states, “We have used the PARALLEL option too, to speed up the rebuild process.” While specifying PARALLEL during an index rebuild may speed up the rebuild, it is important to note that this results in an index with a parallel degree that should be manually reset to the original value once the rebuild completes (page 127).
- The book states, “However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.” The book should have defined what is meant by “skewed index structure” – does the book mean, for instance, that one portion of the index could have a BLEVEL of 2 while another portion of the index could have a BLEVEL of 3 – if that is the case, the book’s statement is incorrect. If the book’s definition of “skewed index structure” is that some leaf blocks of the index will be more densely packed than other leaf blocks in the same index structure, then that should be considered normal behavior for Oracle indexes – an occasional coalesce might be used to combine index entries in logically adjacent leaf blocks, but scheduling index rebuilds is neither required, nor recommended. Depending on the order of the inserted values in relation to the order of the entries in the index leaf blocks, an index leaf block split operation could evenly divide the existing index entries between two leaf blocks (a 50-50 split, resulting in both index blocks being 50% utilized, if the inserted value is not the highest value that would be inserted into the leaf block), or all of the existing entries will remain in the existing leaf block and the new entry will be placed by itself into a new leaf block (a 90-10 split). A deleted index entry will remain in the block at least until that transaction is committed, but any post-transaction insert into the block will clear out all deleted index entries in the block. Deleting all table rows with index entries at the low end of the index (the values were populated by a sequence, for example, and are deleted in the same sequential order) could leave many blocks in the index structure with nothing but deleted index entries, but that situation should only result in a performance problem if SQL statements attempt to determine the minimum value for the indexed column, or to some extent, fast full index scans and full index scans (reference reference2 page 127).
- The book states, “If the value for DEL_LF_ROWS/LF_ROWS is greater than 2, or LF_ROWS is lower than LF_BLKS, or HEIGHT is 4 then the index should be rebuilt.” Some of the advice found on the Internet suggests that if DEL_LF_ROWS is 20% of LF_ROWS, then the index should be rebuilt – did the author of this book intend to write “If the value for DEL_LF_ROWS/LF_ROWS is greater than 0.2”? Why should the result of DEL_LF_ROWS/LF_ROWS be a consideration of whether or not an index should be rebuilt – is it supposed to measure the amount of wasted/unused space in the index leaf blocks? The next INSERT/UPDATE DML operation in a given leaf block will clear out the index rows that are flagged as deleted, but then does that imply that the space is not wasted (or is the space wasted)? What if there are many index blocks that are roughly 50% utilized due to a large number of 50-50 leaf block splits, is that space not wasted (or is the space wasted)? Since the formula DEL_LF_ROWS/LF_ROWS really does not describe the percent of used space in the index, it is probably best to just ignore the result of that formula. DEL_LF_ROWS/LF_ROWS can never be greater than 1 because the statistic found in the LF_ROWS column includes the DEL_LF_ROWS statistic. The second criterion suggests comparing LF_ROWS to LF_BLKS, such that if on average there is less than one index entry per leaf block, the index should be rebuilt – there can never be less than one index entry per leaf block, because the leaf block will be detached from the index structure when all rows are removed from that leaf block. The final criterion suggests rebuilding the index when the height is exactly 4 – does that mean that an index with a height of 5, 6, 7, etc. does not need to be rebuilt? What if after rebuilding the index it still has a height of 4 – will it help to rebuild a second time? (page 127)
- The book states, “When we rebuild an index, we can add the COMPUTE STATISTICS option to that statement.” Since the release of Oracle Database 10.1, statistics are automatically collected when indexes are created and/or rebuilt, so the COMPUTE STATISTICS clause is unnecessary (page 127).
- Steps 6 and 9 use angle brackets (less than and greater than signs) rather than single quotes (pages 128-129).
- Steps 8 and 15 use angle brackets (less than and greater than signs) rather than single quotes (pages 131-132).
- The book should mention that bitmap indexes are not available in the Standard Edition of Oracle Database (page 136).
- Step 3 uses angle brackets (less than and greater than signs) rather than single quotes (page 137).
- The author created a composite bitmap index with three columns to demonstrate the use of bitmap indexes. Composite bitmap indexes are rare – one of the strengths in using bitmap indexes is the ability to create multiple single column bitmap indexes, and as needed the optimizer will select to bitmap join two or more bitmap indexes in an attempt to significantly reduce the number of rows visited in the table (page 138).
- The book states, “This time the execution plan uses the newly created bitmap index, … using the INDEX RANGE SCAN or INDEX FAST FULL SCAN operation, depending on whether we are filtering on the first key column of the index – CUST_GENDER – or not. This result is obtained thanks to the structure of bitmap indexes.” With the index definition found in the book, the operations that should be present in the execution plan are BITMAP INDEX RANGE SCAN and BITMAP INDEX FAST FULL SCAN, while you might expect to find INDEX RANGE SCAN or INDEX FAST FULL SCAN operations associated with normal b*tree indexes. However, it is a cost-based decision for the optimizer to use or not use an index, so there is no guarantee that the index will be used as indicated in the book, whether or not the leading column in the index is specified. Additionally, it is not the structure of bitmap indexes that permits INDEX RANGE SCAN or INDEX FAST FULL SCAN operation, depending on whether we are filtering on the first key column of the index – creating a normal b*tree index in the script rather than a composite bitmap index could (will) actually allow the optimizer to take advantage of INDEX RANGE SCAN or INDEX FAST FULL SCAN operations (page 139).
- The book states, “Bitmap indexes offer very fast performance when we have a low cardinality field indexed on a table containing many rows.” This statement could have several different interpretations, but I believe that the author’s intended meaning is “Bitmap indexes offer significantly faster performance than b*tree indexes when columns with few distinct values are indexed in tables containing a significant number of rows.” This fixed statement still requires additional clarification – if the bitmap index does not help to further reduce the number of table rows that are accessed through the index, the end result may be performance that is roughly the same as that of an equivalent b*tree index. One way to accomplish the task of further reducing the number of table rows accessed is through the utilization of multiple bitmap indexes with bitmap combine operations to significantly reduce the number of rowids that are used to fetch table rows (page 139).
- The book states, “When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.” This statement requires a bit of clarification. I do not believe that the author is stating that updating an entry in a bitmap index will lock all of the bitmap indexes in the database (a segment could be a table, table partition, index, etc.). Instead, I think that the author is intending to state that updating an entry in a bitmap index will lock all of the index entries in that index, effectively preventing any other session from inserting, updating (the column covered by the index), or deleting rows in the table. For very small bitmap indexes, this statement could very well be true. However, for larger bitmap indexes, built for tables with many rows, the number of index rows that will be locked during an update is determined by the number of rows covered by the index block(s) that update changed, possibly 20,000 to 50,000 rows per index block. (reference slide 46, reference2 page 2, reference3 comments section; page 139).
- The book states, “This [bitmap join index] is a bitmap index which represents the join between two tables, and can be used instead of a materialized view in certain conditions.” The book did not offer any suggestions or describe any conditions that permit a bitmap join index to take the place of a materialized view. The statement in the book needs additional clarification (reference reference2 page 140).
- The book states about index organized tables, “If the row size exceeds the size indicated by this parameter [PCTTHRESHOLD], the fields not indicated by the INCLUDING option are stored in the OVERFLOW – if indicated, otherwise the row is not accepted.” This is a confusing sentence – it is not clear what the author is attempting to state. The Oracle documentation states, “In addition to specifying PCTTHRESHOLD, you can use the INCLUDING clause to control which nonkey columns are stored with the key columns. The database accommodates all nonkey columns up to and including the column specified in the INCLUDING clause in the index leaf block, provided it does not exceed the specified threshold. All nonkey columns beyond the column specified in the INCLUDING clause are stored in the overflow segment. If the INCLUDING and PCTTHRESHOLD clauses conflict, PCTTHRESHOLD takes precedence.” (reference page 146).
- The book demonstrates partitioning without mentioning that partitioning is an extra cost option that may only be purchased for the Enterprise Edition (page 146).
- The book states, “To obtain the best uniform data distribution, it’s better to choose a number of partitions which is a power of 2, having a unique or near partition key.” The Oracle Database documentation states, “For optimal data distribution, the following requirements should be satisfied: Choose a column or combination of columns that is unique or almost unique. Create multiple partitions and subpartitions for each partition that is a power of two.” It appears that the author of the book incorrectly stated the first requirement that is mentioned in the documentation (reference page 149).
- The script is run by the SYS user rather than a normal user, which can lead to unexpected performance differences (page 157).
- The “ALTER TABLE SH.MY_SALES_2 ENABLE ROW MOVEMENT” and “SHRINK SPACE” commands are only applicable if the MY_SALES_2 table is stored in an ASSM tablespace – the book did not mention that limitation (page 165).
- The book states, “If we do an FTS [full table scan], database buffers are used to read all the table data, and this situation may lead to flushing the buffer cache data to make room for the FTS data. To avoid this situation and to limit the consequences on the database buffer cache, the database blocks from FTS operations are put on the top of the LRU (Least Recently Used) list.” This statement requires significant adjustment before it is an accurate statement. Is the author describing the behavior of Oracle Database 8.1 or Oracle Database 11.2 as indicated on the front cover of the book? What is meant by the “top of the LRU list” – is that the MRU (most recently used) end? If the author meant that the blocks were placed on the least recently used end of the LRU list, then the author agrees with the Oracle Database 10.2 documentation, but that documentation is incorrect (the behavior changed around the time 9.0.1 was released). The Oracle Database 11.2 documentation states that blocks read by full table scan are placed at the mid-point of the LRU list (if the CACHE keyword is specified when the table is created or altered, then the table blocks will be placed on the MRU end of the list). Since the book is specifically about Oracle Database 11.2, it is worth pointing out that since the release of Oracle Database 11.1, the Oracle RDBMS often makes use of serial direct path read when performing full table scans, and that type of access completely avoids reading the table blocks into the buffer cache (the blocks are read into the PGA). Oracle event 10949 may be used to disable serial direct path read. What about parallel full table scans of larger tables? Those too will avoid flooding the buffer cache with blocks that may only be accessed once. Smaller tables will certainly have their blocks read into the buffer cache, but the touch count associated with each of the table’s blocks will limit the problems that those blocks will cause in the buffer cache (reference page 170).
- The book states, “This is because when we have a larger database block, we can read many rows in a block and even subsequent database blocks – in one operation – by setting the parameter DB_FILE_MULTIBLOCK_READ_COUNT (at the instance or session level).” While the book does indicate that the maximum number of bytes read is operating system dependent, the book provides no hint regarding the Oracle limit for the parameter, or anything else that might cause fewer blocks to be read in a single read request (extent size, pre-existing blocks already in the buffer cache, etc.). Since this book is about Oracle Database 11.2, it is worth mentioning that as of Oracle Database 10.2 the Oracle RDBMS will automatically derive a value for the DB_FILE_MULTIBLOCK_READ_COUNT parameter based on the SGA size and the value of the SESSIONS parameter – so nothing actually needs to be set to take advantage of multi-block reads. It may be stating the obvious, but the DB_FILE_MULTIBLOCK_READ_COUNT parameter has no influence over Oracle reading multiple rows from the same table block (page 171).
- The book states, “The use of this parameter [ DB_FILE_MULTIBLOCK_READ_COUNT] influences even the optimizer – if it’s less expensive to read all the rows in a table than using an index, the optimizer will use an FTS even if there are usable indexes in place.” The accuracy of this statement is Oracle Database version dependent (prior to Oracle Database 9.0.1 the statement was true; as of Oracle Database 9.0.1 the statement is false when workload system statistics have been collected) (reference page 171).
- The recipe titled “Avoiding full table scans” showed how to trigger full table scans, but did not show how to avoid full table scans, nor did it provide any advice about when full table scans should be avoided (pages 164-172).
- The book states, “The effectiveness of an index depends on the number of rows selected out of the total number of rows in the table… In the real world, an index with a selectivity of less than 10 percent [of the table’s rows] is considered suitable enough [for the index to be used].” The 10% figure is a very rough figure – the suitability of an index is dependent on many items beyond the percentage of rows that will be selected. Quoting from the documentation, “The cost of an index scan depends on the levels in the B-tree, the number of index leaf blocks to be scanned, and the number of rows to be fetched using the rowid in the index keys. The cost of fetching rows using rowids depends on the index clustering factor.” “Therefore, the optimizer’s decision to use full table scans is influenced by the percentage of blocks accessed, not rows. This is called the index clustering factor. If blocks contain single rows, then rows accessed and blocks accessed are the same.” The calculated cost determines whether or not an index will be used – the cost for an index range scan is index blevel + ceil(index selectivity * leaf blocks) + ceil(table selectivity * clustering factor). While the optimizer’s calculated cost for a particular index access path may not accurately represent the effectiveness of the index, the cost is the final deciding factor for the optimizer when determining the effectiveness of the index for a specific SQL statement (reference reference2 reference3 page 176).
- The book states, “The answer is simple – indexes cannot be used when we compare values with a not equal operator.” To clarify, the book is describing a situation involving a single table, when the only predicate on the indexed column is the inequality comparison. Adding an INDEX hint to the author’s sample SQL statement results in an INDEX FULL SCAN operation – the hint allows the SQL statement containing the not equal predicate to use the MY_CUSTOMERS_IXVALID index. It is not the case that the index cannot be used with an inequality comparison; however, the query optimizer does not automatically consider an access path involving the index due to the general assumption that many rows will be returned when all values except one are selected. To avoid the INDEX FULL SCAN operation, where the index structure is read one block at a time, the inequality could be converted to a less than predicate and a greater than predicate with a logical OR between the two predicates (reference page 176).
- The book states, “NULL values are not stored in indexes, so when we query for records with a NULL value in field X, even if the X column is indexed, the index will not be used.” The book’s description is incomplete. NULL values are not stored in single column b*tree indexes. There are at least four methods to work around this issue and allow indexes to be used to identify rows with a NULL value in the indexed column: 1) Define a composite index with at least one other column that has a NOT NULL constraint – ideally, the column in which the NULL values might appear would be the leading column in the composite index. 2) Define a composite index with a numeric constant (such as 1) as the second column in the composite index. 3) Bitmap indexes always store NULL values – if appropriate (column experiences few updates, deletes, inserts, and an Enterprise Edition database), create a bitmap index for the column. 4) If the number of NULL values in a column will be relatively small, and the original SQL statement may be modified, create a function based index that converts NULL values to 1 and non-NULL values to NULL: DECODE(C3,NULL,1), or (CASE WHEN C3 IS NULL THEN 1 ELSE NULL END), or (NVL2(C3,NULL,1)). The DECODE syntax is subject to NLS related translations – be certain to check the view USER_IND_EXPRESSIONS for the index to determine the syntax required in the WHERE clause of the SQL statements to take advantage of the function based index (page 177).
- The book states, “If direct path load would be the fastest method to insert data in tables, without constraints, the optimizer would use it by default.” This sentence does not make sense. Is the word “constraints” in this sentence to be read as a table without primary key/foreign key/unique key constraints, or is “constraints” to be read as the serialized operations/space usage “limitations” of direct path insert operations? It is these limitations that allow the direct path inserts to function without corrupting tables and indexes (page 185).
- In step 5, the book states, “Rewrite the query using the NOT IN construct:”. The example shown in the book shows a non-correlated (invalid) NOT EXISTS SQL statement rather than the NOT IN construction that is found in the source code library (page 193).
- The book states, “Even in the previous case we can see the substantial equivalence of the NOT IN and NOT EXISTS operations, related to gets/reads.” One of the problems with this test case is that the output is not showing the equivalence of the NOT IN and NOT EXISTS forms of the SQL statement – there can be a difference between the two when NULL values enter the picture (that possibility is described in pages 199-200). The issue with the test case is that the (11.2) Oracle query optimizer has transformed (as seen in a 10053 trace) both the NOT IN and NOT EXISTS queries into the following query that uses a regular join “SELECT S.AMOUNT_SOLD AMOUNT_SOLD FROM SH.CUSTOMERS C,SH.SALES S WHERE S.CUST_ID=C.CUST_ID AND (C.CUST_CREDIT_LIMIT=10000 OR C.CUST_CREDIT_LIMIT=11000 OR C.CUST_CREDIT_LIMIT=15000)” (page 197).
- When describing the output of the ANSI left outer join version of the NOT IN/NOT EXISTS SQL statement, the book states, “Even if we have the same statistics and almost the same execution plan, the meaning of the last query isn’t as intuitive as in the previous case… and we have seen that there is no performance improvement (or detriment) in doing so.” The problem here is that the (11.2) Oracle query optimizer has transformed the ANSI left outer join SQL statement into a SQL statement that is very similar to the transformed NOT IN/NOT EXISTS version of the query. In older Oracle Database versions, where these automatic transformations would not take place, the left outer join syntax is often much faster than the equivalent NOT IN or NOT EXISTS syntax. The (11.2) optimizer transformed version of the SQL statement follows, “SELECT S.AMOUNT_SOLD AMOUNT_SOLD FROM SH.SALES S,SH.CUSTOMERS C WHERE C.CUST_ID IS NULL AND S.CUST_ID=C.CUST_ID(+) AND (C.CUST_CREDIT_LIMIT(+)=10000 OR C.CUST_CREDIT_LIMIT(+)=11000 OR C.CUST_CREDIT_LIMIT(+)=15000)” (page 198).
- The book states, “We have to set the destination for our trace files also. When using dedicated servers, the parameter is USER_DUMP_DEST. In the multi-threaded server environment the parameter is BACKGROUND_DUMP_DEST…” This recipe is oddly reminiscent of pages 452 through 458 of the book “Expert One-On-One Oracle”, and seems to be more applicable to Oracle Database 8.1 than to Oracle Database 11.2. In Oracle Database 11.1 and 11.2, by default the USER_DUMP_DEST and BACKGROUND_DUMP_DEST parameters both point at the same directory named “trace”. The TIMED_STATISTICS parameter defaults to TRUE because the STATISTICS_LEVEL parameter defaults to TYPICAL in Oracle Database 10.1 and above, so it is not necessary to modify the TIMED_STATISTICS parameter. The use of “ALTER SESSION SET SQL_TRACE=TRUE;”, as shown in the recipe, is deprecated as of Oracle Database 10.2 (see Metalink Doc ID 30820.1). “Multi-threaded server” was renamed to “shared server” with the release of Oracle Database 9.0.1 (pages 201-205).
- Appendix A and Appendix B include a somewhat random sampling of various Oracle Database performance views and database packages. The depth of coverage of the various views and packages rarely extends beyond a brief summary of a couple of view columns or procedures in the package. Think of these appendixes as a condensed and reformatted version of the Oracle Database documentation book contents.
- The descriptions of all of the columns in the V$SESSION dynamic performance view are shortened verbatim copies or slightly reworded copies of the descriptions found in the Oracle Database 11.2 Reference book in the Oracle Database documentation library – below are comments about three randomly selected entries in the appendixes (page 490).
- The descriptions of all of the procedures listed for the DBMS_OUTLN package are copied verbatim from the descriptions found in the Oracle Database 11.2 Supplied PL/SQL Packages book in the Oracle Database documentation library (page 505).
- The descriptions of the REFRESH_PRIVATE_OUTLINES and DROP_EDIT_TABLES procedures of the DBMS_OUTLN_EDIT package are verbatim copies of the descriptions found in the Oracle Database 9.2 Supplied PL/SQL Packages book in the Oracle Database documentation library (page 504).
- The descriptions of the LOAD_PLANS_FROM_CURSOR_CACHE, LOAD_PLANS_FROM_SQLSET, and DROP_SQL_PLAN_BASELINE procedures of the DBMS_SPM package are verbatim copies of the descriptions found in the Oracle Database 11.2 Supplied PL/SQL Packages book in the Oracle Database documentation library (page 505).
- Seven steps for solving a performance problem – might lead to compulsive tuning disorder if followed to the letter, but still a good general practice (page 12).
- Book seems to use a lot of forward and backward references.
- Good (simple) example of SQL injection risks (pages 161-163).
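To put some numbers behind the comment about the 10 percent rule of thumb (page 176), the index range scan cost formula quoted in that comment can be worked through by hand. The following is only a rough Python sketch of that arithmetic: the function name and all of the statistic values are hypothetical, and the real optimizer also applies CPU costing and other adjustments that are not modeled here.

```python
import math


def index_range_scan_cost(blevel, leaf_blocks, clustering_factor,
                          index_selectivity, table_selectivity):
    """Estimated I/O cost of an index range scan, following the formula
    blevel + ceil(index selectivity * leaf blocks)
           + ceil(table selectivity * clustering factor).
    """
    return (blevel
            + math.ceil(index_selectivity * leaf_blocks)
            + math.ceil(table_selectivity * clustering_factor))


# Hypothetical statistics (not taken from the book): a table with
# 1,000,000 rows stored in 10,000 blocks, and an index with 2,400 leaf blocks.
# Both cases select the same 6.25% of the rows.
selectivity = 0.0625

# Clustering factor close to the number of table blocks (well ordered rows).
well_clustered = index_range_scan_cost(2, 2400, 10_000,
                                       selectivity, selectivity)

# Clustering factor close to the number of table rows (scattered rows).
poorly_clustered = index_range_scan_cost(2, 2400, 1_000_000,
                                         selectivity, selectivity)

print(well_clustered)    # 777
print(poorly_clustered)  # 62652
```

With the same percentage of rows selected, the calculated cost differs by a factor of roughly 80 depending only on the clustering factor, which is why a fixed percentage of rows is a poor predictor of whether or not an index will be selected.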
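The comment about page 197 mentions that NOT IN and NOT EXISTS are not always equivalent once NULL values enter the picture. A minimal sketch of that difference follows, with two loud assumptions: it uses the SQLite engine bundled with Python rather than Oracle Database, and the two small tables are hypothetical stand-ins for the SH.CUSTOMERS and SH.SALES tables. The three-valued logic involved is standard SQL, but the sketch shows nothing about Oracle's query transformations or execution plans.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (an assumption made
# only so that the example is self-contained and runnable).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER);
    CREATE TABLE sales (cust_id INTEGER, amount_sold INTEGER);
    -- One customer row has a NULL cust_id.
    INSERT INTO customers VALUES (1), (2), (NULL);
    -- The sale for cust_id 3 has no matching customer.
    INSERT INTO sales VALUES (1, 10), (3, 30);
""")

# NOT IN: "3 NOT IN (1, 2, NULL)" evaluates to NULL rather than TRUE,
# so the unmatched sale is filtered out.
not_in_rows = conn.execute("""
    SELECT s.amount_sold
    FROM sales s
    WHERE s.cust_id NOT IN (SELECT c.cust_id FROM customers c)
""").fetchall()

# NOT EXISTS: the correlated subquery finds no row for cust_id 3,
# so the unmatched sale is returned.
not_exists_rows = conn.execute("""
    SELECT s.amount_sold
    FROM sales s
    WHERE NOT EXISTS (SELECT 1 FROM customers c
                      WHERE c.cust_id = s.cust_id)
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(30,)]
conn.close()
```

A single NULL in the subquery column is enough to make the NOT IN form return no rows at all, so the two forms may only be treated as interchangeable when the column is known to be free of NULL values (for example, when a NOT NULL constraint is in place).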
Part 2 of this book review is expected to cover the second half of the book, including the freely downloadable chapter 10.
Blog articles that reference the “Oracle Database 11gR2 Performance Tuning Cookbook” book: