Histograms – What is Wrong with this Quote?

July 4, 2012

It has been several months since I posted my review of the first half of the “Oracle Database 11gR2 Performance Tuning Cookbook” book.  I had planned to post the review for the second half of the book a couple of weeks later, however four months have passed by since that time, and the second half of the review has yet to materialize.  Someone recently left the following comment attached to one of my reviews on Amazon:

“This is the most comprehensive feedback I have ever read!!”

I appreciate the comment.  Keep in mind that particular review was only roughly 9.5 typewritten pages in length; it takes a lot of time to put together an 18, 24, or 35 typewritten page review that is just as comprehensive as the one that elicited the above comment.

On to the topic of this article…

While reading the “Oracle Database 11gR2 Performance Tuning Cookbook” book, I noticed an interesting set of statements about the use of histograms (bolded sections appeared bolded in the book):

“Histograms are useful only on indexed columns containing skewed values, because they help the optimizer to choose whether to use the index or not to access values.  Obviously, if the frequency for a specific value is very high, using the index won’t be the best choice….”

“Don’t use histograms in situations where:

  • The column is not used in the WHERE clauses of the queries
  • The data in the column is uniformly distributed
  • Bind variables are used when comparing against the column”

What, if anything, is wrong with the above quote from the book?  If possible, provide test cases to support or refute the above quote from the book (see the tips at the bottom of the blue section at the right of this blog page regarding how to post code sections).
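
A starting point for such a test case might be sketched as follows.  The table name T_HIST and its column layout are hypothetical (they do not come from the book).  Column C2 is skewed and deliberately left without an index, so any difference in the optimizer's row estimate between the two statistics collections can only come from the histogram.  No output is shown here; compare the Rows column of the two plans on your own system:

```sql
-- Hypothetical test table: C2 is 1 for every 1,000th row, otherwise 0; no indexes
CREATE TABLE T_HIST AS
SELECT
  ROWNUM C1,
  DECODE(MOD(ROWNUM,1000),0,1,0) C2
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

-- First pass: statistics without histograms (SIZE 1)
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T_HIST',ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL COLUMNS SIZE 1')

EXPLAIN PLAN FOR SELECT * FROM T_HIST WHERE C2=1;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Second pass: request a histogram on the unindexed column C2
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T_HIST',ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR COLUMNS C2 SIZE 254')

EXPLAIN PLAN FOR SELECT * FROM T_HIST WHERE C2=1;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the estimated row count changes between the two plans, that is at least a hint that a histogram may matter for reasons other than choosing between an index and a full table scan (join order and join method decisions depend on the same estimates).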

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.





Temporary Tablespace Storage Parameters – What is Wrong with this Quote

April 22, 2012 (Modified April 24, 2012)

I had intended to finish assembling the second half of the “Oracle Database 11gR2 Performance Tuning Cookbook” book review, however my free time that may be dedicated to book reviews has been a bit limited lately (I have a review of another book started, but left untouched for the last three months).  On a side note, I am a little shocked that none of the 21 errata reports that I filed for the first 88 pages of the book made it onto the publisher’s website, but maybe that is not too uncommon (the same has happened for at least one Apress book). 

I encountered an interesting quote on page 252 of the “Oracle Database 11gR2 Performance Tuning Cookbook” that I thought I would share with readers:

“The PCTINCREASE parameter cannot be specified when we use the AUTOALLOCATE option for the tablespace. We can have different tablespaces for each database user and we can query the dynamic performance view V$SORT_USAGE to retrieve the details about the active disk sorts occurring in the instance.

We can obtain better performance by striping the temporary tablespace using multiple disks. Please note that – due to their nature – temporary tablespaces are not affected by backup and restore operations.”

If you own the book, start on page 251 and read through the end of page 252 – there are a couple of additional interesting sentences in this section of the book.

What, if anything, is wrong with the above quote from the book? For those readers that need some assistance with this task, the Oracle Database 11.2 documentation might be helpful.

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.

(Added April 24, 2012, my book review notes for the four quoted sentences, plus one of the preceding sentences)

 

  • The book’s discussion of defining the INITIAL and NEXT storage parameters for temporary tablespaces as a multiple of the SORT_AREA_SIZE parameter seems to be out of place in an Oracle Database 11g R2 performance tuning book – dictionary managed tablespaces were deprecated with the release of Oracle Database 9.2, and the effect of these parameters is different in locally managed tablespaces (page 252).
  • The book states, “The PCTINCREASE parameter cannot be specified when we use the AUTOALLOCATE option for the tablespace.”  This statement appears to be incorrect – the Oracle Database documentation states that a specified value for PCTINCREASE will be ignored when creating an AUTOALLOCATE tablespace (page 252).
  • The book states, “We can have different tablespaces for each database user and we can query the dynamic performance view V$SORT_USAGE to retrieve the details about the active disk sorts occurring in the instance.”  This statement is a bit confusing, so it is probably best to break the sentence into two separate logical sentences for analysis.  The first half of the statement seems to suggest that a separate temp tablespace should (or could) be created for each user – I am not sure that this is the author’s intended interpretation; the default temporary tablespace may be set at the user level so that not all users are required to use the same (default) temporary tablespace.  For the second logical sentence, the V$SORT_USAGE performance view was deprecated with the release of Oracle Database 9.2, replaced with V$TEMPSEG_USAGE.  In Oracle Database 11.2 (the version mentioned on the book’s front cover), the V$SORT_USAGE view is based on the GV$SORT_USAGE view which selects from X$KTSSO (confirm by issuing SELECT VIEW_DEFINITION FROM V$FIXED_VIEW_DEFINITION WHERE VIEW_NAME='GV$SORT_USAGE';).  The GV$SORT_USAGE view definition, which is pointed to by both the V$SORT_USAGE and V$TEMPSEG_USAGE synonyms, indicates that the SEGTYPE column values may be one of SORT, HASH, DATA, INDEX, LOB_DATA, LOB_INDEX, or UNDEFINED.  The potential values of the SEGTYPE column suggest that the view potentially shows a variety of activity in addition to what the book mentions (the WHERE clause should be adapted to restrict the rows returned from this view, so that the results are consistent with this chapter’s contents).  The V$SORT_USAGE view also shows entries for aborted SQL statements, for instance when Ctrl-C is pressed in SQL*Plus and another SQL statement has not been executed by the session (page 252).
  • The book states, “Please note that – due to their nature – temporary tablespaces are not affected by backup and restore operations.”  This is a vague statement at best, and the accuracy of the statement depends in part on the type of backup performed (online or offline, user-managed or RMAN hot backup, etc.).  An online RMAN backup will not back up the tempfiles of locally managed temporary tablespaces; however, when those backed up databases are restored and recovered, Oracle Database 10.2 and later will recreate the temporary tablespace tempfiles when the database is opened (page 252).
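
The notes above about V$TEMPSEG_USAGE and user-level temporary tablespaces can be checked with a few queries.  This is a minimal sketch that assumes the session has access to the V$ views and to DBA_USERS; the column lists are my own selection and are not taken from the book:

```sql
-- Restrict the view to sort segments only; the other SEGTYPE values
-- (HASH, DATA, INDEX, LOB_DATA, LOB_INDEX) are not disk sorts
SELECT
  USERNAME,
  TABLESPACE,
  SEGTYPE,
  EXTENTS,
  BLOCKS
FROM
  V$TEMPSEG_USAGE
WHERE
  SEGTYPE='SORT';

-- Show which temporary tablespace is assigned to each user
SELECT
  USERNAME,
  TEMPORARY_TABLESPACE
FROM
  DBA_USERS
ORDER BY
  USERNAME;

-- Confirm the definition that sits behind V$SORT_USAGE and V$TEMPSEG_USAGE
SELECT
  VIEW_DEFINITION
FROM
  V$FIXED_VIEW_DEFINITION
WHERE
  VIEW_NAME='GV$SORT_USAGE';
```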




Repeat After Me: NULL Values are Not Stored in Indexes?

February 28, 2012

I do not always get as much benefit from the books that I read as the books’ authors probably intended, although the contents of books, whether right or wrong, sometimes help me remember nearly forgotten facts.  Some of the books, for instance “Pro Oracle SQL”, describe how to use b*tree indexes to locate rows with NULL values in a column.  The book “Expert Oracle Database Architecture: Oracle Database Programming 9i, 10g, and 11g Techniques and Solutions, Second Edition” also shows several techniques for utilizing b*tree indexes to locate rows with NULL values in a column.  Richard Foote’s blog also has at least one article that describes how to use b*tree indexes to locate NULL values in a column.

So, what headed me down the path of NULL Values are Not Stored in Indexes?  I read the following sentence in the book “Oracle Database 11gR2 Performance Tuning Cookbook” on page 177:

“NULL values are not stored in indexes, so when we query for records with a NULL value in field X, even if the X column is indexed, the index will not be used.”

I can’t help but feel that I have seen very similar statements in the past.  The Oracle Database documentation, at least from 8.1 through 11.1, included the following quote:

“The query must be guaranteed not to need any NULL values from the indexed expression, because NULL values are not stored in indexes.”

But where else have I seen that message? A Google search for: oracle NULL values are not stored in indexes found a couple of places where the message is repeated.

“Oracle Database 11g: The Complete Reference” (I read the 9i version of this book years ago):

“NULL values are not stored in indexes. Therefore, the following query will not use an index; there is no way the index could help to resolve the query:”

select Title
  from BOOKSHELF
 where CategoryName is null;

“Oracle 9i Performance Tuning Tips & Techniques” page 39 (I quickly paged through the 10g book in a book store once, and read the sample chapter on Statspack reports):

“Using IS NULL or IS NOT NULL will also suppress index use because the value of NULL is undefined.”

“Oracle Data Warehouse Tuning for 10g” page 51 (I have not had the opportunity to read this book):

“For a BTree index, NULL values are not included in the BTree structure and, thus, not even accessible through the index.”

“Expert Indexing in Oracle Database 11g” page 159 (any problems here – I think that I have a test case somewhere that suggests that the second column should be a number, I thought about buying this book):

“If all index columns are NULL, Oracle Database doesn’t include rows into an index. However, you can actually index NULL values by simply adding another column to the index, like so:

SQL> create index with_null on employees(nullable_column, '1');

So, what is my review comment for the Cookbook?

The book states, “NULL values are not stored in indexes, so when we query for records with a NULL value in field X, even if the X column is indexed, the index will not be used.”  The book’s description is incomplete.  NULL values are not stored in single column b*tree indexes.  There are at least four methods to work around this issue and allow indexes to be used to identify rows with a NULL value in the indexed column:

  1. Define a composite index with at least one other column that has a NOT NULL constraint – ideally, the column in which the NULL values might appear would be the leading column in the composite index.
  2. Define a composite index with a numeric constant (such as 1) as the second column in the composite index.
  3. Bitmap indexes always store NULL values – if appropriate (column experiences few updates, deletes, inserts, and an Enterprise Edition database), create a bitmap index for the column.
  4. If the number of NULL values in a column will be relatively small (compared to the number of rows in the table), and the original SQL statement may be modified, create a function based index that converts NULL values to 1 and non-NULL values to NULL:
    • DECODE(C3,NULL,1)
    • (CASE WHEN C3 IS NULL THEN 1 ELSE NULL END)
    • (NVL2(C3,NULL,1))

Let’s build a little test case to demonstrate.  First, a table is created with 1,000,000 rows, two indexes are created, and then statistics are gathered with histograms generated for all indexed columns:

CREATE TABLE T2 (
  C1 NUMBER NOT NULL,
  C2 VARCHAR2(15) NOT NULL,
  C3 DATE,
  C4 VARCHAR2(255));

INSERT INTO T2
SELECT
  ROWNUM C1,
  RPAD(CHR(MOD(ROWNUM-1,26)+65),15,'0') C2,
  DECODE(MOD(ROWNUM,100),0,NULL,SYSDATE+ROWNUM/10000) C3,
  RPAD('A',255,'A') C4
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

CREATE INDEX IND_T2_C3_C2 ON T2(C3,C2);
CREATE INDEX IND_T2_C3_C ON T2(C3,1);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL INDEXED COLUMNS SIZE 254') 

In the above table, 1% of the rows will have a NULL value in column C3.
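
The 1% figure follows from the DECODE in the INSERT statement: MOD(ROWNUM,100) is 0 for every 100th row, so 10,000 of the 1,000,000 rows receive a NULL in column C3.  A quick sanity check of the data and of the index entry counts might look like the sketch below (the column aliases are my own).  If an index holds an entry for every row, including the rows where C3 is NULL, its NUM_ROWS statistic should match the table’s row count:

```sql
-- Count all rows, rows with a non-NULL C3, and rows with a NULL C3
SELECT
  COUNT(*) TOTAL_ROWS,
  COUNT(C3) NOT_NULL_C3,
  COUNT(*)-COUNT(C3) NULL_C3
FROM
  T2;

-- Compare the number of index entries with the number of table rows
SELECT
  INDEX_NAME,
  NUM_ROWS,
  LEAF_BLOCKS
FROM
  USER_INDEXES
WHERE
  TABLE_NAME='T2'
ORDER BY
  INDEX_NAME;
```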

Let’s try the first test to see if one of the above two indexes may be used to locate the NULL values in column C3:

SELECT
  C1,
  C2,
  C3
FROM
  T2
WHERE
  C2='D00000000000000'
  AND C3 IS NULL;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  3sjqqpjradbgz, child number 0
-------------------------------------
SELECT   C1,   C2,   C3 FROM   T2 WHERE   C2='D00000000000000'   AND C3
IS NULL

Plan hash value: 895813321

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       |   389 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2           |   385 | 14245 |   389   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T2_C3_C2 |   385 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C3" IS NULL AND "C2"='D00000000000000')
       filter("C2"='D00000000000000') 

Well, it appears that the composite index on columns C3 and C2 might have helped quickly locate the rows with NULL values in column C3 (we just tested point #1 above).  Let’s drop that index and try again:

DROP INDEX IND_T2_C3_C2;

SELECT
  C1,
  C2,
  C3
FROM
  T2
WHERE
  C2='D00000000000000'
  AND C3 IS NULL;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  3sjqqpjradbgz, child number 0
-------------------------------------
SELECT   C1,   C2,   C3 FROM   T2 WHERE   C2='D00000000000000'   AND C3
IS NULL

Plan hash value: 1053304445

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |       |       |   550 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T2          |   385 | 14245 |   550   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T2_C3_C | 10000 |       |    33   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='D00000000000000')
   2 - access("C3" IS NULL) 

This time we cannot simply state that “NULL Values are Not Stored in Indexes”.  The Predicate Information section of the execution plan shows access("C3" IS NULL) (we just tested point #2 above).

Let’s drop the index and try something else:

DROP INDEX IND_T2_C3_C;

CREATE BITMAP INDEX IND_T2_C3_BIN ON T2(C3);

SELECT
  C1,
  C2,
  C3
FROM
  T2
WHERE
  C2='D00000000000000'
  AND C3 IS NULL;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  3sjqqpjradbgz, child number 0
-------------------------------------
SELECT   C1,   C2,   C3 FROM   T2 WHERE   C2='D00000000000000'   AND C3
IS NULL

Plan hash value: 1153509852

----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |       |       |  1954 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID | T2            |   385 | 14245 |  1954   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|               |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | IND_T2_C3_BIN |       |       |            |          |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='D00000000000000')
   3 - access("C3" IS NULL) 

Once again, we found that we cannot simply state that “NULL Values are Not Stored in Indexes”.  The Predicate Information section of the execution plan shows access("C3" IS NULL) (we just tested point #3 above).

Let’s drop the index and try just indexing the NULLs (sub-point 1 of point #4 above):

DROP INDEX IND_T2_C3_BIN;

CREATE INDEX IND_T2_C3_FN ON T2 (DECODE(C3,NULL,1));

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL HIDDEN COLUMNS SIZE 254 FOR ALL INDEXED COLUMNS SIZE 254',NO_INVALIDATE=>FALSE)

SELECT /*+ INDEX(T2) */
  C1,
  C2,
  C3,
  NVL2(C3,NULL,1) C4
FROM
  T2
WHERE
  C2='D00000000000000'
  AND DECODE(C3,NULL,1)=1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  8psj7gcwcn72m, child number 0
-------------------------------------
SELECT /*+ INDEX(T2) */   C1,   C2,   C3,   NVL2(C3,NULL,1) C4 FROM
T2 WHERE   C2='D00000000000000'   AND DECODE(C3,NULL,1)=1

Plan hash value: 2303772603

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       |  3752 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T2           |   385 | 14245 |  3752   (1)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | IND_T2_C3_FN | 10000 |       |  3234   (2)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='D00000000000000')
   2 - filter(DECODE(INTERNAL_FUNCTION("C3"),NULL,1)=1) 

The index that we created was used.  Once again, we found that we cannot simply state that “NULL Values are Not Stored in Indexes”.

Let’s drop the index and try just indexing the NULLs (sub-point 2 of point #4 above):

DROP INDEX IND_T2_C3_FN;

CREATE INDEX IND_T2_C3_FN ON T2 (CASE WHEN C3 IS NULL THEN 1 ELSE NULL END);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL HIDDEN COLUMNS SIZE 254 FOR ALL INDEXED COLUMNS SIZE 254',NO_INVALIDATE=>FALSE)

SELECT /*+ INDEX(T2) */
  C1,
  C2,
  C3,
  NVL2(C3,NULL,1) C4
FROM
  T2
WHERE
  C2='D00000000000000'
  AND (CASE WHEN C3 IS NULL THEN 1 ELSE NULL END)=1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  2w9h5jm4tdzpz, child number 0
-------------------------------------
SELECT /*+ INDEX(T2) */   C1,   C2,   C3,   NVL2(C3,NULL,1) C4 FROM
T2 WHERE   C2='D00000000000000'   AND (CASE WHEN C3 IS NULL THEN 1 ELSE
NULL END)=1

Plan hash value: 941108248

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       | 10024 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T2           |   385 | 15015 | 10024   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T2_C3_FN | 10000 |       |    20   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='D00000000000000')
   2 - access("T2"."SYS_NC00005$"=1) 

The index that we created was used, although notice that the Predicate Information section of the plan is a bit different from before, and this time we have an INDEX RANGE SCAN operation rather than an INDEX FULL SCAN operation.  Once again, we found that we cannot simply state that “NULL Values are Not Stored in Indexes”.

Let’s drop the index and try just indexing the NULLs (sub-point 3 of point #4 above):

DROP INDEX IND_T2_C3_FN;

CREATE INDEX IND_T2_C3_FN ON T2 (NVL2(C3,NULL,1));

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL HIDDEN COLUMNS SIZE 254 FOR ALL INDEXED COLUMNS SIZE 254',NO_INVALIDATE=>FALSE)

SELECT
  C1,
  C2,
  C3,
  NVL2(C3,NULL,1) C4
FROM
  T2
WHERE
  C2='D00000000000000'
  AND NVL2(C3,NULL,1)=1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  fmsg8vztz32d2, child number 0
-------------------------------------
SELECT   C1,   C2,   C3,   NVL2(C3,NULL,1) C4 FROM   T2 WHERE
C2='D00000000000000'   AND NVL2(C3,NULL,1)=1

Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |  3282 (100)|          |
|*  1 |  TABLE ACCESS FULL| T2   |   385 | 14245 |  3282   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("C2"='D00000000000000' AND NVL2("C3",NULL,1)=1)) 

The index that we created was… not used?  But is it not the case that the NVL2(C3,NULL,1) function result is the equivalent of the DECODE and the CASE function results?  Let’s give this one another try with a hint:

SELECT /*+ INDEX(T2) */
  C1,
  C2,
  C3,
  NVL2(C3,NULL,1) C4
FROM
  T2
WHERE
  C2='D00000000000000'
  AND NVL2(C3,NULL,1)=1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

SQL_ID  f510adk7fqwyu, child number 0
-------------------------------------
SELECT /*+ INDEX(T2) */   C1,   C2,   C3,   NVL2(C3,NULL,1) C4 FROM
T2 WHERE   C2='D00000000000000'   AND NVL2(C3,NULL,1)=1

Plan hash value: 2303772603

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       |  3740 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T2           |   385 | 14245 |  3740   (1)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | IND_T2_C3_FN | 10000 |       |  3223   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='D00000000000000')
   2 - filter(NVL2("C3",NULL,1)=1) 

That’s better, even if we did receive an INDEX FULL SCAN operation rather than an INDEX RANGE SCAN operation as was the intended result.

Something fun to think about – why did the technique using the CASE syntax that resulted in an INDEX RANGE SCAN operation have a plan with a calculated cost of 10,024, when the plans with the INDEX FULL SCAN operations have a calculated cost of about 3,750?
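
One place to start looking (a sketch only, and not necessarily the complete answer) is the set of statistics that the optimizer uses when costing table access through an index.  The query below assumes that the IND_T2_C3_FN index from the CASE test still exists and that statistics were gathered as shown above; compare the index’s CLUSTERING_FACTOR with the number of blocks in the table and with the costs reported in the execution plans:

```sql
-- Index statistics next to the table's block count
SELECT
  I.INDEX_NAME,
  I.BLEVEL,
  I.LEAF_BLOCKS,
  I.NUM_ROWS,
  I.CLUSTERING_FACTOR,
  T.BLOCKS TABLE_BLOCKS
FROM
  USER_INDEXES I,
  USER_TABLES T
WHERE
  I.TABLE_NAME=T.TABLE_NAME
  AND I.TABLE_NAME='T2';
```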





Full Table Scans and the Buffer Cache in 11.2 – What is Wrong with this Quote?

February 26, 2012 (Modified February 27, 2012)

I found another interesting quote in the “Oracle Database 11gR2 Performance Tuning Cookbook”, this time related to tables and full table scans.  This quote is found on page 170 of the book:

“If we do an FTS [full table scan], database buffers are used to read all the table data, and this situation may lead to flushing the buffer cache data to make room for the FTS data. To avoid this situation and to limit the consequences on the database buffer cache, the database blocks from FTS operations are put on the top of the LRU (Least Recently Used) list.”

What, if anything, is wrong (and/or right) with the above quote from the book?
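
For readers who want to investigate rather than guess, one possible approach is to compare a few session-level statistics before and after a full table scan of a large table in the same session.  This sketch does not settle the question by itself; the statistic names are the ones that I would expect to find in V$STATNAME on 11.2, so treat them as an assumption and adjust the list if a name is not present on your version:

```sql
-- Run once before and once after the full table scan, then compare the values
SELECT
  SN.NAME,
  MS.VALUE
FROM
  V$MYSTAT MS,
  V$STATNAME SN
WHERE
  MS.STATISTIC#=SN.STATISTIC#
  AND SN.NAME IN (
    'physical reads cache',
    'physical reads direct',
    'table scans (long tables)',
    'table scans (short tables)')
ORDER BY
  SN.NAME;
```

A change in the direct read statistic rather than the cache read statistic would suggest that the blocks read by the scan did not pass through the buffer cache at all, which is worth keeping in mind when evaluating the quote.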

Added February 27, 2012:

Part 2:

An additional interesting quote is found on page 176 related to full table scans when indexes are present.  The test case that follows is slightly different than what is presented in the book, however the outcome is the same.  Consider the following table and indexes (note that histograms will be created on columns C1 and C2, and that column C2 will have a single row with a 1 value and 999,999 rows with a 0 value):

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  DECODE(ROWNUM,1,1,0) C2,
  LPAD('A',255,'A') C3
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

CREATE UNIQUE INDEX IND_T1_C1 ON T1(C1);
CREATE INDEX IND_T1_C2 ON T1(C2);

ALTER TABLE T1 MODIFY (C1 NOT NULL, C2 NOT NULL);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL INDEXED COLUMNS SIZE 254') 

The test case script:

SET LINESIZE 120
SET PAGESIZE 1000

SELECT
  C1,
  C2
FROM
  T1
WHERE
  C2=1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));
 
SELECT
  C1,
  C2
FROM
  T1
WHERE
  C2<>0;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

The (slightly reformatted) output from my execution of the above script on Oracle Database 11.2.0.2:

SQL> SELECT
  2    C1,
  3    C2
  4  FROM
  5    T1
  6  WHERE
  7    C2=1;

        C1         C2
---------- ----------
         1          1

SQL_ID  8fv30tbr8jdds, child number 0
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   C2=1

Plan hash value: 236868917

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |     1 |     8 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T1_C2 |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C2"=1) 

---

SQL> SELECT
  2    C1,
  3    C2
  4  FROM
  5    T1
  6  WHERE
  7    C2<>0;

        C1         C2
---------- ----------
         1          1

SQL_ID  bu17044puyhkx, child number 0
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   C2<>0

Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |  3049 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |     8 |  3049   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"<>0)

Notice in the above execution plans that the Oracle query optimizer correctly determined that only 1 row would be returned in both cases, yet in the first case an index was used, and in the second case a full table scan.  The book states the following:

“Why did the database optimizer switch back to a long-running FTS operation, instead of the previous Index Range Scan? The answer is simple – indexes cannot be used when we compare values with a not equal operator.”

I have seen the above answer, with slight variations, provided in at least two other books.  What, if anything, is wrong (and/or right) with the above quote from the book?
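
One way to probe the book’s claim is to ask the optimizer directly, first with an index hint and then with the inequality rewritten as two range predicates.  This is a sketch against the T1 table and IND_T1_C2 index created above; no output is shown, so examine the Operation and Predicate Information sections of the plans on your own system:

```sql
-- Ask for the index explicitly while keeping the not equal operator
SELECT /*+ INDEX(T1 IND_T1_C2) */
  C1,
  C2
FROM
  T1
WHERE
  C2<>0;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));

-- Logically equivalent predicate written as two ranges
SELECT
  C1,
  C2
FROM
  T1
WHERE
  C2<0
  OR C2>0;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));
```

If either plan shows an operation on IND_T1_C2, then “cannot be used” is too strong a statement, and the more interesting question becomes why the optimizer did not select that plan without help.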

The point of blog articles like this one is not to insult authors who have spent thousands of hours carefully constructing an accurate and helpful book, but instead to suggest that readers investigate when something stated does not exactly match what one believes to be true. It could be that the author “took a long walk down a short pier”, or that the author is revealing accurate information which simply cannot be found through other resources (and may in the process be directly contradicting information sources you have used in the past). If you do not investigate in such cases, you may lose an important opportunity to learn something that could prove to be extremely valuable.





Interesting Index “Facts” – What is Wrong with these Quotes?

February 22, 2012

There are some amazing (or just plain interesting) facts to be found about how indexes work in Oracle Database.  One Internet site in particular that has been helpful in my understanding of Oracle indexes is Richard Foote’s blog.  I occasionally see information on the Internet, in the Oracle documentation, in Metalink (My Oracle Support), and in books that conflict with one another – just how does one determine what information is correct, and what information is incorrect? 

If you paid money to learn something, should you just assume that what you are learning is correct?  What if test cases are included, should you just assume that the test case is showing what the document claims that the test case shows?  What if, you never ask “What if?” … what if, you do not know when to ask “What if?”

As I mentioned in a previous article, I am in the process of reading and reviewing the book “Oracle Database 11gR2 Performance Tuning Cookbook”, and I seem to be spending quite a bit of time asking What if … or why does this statement not agree with what I believe to be true?  For this blog article, I have pulled seven quotes from the book that will be mentioned in my review of the book – I thought that I would give the readers of this blog an opportunity to ask, “What if”.  As always, I attempted to keep the length of the quotes to a minimum – just long enough so that the context (and meaning) of the quote is not lost.

#1 Page 88:

“Faster scans: tables and indexes that require full scans can see faster performance when placed in a large block size.”

#2 Page 119:

“We create the IX3_CUSTOMERS index to show how we can create a descending index, that is, an index which stores the data for one or more fields in descending order…

We can also create a function-based descending index.”

#3 Page 119:

“… this test allows us to dispel a myth. Oracle uses the indexes even if the leading columns are not referenced in the WHERE predicate of the query.  We can see that in such a case, the operation will be an INDEX FAST FULL SCAN.  Conversely, when we reference the leading column of the index, we will end up in an INDEX RANGE SCAN.”

#4 Page 120:

“The last type [descending indexes] is useful when we want to use queries in which we want the data sorted in a mixed ascending/descending way on the fields indexed. If we use a regular index to access the data, Oracle is unable to do the sort in a mixed way, in a query like this…  ORDER BY FIELD1 DESC, FIELD2”

#5 Page 127:

“However, when we have a table on which there are many INSERTs and DELETEs, we could schedule an index rebuild, because when deleting an index entry, the space is not freed in the index leaf, but just marked as deleted. If we have massive DELETE and INSERT operations, we could have a skewed index structure, which could slow performance due to intra-block fragmentation.” 

#6 Page 127:

“If the value for DEL_LF_ROWS/LF_ROWS is greater than 2, or LF_ROWS is lower than LF_BLKS, or HEIGHT is 4 then the index should be rebuilt.”

#7 Page 139:

“When rows are frequently inserted, deleted, and updated, there is a performance bottleneck if we use a bitmap index. When the index is updated, all the bitmap segments are locked.”

What, if anything, is wrong (and/or right) with the above quotes from the book?
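
One possible starting point for investigating quote #6 is to look at the statistics that the quote refers to.  The sketch below is not from the book: the table name T1, the index name IND_T1_C1, the row count, and the delete pattern are all arbitrary assumptions made for illustration.  ANALYZE INDEX … VALIDATE STRUCTURE populates the single-row INDEX_STATS view for the current session; it blocks DML against the table while it runs, so a test database is the place to try this.

```sql
-- Hypothetical test table and index (names and sizes are assumptions)
CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  RPAD('A', 100, 'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL <= 10000;

CREATE INDEX IND_T1_C1 ON T1(C1);

-- Delete every other row so that the index contains deleted entries
DELETE FROM T1 WHERE MOD(C1, 2) = 0;
COMMIT;

-- Populate INDEX_STATS for this session (blocks DML on T1 while running)
ANALYZE INDEX IND_T1_C1 VALIDATE STRUCTURE;

-- The statistics named in the quote, plus the ratio that the quote tests
SELECT
  HEIGHT,
  LF_ROWS,
  LF_BLKS,
  DEL_LF_ROWS,
  ROUND(DEL_LF_ROWS / NULLIF(LF_ROWS, 0), 4) DEL_RATIO
FROM
  INDEX_STATS;
```

Comparing the HEIGHT, LF_ROWS, LF_BLKS, and DEL_RATIO values with the thresholds stated in the quote, and then repeating the ANALYZE after inserting additional rows, is one way to start asking “What if?”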






Directly Executing a Query is Slower than Returning a REFCURSOR from a PL/SQL Procedure – What is Wrong with this Quote?

February 20, 2012

My copy of the “Oracle Database 11gR2 Performance Tuning Cookbook” arrived from Amazon, and I will say that I like the seven steps for solving performance problems that are found on page 12, although the diagram of the process on page 14 may lead to a condition known as Compulsive Tuning Disorder.

I am delighted to see that the book makes use of test case scripts, which often allow the reader to demonstrate that the book’s suggestion works in the reader’s Oracle environment.  One such test case is provided on pages 54-55.  I will not reproduce the test case script here, but the script may be found in the download library for the book, in chapter 2’s 2602_02_StoredProcedure.sql file.  The download library for the book may be found by selecting the book from this page, and entering your email address.

After connecting to the database:

sqlplus /@TESTDB AS SYSDBA

The test case script compares the performance of this SQL statement that is directly executed in SQL*Plus:

SELECT PROD_ID, SUM(AMOUNT_SOLD) FROM SH.SALES GROUP BY PROD_ID; 

With the performance of returning a SYS_REFCURSOR to a SQL*Plus variable that is declared as a REFCURSOR datatype.  The book shows that the SQL statement executed directly in SQL*Plus required 1.26 seconds, while the method using the REFCURSOR required just 0.45 seconds, with the obvious implication being that the second method is more efficient.  I tried the test case, and found that the normal SQL statement executed in SQL*Plus required 0.47 seconds, and the REFCURSOR method required just 0.26 seconds, so my results are consistent with those from the book – the REFCURSOR method shows a lower Elapsed time, just as stated in the book.

What, if anything, is wrong with the above quote (test case) from the book?
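
For readers who do not have the download library at hand, the following is a rough approximation of the comparison, not the book's script.  It is a sketch that assumes the SH sample schema is installed; it uses an anonymous PL/SQL block rather than a stored procedure, and the timer names and the bind variable name RC are made up for illustration.  The SQL*Plus TIMING START and TIMING STOP commands are used so that it is explicit which steps each elapsed time figure covers.

```sql
-- Method 1: execute the query directly; SQL*Plus fetches and displays the rows
TIMING START DIRECT_SQL
SELECT PROD_ID, SUM(AMOUNT_SOLD) FROM SH.SALES GROUP BY PROD_ID;
TIMING STOP

-- Method 2: open a cursor for the same query into a SQL*Plus REFCURSOR variable
VARIABLE RC REFCURSOR

TIMING START REFCURSOR_OPEN
BEGIN
  OPEN :RC FOR
    SELECT PROD_ID, SUM(AMOUNT_SOLD) FROM SH.SALES GROUP BY PROD_ID;
END;
/
TIMING STOP

-- The rows are fetched and displayed only when the variable is printed
TIMING START REFCURSOR_PRINT
PRINT RC
TIMING STOP
```

When comparing the two methods, it may be worth checking whether the elapsed time reported for each method covers the same amount of work.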






Tables and Their Indexes Should be Located on Different Disks – What is Wrong with this Quote?

February 17, 2012

As I mentioned in a previous blog article, I placed an order for the book “Oracle Database 11gR2 Performance Tuning Cookbook”, mostly because I was curious how the book would compare with the “Oracle Database 11g Performance Tuning Recipes” book (my review of that book was a total of about 35 typewritten pages in length), and some of the other Oracle performance related books that I have reviewed.

After placing my order for the book, I took a quick look at a couple of the book’s preview pages on Amazon, downloaded the sample chapter from the book publisher – chapter 10 “Tuning I/O”, and spent a bit of time reading the sample chapter (making comments in the margins of the printed pages).  I circled the following paragraph (sorry for the long quote – I want to make certain that the context of the quote is not lost), found on page 422 of the book (page 17 in the PDF download):

“We can also distribute tables and related indexes on different disks, to obtain performance gain in both read and write operations. If we have tables and indexes on the same disk, we need to read and write in two different places on the same disk. By dividing the work between two disks, we can perform an index range scan faster than when the index and the table are on the same disk.”

What, if anything, is wrong with the above quote from the book?  For some reason, the acronym MASE came to mind when I read the above paragraph. 








