August 14, 2010
My replacement laptop arrived this week to fix an occasional problem that I had with the old laptop:
I thought that I would perform a test on the new laptop, so I thought that I would try a slightly modified version of the script that appeared in the “Expert Oracle Practices” book in the Buffer Cache Hit Ratio section of chapter 8. The table creation part of the script tends to be very CPU intensive for an hour or two, so if you want to try the script, set aside plenty of time.
SET LINESIZE 150 SET PAGESIZE 2000 SET TRIMSPOOL ON SET TIMING ON SET ARRAYSIZE 100 SPOOL C:\BCHRTestShortAsync11Gr2.txt CREATE TABLE T1 ( ID NUMBER, DESCRIPTION VARCHAR2(80)); INSERT INTO T1 SELECT CEIL(ABS(SIN(ROWNUM/9.9999)*10000)), 'This is the long description for this number '|| TO_CHAR(CEIL(ABS (SIN(ROWNUM/9.9999)*10000))) FROM (SELECT ROWNUM RN FROM DUAL CONNECT BY LEVEL<=10000), (SELECT ROWNUM RN FROM DUAL CONNECT BY LEVEL<=10000); COMMIT; CREATE INDEX IND_T1 ON T1(ID); EXEC DBMS_STATS.GATHER_TABLE_STATS (OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE, METHOD_OPT=>'FOR ALL COLUMNS SIZE 1') ANALYZE INDEX IND_T1 VALIDATE STRUCTURE; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH SHARED_POOL; SET AUTOTRACE TRACEONLY EXPLAIN ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100; SELECT ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400; ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5; SELECT ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400; SET ARRAYSIZE 100 SET AUTOTRACE TRACEONLY STATISTICS SET TIMING ON ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE100'; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12'; ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100; SELECT ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF'; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_LARGE5'; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12'; ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5; SELECT ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400; SPOOL OFF
The script builds a table that has 100,000,000 rows, displays the execution plan for a SQL statement that selects a small percent of the table when the OPTIMIZER_INDEX_COST_ADJ is set to 100 and again at 5, and finally excutes the SQL statements with a 10046 trace enabled.
A screen capture during the table creation with Oracle Database 11.2.0.1 (SQL*Plus Windows 7 Task Manager, Windows 7 Resource Monitor):
Note that the laptop only managed to achieve 3MB per second while creating the table – that was a fairly consistent average during the table creation that required 40 minutes and 29 seconds.
The execution plans generated are as follows:
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100;
Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4000K| 206M| 222K (1)| 00:44:29 | |* 1 | TABLE ACCESS FULL| T1 | 4000K| 206M| 222K (1)| 00:44:29 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID"<=400 AND "ID">=1)
——–
ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5;
Execution Plan ---------------------------------------------------------- Plan hash value: 634656657 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4000K| 206M| 202K (1)| 00:40:25 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 4000K| 206M| 202K (1)| 00:40:25 | |* 2 | INDEX RANGE SCAN | IND_T1 | 4067K| | 427 (1)| 00:00:06 | -------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID">=1 AND "ID"<=400)
So, the optimizer is predicting that an execution using a full table scan will complete in 44 minutes and 29 seconds, while an execution using an index range scan will complete in 40 minutes and 25 seconds. Let’s see how the different execution plans performed.
Full Table Scan:
SQL> ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100; Session altered. Elapsed: 00:00:00.00 SQL> SQL> SELECT 2 ID, 3 DESCRIPTION 4 FROM 5 T1 6 WHERE 7 ID BETWEEN 1 AND 400; 2547158 rows selected. Elapsed: 00:00:14.39 Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 838515 consistent gets 813246 physical reads 0 redo size 140941880 bytes sent via SQL*Net to client 280701 bytes received via SQL*Net from client 25473 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2547158 rows processed
—-
Index Range Scan:
SQL> ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=5; Session altered. Elapsed: 00:00:00.00 SQL> SQL> SELECT 2 ID, 3 DESCRIPTION 4 FROM 5 T1 6 WHERE 7 ID BETWEEN 1 AND 400; 2547158 rows selected. Elapsed: 00:01:24.36 Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 2573595 consistent gets 818246 physical reads 0 redo size 17418653 bytes sent via SQL*Net to client 280701 bytes received via SQL*Net from client 25473 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2547158 rows processed
The full table scan when selecting 2.55% of the rows in the table required 14.39 seconds while the index range scan required 1 minute 24.36 seconds. I guess that is not a bad improvement for leaving the OPTIMIZER_INDEX_COST_ADJ parameter at the default value of 100. But wait, the full table scan was CPU bound, not IO bound! What if we allow the SQL statement to use more than 1 CPU?
SET ARRAYSIZE 100 SET AUTOTRACE TRACEONLY STATISTICS SET TIMING ON ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SESSION SET TRACEFILE_IDENTIFIER='IND_COST_PARALLEL100'; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12'; ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=100; SELECT /*+ PARALLEL(T1, 8 ) */ ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400;
Here is a screen capture that was captured as the full table scan ended (I had to execute the above several times so that the screen capture showed the very end of the execution):
That helped a little, the full table scan completed in 12.01 seconds and used about 24% of the CPU capacity. Interesting that the screen capture shows that the network utilization was 207Mb/s – the laptop was connected to a wireless connection with Oracle Database 11.2.0.1 running locally. The full table scan might be IO bound now (note that there were only two datafiles, so I supposed that we could still be CPU bound if only two parallel processes could be used).
Let’s see the storage requirements for the table and index (ASSM AUTO tablespace):
SET AUTOTRACE OFF SELECT SEGMENT_NAME SEGMENT, COUNT(*) EXTENTS, BYTES/1024 EXT_SIZE_KB, (COUNT(*) * BYTES)/1048576 TOTAL_MB FROM DBA_EXTENTS WHERE OWNER=USER AND SEGMENT_NAME IN ('IND_T1','T1') GROUP BY SEGMENT_NAME, BYTES ORDER BY SEGMENT_NAME, BYTES; SEGMENT EXTENTS EXT_SIZE_KB TOTAL_MB ---------- ---------- ----------- ---------- IND_T1 16 64 1 IND_T1 63 1,024 63 IND_T1 120 8,192 960 IND_T1 10 65,536 640 T1 16 64 1 T1 63 1,024 63 T1 120 8,192 960 T1 1 60,416 59 T1 83 65,536 5,312
It might be somewhat interesting to mention that somehow we ended up with one odd 59MB extent in the table, otherwise the extent allocation followed the typical pattern on 16 extents at 64KB, 63 extents at 1MB, 120 extents at 8MB, and then 64MB extents.
Let’s take a quick peek at a couple of the index statistics:
SELECT HEIGHT, BLOCKS, LF_BLKS, LF_ROWS, DISTINCT_KEYS, MOST_REPEATED_KEY, PCT_USED FROM INDEX_STATS WHERE NAME='IND_T1'; HEIGHT BLOCKS LF_BLKS LF_ROWS DISTINCT_KEYS MOST_REPEATED_KEY PCT_USED ------ ---------- ---------- ----------- ------------- ----------------- ---------- 3 212992 208854 100,000,000 10,000 900,324 90
–
SELECT INDEX_TYPE, BLEVEL, LEAF_BLOCKS, DISTINCT_KEYS, AVG_LEAF_BLOCKS_PER_KEY, AVG_DATA_BLOCKS_PER_KEY, CLUSTERING_FACTOR, SAMPLE_SIZE FROM USER_INDEXES WHERE INDEX_NAME='IND_T1'; INDEX_TYPE BLEVEL LEAF_BLOCKS DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY AVG_DATA_BLOCKS_PER_KEY CLUSTERING_FACTOR SAMPLE_SIZE ---------- ---------- ----------- ------------- ----------------------- ----------------------- ----------------- ----------- NORMAL 2 212,350 10,000 21 10,074 100,749,838 556,356
Interesting – the clustering factor is higher than the number of rows in the table – maybe we should have sampled 100% of the table and index when collecting statistics.
A quick look at the table statistics:
SELECT NUM_ROWS, BLOCKS, AVG_ROW_LEN, SAMPLE_SIZE FROM USER_TABLES WHERE TABLE_NAME='T1'; NUM_ROWS BLOCKS AVG_ROW_LEN SAMPLE_SIZE ----------- ---------- ----------- ----------- 100,000,000 817,275 54 100,000,000
It might be a good idea to take a look at the TKPROF summaries for the executions. Just as a reminder, the normal (non-parallel, serial) table scan required 14.39 seconds, the parallel table scan required 12.01 seconds, and the index range scan required 1 minute 24.36 seconds.
TKPROF output full table scan: elapsed time 6.18 seconds (8.21 seconds less than what was reported by SQL*Plus) with 5.28 seconds of CPU time:
SELECT ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 25473 5.28 6.18 813246 838515 0 2547158 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 25475 5.28 6.18 813246 838515 0 2547158 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 185 Rows Row Source Operation ------- --------------------------------------------------- 2547158 TABLE ACCESS FULL T1 (cr=838515 pr=813246 pw=0 time=6130508 us cost=222369 size=216021546 card=4000399) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net message to client 25473 0.00 0.02 db file sequential read 3 0.00 0.00 direct path read 4080 0.00 0.91 SQL*Net message from client 25473 0.00 7.00 db file scattered read 1 0.00 0.00
TKPROF output parallel full table scan: elapsed time 3.42 seconds (8.59 seconds less than what was reported by SQL*Plus) with 1.63 seconds of CPU time:
SELECT /*+ PARALLEL(T1, 8 ) */ ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.04 11 25 0 0 Fetch 25473 1.63 3.42 0 0 0 2547158 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 25475 1.63 3.47 11 25 0 2547158 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 185 Rows Row Source Operation ------- --------------------------------------------------- 2547158 PX COORDINATOR (cr=25 pr=11 pw=0 time=2345038 us) 0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us cost=30853 size=216021546 card=4000399) 0 PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us cost=30853 size=216021546 card=4000399) 0 TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 time=0 us cost=30853 size=216021546 card=4000399) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ db file sequential read 3 0.00 0.00 db file scattered read 1 0.00 0.00 os thread startup 8 0.00 0.02 PX Deq: Join ACK 8 0.00 0.00 PX Deq: Parse Reply 8 0.00 0.01 SQL*Net message to client 25473 0.00 0.02 PX Deq: Execute Reply 5495 0.00 1.92 SQL*Net message from client 25473 0.00 7.32 latch free 1 0.00 0.00 PX qref latch 1 0.00 0.00 PX Deq: Signal ACK RSG 1 0.00 0.00 PX Deq: Signal ACK EXT 7 0.00 0.00 PX Deq: Slave Session Stats 3 0.00 0.00 enq: PS - contention 1 0.00 0.00
TKPROF output index range scan: elapsed time 76.85 seconds (16.51 seconds less than what was reported by SQL*Plus) with 29.28 seconds of CPU time:
SELECT ID, DESCRIPTION FROM T1 WHERE ID BETWEEN 1 AND 400 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 25473 29.28 76.85 818246 2573595 0 2547158 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 25475 29.28 76.85 818246 2573595 0 2547158 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 185 Rows Row Source Operation ------- --------------------------------------------------- 2547158 TABLE ACCESS BY INDEX ROWID T1 (cr=2573595 pr=818246 pw=0 time=31008522 us cost=202004 size=216021546 card=4000399) 2547158 INDEX RANGE SCAN IND_T1 (cr=30646 pr=5238 pw=0 time=2313805 us cost=427 size=0 card=4067270)(object id 77111) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net message to client 25473 0.00 0.03 db file sequential read 81689 0.10 32.96 SQL*Net message from client 25473 0.00 6.11 db file parallel read 36586 0.87 26.09
Numbers from the twilight zone? But wait, there is more. The numbers were so illogical that I took the database offline, rebooted, and was immediately greeted with a warning that the RAID 0 array in the new laptop failed! Try this experiement at your own risk.
————————————-
So, what have we learned from this blog article? To get you started, what was the buffer cache hit ratio for each of the executions? How can you explain the decrease in CPU time when parallel execution was used?
—-
Edit August 14, 2010: In case you are wondering, the Oracle database was running on the laptop, not a remote server. The 11.2.0.1 database was configured with Oracle’s default parameters, with the exception of SGA_TARGET which was set to 12G and PGA_AGGREGATE_TARGET which was set to 1800M. Oracle Database on the Windows platform supports only direct, asynchronous I/O. Windows 7 Ultimate running on the laptop does not offer large memory page support, unlike the Server type Windows operating systems.
Recent Comments