May 30, 2012
A question appeared on the OTN Database forums yesterday that has yet to receive an answer. The question was essentially, why did the execution plan change when the OPTIMIZER_FEATURES_ENABLED parameter was adjusted from 11.2.0.2 to 11.1.0.7, and why did the execution performance improve as a result of making the change? A DBMS_XPLAN generated execution plan where TYPICAL +OUTLINE were apparently specified for the format parameters was provided for the execution with the OPTIMIZER_FEATURES_ENABLED parameter set to 11.2.0.2, while the format parameter was apparently set to ALLSTATS LAST for the 11.1.0.7 test execution. Why did the performance improve when the optimizer’s parameter was modified?
We do not have access to the SQL statement, but we do have a few clues. Take a look at the Predicate Information section, you will see the following:
1 - filter(ROWNUM<=1000)
So, ROWNUM<=1000 was specified in the SQL statement. As mentioned in one of my other articles, there was a bug in the optimizer’s code prior to Oracle Database 11.2.0.1 related to the appearance of ROWNUM in the WHERE clause. Would changing the OPTIMIZER_FEATURES_ENABLED parameter to 11.1.0.7 reinstate the bug (this would not happen based on testing that I performed some time ago)? The OP stated that the OPTIMIZER_MODE parameter is set to FIRST_ROWS_10, could specifying ROWNUM<=1000 cause the optimizer to behave as if FIRST_ROWS_1000 was set for the optimizer mode, behaving differently depending on the value of the OPTIMIZER_FEATURES_ENABLED parameter?
What other clues did the OP provide for potential readers of the OTN thread? Let’s take a look at a portion of the outline sections supplied with the execution plans to see if there are any clues. With OPTIMIZER_FEATURES_ENABLED set to 11.2.0.2:
OPTIMIZER_FEATURES_ENABLE('11.2.0.2') DB_VERSION('11.2.0.2') OPT_PARAM('_push_join_union_view' 'false') OPT_PARAM('_optim_peek_user_binds' 'false') OPT_PARAM('_gby_hash_aggregation_enabled' 'false') OPT_PARAM('optimizer_index_cost_adj' 10) OPT_PARAM('optimizer_index_caching' 80) FIRST_ROWS(10)
With OPTIMIZER_FEATURES_ENABLED set to 11.1.0.7:
OPTIMIZER_FEATURES_ENABLE('11.1.0.7') DB_VERSION('11.2.0.2') OPT_PARAM('_optim_peek_user_binds' 'false') OPT_PARAM('optimizer_index_cost_adj' 10) OPT_PARAM('optimizer_index_caching' 80) FIRST_ROWS(10)
Well, that is a bit interesting. The OPTIMIZER_INDEX_COST_ADJ parameter was set to 10 in both cases, and the OPTIMIZER_INDEX_CACHING parameter was set to 80 in both cases – could the change in the execution plan be caused by a cost rounding error that is somehow slightly different when the OPTIMIZER_FEATURES_ENABLED parameter is changed?
While we are looking at the outline sections, notice that in the test with the OPTIMIZER_FEATURES_ENABLED parameter set at 11.2.0.2 that the _OPTIM_PEEK_USER_BINDS, _PUSH_JOIN_UNION_VIEW, and _GBY_HASH_AGGREGATION_ENABLED hidden parameters are all set at the non-default value of FALSE, while the test with the OPTIMIZER_FEATURES_ENABLED parameter set at 11.1.0.7 only shows that the _OPTIM_PEEK_USER_BINDS parameter is set to the non-default value of FALSE – could that difference cause of the execution plan change, and might those two hidden parameters magically change back to their default values when adjusting the OPTIMIZER_FEATURES_ENABLED parameter?
The OP did not supply a test case script, so I will create a simple test case script just to see if those two hidden parameters might magically change back to their default values when the OPTIMIZER_FEATURES_ENABLED parameter is adjusted.
DROP TABLE T1 PURGE; CREATE TABLE T1 AS SELECT ROWNUM C1, DECODE(ROWNUM,1,1,0) C2, LPAD('A',255,'A') C3 FROM DUAL CONNECT BY LEVEL<=10000; CREATE UNIQUE INDEX IND_T1_C1 ON T1(C1); CREATE INDEX IND_T1_C2 ON T1(C2); ALTER TABLE T1 MODIFY (C1 NOT NULL, C2 NOT NULL); EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL INDEXED COLUMNS SIZE 254') SET PAGESIZE 1000 SET LINESIZE 140
It appears that the OP had the following parameters set for the execution at 11.2.0.2:
ALTER SESSION SET "_push_join_union_view"=false; ALTER SESSION SET "_optim_peek_user_binds"=false; ALTER SESSION SET "_gby_hash_aggregation_enabled"=false; ALTER SESSION SET optimizer_index_cost_adj=10; ALTER SESSION SET optimizer_index_caching=80; ALTER SESSION SET OPTIMIZER_MODE=FIRST_ROWS_10;
Let’s try a test script (note that you may wish to change the event 10132 to 10053, because the output of the 10132 trace event seems to be a bit scrambled in Oracle Database 11.2.0.2):
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='11.2.0.2'; ALTER SESSION SET TRACEFILE_IDENTIFIER='11.2.0.2 HIDDEN TEST'; ALTER SESSION SET EVENTS '10132 TRACE NAME CONTEXT FOREVER, LEVEL 1'; SELECT * FROM T1 WHERE ROWNUM<=1000 AND C2=2 AND C1>=1; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'+OUTLINE')); ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='11.1.0.7'; ALTER SESSION SET TRACEFILE_IDENTIFIER='11.1.0.7 HIDDEN TEST'; SELECT * FROM T1 WHERE ROWNUM<=1000 AND C2=2 AND C1>=1; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'+OUTLINE')); ALTER SESSION SET EVENTS '10132 TRACE NAME CONTEXT OFF';
Here is the output that I received for 11.2.0.2:
SQL> SELECT 2 * 3 FROM 4 T1 5 WHERE 6 ROWNUM<=1000 7 AND C2=2 8 AND C1>=1; no rows selected SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'+OUTLINE')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------ SQL_ID fb2ysnjkbq366, child number 0 ------------------------------------- SELECT * FROM T1 WHERE ROWNUM<=1000 AND C2=2 AND C1>=1 Plan hash value: 1016793414 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 1 (100)| | |* 1 | COUNT STOPKEY | | | | | | |* 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 136 | 1 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IND_T1_C2 | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Outline Data ------------- /*+ BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('11.2.0.2') DB_VERSION('11.2.0.2') OPT_PARAM('_push_join_union_view' 'false') OPT_PARAM('_optim_peek_user_binds' 'false') OPT_PARAM('_gby_hash_aggregation_enabled' 'false') OPT_PARAM('optimizer_index_cost_adj' 10) OPT_PARAM('optimizer_index_caching' 80) FIRST_ROWS(10) OUTLINE_LEAF(@"SEL$1") INDEX_RS_ASC(@"SEL$1" "T1"@"SEL$1" ("T1"."C2")) END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(ROWNUM<=1000) 2 - filter("C1">=1) 3 - access("C2"=2)
The output for 11.1.0.7:
SQL_ID fb2ysnjkbq366, child number 1 ------------------------------------- SELECT * FROM T1 WHERE ROWNUM<=1000 AND C2=2 AND C1>=1 Plan hash value: 1016793414 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 1 (100)| | |* 1 | COUNT STOPKEY | | | | | | |* 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 136 | 1 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IND_T1_C2 | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Outline Data ------------- /*+ BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('11.1.0.7') DB_VERSION('11.2.0.2') OPT_PARAM('_push_join_union_view' 'false') OPT_PARAM('_optim_peek_user_binds' 'false') OPT_PARAM('_gby_hash_aggregation_enabled' 'false') OPT_PARAM('optimizer_index_cost_adj' 10) OPT_PARAM('optimizer_index_caching' 80) FIRST_ROWS(10) OUTLINE_LEAF(@"SEL$1") INDEX_RS_ASC(@"SEL$1" "T1"@"SEL$1" ("T1"."C2")) END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(ROWNUM<=1000) 2 - filter("C1">=1) 3 - access("C2"=2)
—
How would you answer the OP? Could it just be the case that most of the required database blocks were in the buffer cache on the second execution when the OPTIMIZER_FEATURES_ENABLED parameter was adjusted? If nothing else, this OTN thread provides an excuse to think about possible causes and effects when insufficient information is provided.
Thanks Charles for putting my OTN thread here. I am sure it will catch some interesting eye.
Today i tried setting optimizer_features_enable=11.2.0.2 on 11gr1 db and got index skip scan plan (bad plan in 11.2.0.2)
Giving you sql
GC_STATUS_ID=7 is ~ 13k in both db.
I am trying to understand what is causing GC_TRANSACTIONS as leading table during join with optimizer_features_enable= 11.2.0.2.
Can you shed more details on ROWNUM bug you have mentioned initially ?
I changed optimizer_mode=first_rows_1000 in 11.2.0.2 but could not get good plan
10053 trace on 11.2.0.2 says following :
Even though it selects PK_GIFT_CERTIFICATES index in execution plan.
Regarding the ROWNUM bug, see Metalink (MOS) Doc ID 6845871.8, Bug 6845871 “Suboptimal plan from ROWNUM predicate”:
https://supporthtml.oracle.com/epmos/faces/ui/km/SearchDocDisplay.jspx?_afrLoop=4081001352865000&type=DOCUMENT&id=6845871.8
You can see the effects of that bug in the following OTN forum thread:
https://forums.oracle.com/forums/thread.jspa?threadID=934895&start=0&tstart=0
If you look through a 10053 trace, you should see “fix 6845871 = enabled” in the “Bug Fix Control Environment” list – that is the fix for the ROWNUM cardinality bug. In theory, setting OPTIMIZER_FEATURES_ENABLED to 11.1.0.7 should change “fix 6845871 = enabled” to “fix 6845871 = disabled” to be consistent with the behavior of the 11.1.0.7 optimizer, but that does not seem to happen.
If I am reading your OTN post correctly, you are attempting to compare performance between two entirely different databases (and of course database instances).
1. The statistics (table and/or index) statistics could be different – something as simple as a histogram on the GC_TRANSACTIONS.TRANSACTION_DATE_UTC column in one of the databases but not the other could be the cause.
2. Different system statistics for the two databases could be a secondary cause, you can check with:
3. The optimizer parameters are different for the two databases. _push_join_union_view and _gby_hash_aggregation_enabled are set to non-default values in the 11.2.0.2 test, while still set to their defaults in the 11.1.0.7 test. There may be other parameter differences that are not obvious.
4. How did you make the execution plan show the “A-Rows” and “A-Time” columns in one database, but not in the other? Was the STATISTICS_LEVEL parameter changed from TYPICAL to ALL in one of the databases, but not the other – that change can impact performance, I suggest verifying that the STATISTICS_LEVEL is set to TYPICAL in both databases.
To make certain that we are comparing consistent information, add a GATHER_PLAN_STATISTICS hint immediately after the first SELECT keyword, like this:
Execute the SQL statement twice (that way we pre-load the buffer cache for the second execution, or to Tony’s point, you can flush the BUFFER_CACHE and execute the SQL statement a single time), then display the execution plan with the following commands:
While not a good starting point, keep in mind that you can hint the order of the tables with the LEADING hint (this hint would need to appear after the second SELECT keyword in this SQL statement) and you can hint the OPTIMIZER_FEATURES_ENABLED parameter (this hint would need to appear after the first SELECT keyword in this SQL statement).
Why are you using OPTIMIZER_MODE=FIRST_ROWS_10 rather than ALL_ROWS? Please see the following article: http://jonathanlewis.wordpress.com/2008/11/11/first_rows_n/
I think the OP has not supplied the full story.
The OP should run your test case and validate the results.
Your test case didn’t flush the buffer cache for the second run, so it may be that in your test the data was already in the cache. A flush of the buffer cache prior to the second run would answer that question.
Thanks Charles.
I have 6845871=ENABLE in both dbs. I have confirmed this by looking 10053 in both.
1) Histograms on GC_TRANSACTIONS.TRANSACTION_DATE_UTC exists (254 buckets) on both dbs.
2)
11.1.0.7
11.2.0.2
3) I played arround with _push_join_union_view and _gby_hash_aggregation_enabled to produce good plan but it did not help.
4) i had to cancel execution in 11.2.0.2 after 10 min. Thats the reason why i did not put ‘GATHER_PLAN_STATISTICS’ plan for 11.2.0.2
5) With leading hint ( /*+leading(giftcertif0_) */ ) in 11.2.0.2 i got below plan
I had to put additional index(gctransact1_,I_GCT_GC_ID) hint with leading hint to produe 11.1.0.7 (good) plan
I will definitly read all_rows optimization. It did not change the plan in my case though.
Bhavik,
Thank you for posting the above information.
The posted execution plan with the LEADING HINT is interesting. Notice that the optimizer has introduced a HASH JOIN operation in place of the two NESTED LOOPS operation, and that it is predicting that only a single row in the GC_TRANSACTIONS table has a GC_TRANSACTION_TYPE_ID of 11 based on the row in the plan with an ID of 7 – usage of that index appears to be quite inefficient due to the calculated cost of 9541.
You may also notice a bug in the execution plan, where the cost of accessing the I_GCT_CUSMKTLSTUPD index is the same cost as that of the parent operation (the table GC_TRANSACTIONS) – the optimizer is NOT taking into account the cost of the table access which should increase the calculated cost by at least 1 per expected row returned. I think that I have a blog article on this site that describes this problem which seemed to be related to collecting statistics for the indexes and tables when the tables contain no rows (and thus do not have a data segment yet by default due to deferred segment creation), one that was identified by Randolf Geist – I will have to do some searching to find the article.
The decision to use the INDEX SKIP SCAN operation on the I_GCT_CUSMKTLSTUPD index may be either due to a bug, adjusted skip scan costing in 11.2.0.2, or the setting of optimizer_index_cost_adj=10 that essentially causes the optimizer to take the calculated cost of the index skip scan at 95,410, multiply that value by 0.10, and determine that the index skip scan is efficient. The cardinality estimate of 1 is the reason why, by default, that row source is selected as the driving table in the join.
It appears that both of the database instances are using NOWORKLOAD system statistics (although with a different value in the FLAGS column), with one of the instances having an estimated calculated CPU processing speed that is roughly 3 times faster than the other – that is enough of a change by itself to affect the decisions made by the optimizer. Randolf Geist has a very good series of articles that describe the affects of NOWORKLOAD and WORKLOAD system statistics, and he also shows how to alter those system statistics. You will need to force a hard parse by modifying the SQL statement (with a comment or additional white space) to see if adjusting the CPUSPEEDNW system statistic to a value of 522 in the 11.2.0.2 database causes a desired change in the execution plan (keep in mind, that like the OPTIMIZER_INDEX_COST_ADJ parameter, that this is a global change that could negatively affect other SQL statements). You can access Randolf’s article series here:
http://oracle-randolf.blogspot.com/2009/05/understanding-different-modes-of-system.html
Randolf’s blog also contains an article that describes costing calculation changes that were introduced in Oracle Database 11.2.0.1 and 11.2.0.2 – these changes may still affect the 11.2.0.2 database when the OPTIMIZER_FEATURES_ENABLE parameter is set to 11.1.0.7:
http://oracle-randolf.blogspot.com/2011/07/cost-is-time-next-generation.html
Charles,
Would this query benefit from the GATHER_PLAN_STATISTICS hint and an ‘ALLSTATS LAST’, it might be useful to see the correlation between estimated and actual rows returned.
The prediction that only a single row in the GC_TRANSACTIONS table has a GC_TRANSACTION_TYPE_ID of 11 might not be a true representation. As I understand, the optimzer will return an estimated number of rows as 1 if it determines that there are no rows to be returned. I think that might be to eliminate the possibility of divide by zero errors in the CBO calculations. If there are expected to be no rows returned, that would strengthen your argument around the cost not being adjusted for table access. It may be that a histogram for the GC_TRANSACTION_TYPE_ID column may indicate a better data distribution for that column and hence change the index cost.
Tony,
I was thinking something similar regarding the hint. Without seeing the actual number of rows returned by the operations, it is difficult to see at what point the plan becomes wrong. The OP stated the following about attempting to use the hint:
Unfortunately, the query must complete for the GATHER_PLAN_STATISTICS hint and ‘ALLSTATS LAST’ formatted DBMS_XPLAN to work properly.
I was under the impression that if the optimizer estimated that 0 rows would be returned, that the optimizer would use a value of 1 so that multiplication of the estimated cardinality would not cause the calculated cost to become 0 – thinking about it, avoiding infinity (division by 0) could be a possibility too.
Charles/Tony
I ran Charles’s test on my both instances and it gave exactly the same plan as Charles’s had mentioned here. I got same plan for both opt_feat versions in each db.
[…] Plan Changes when the OPTIMIZER_FEATURES_ENABLED Parameter is Changed, But Why? Charles Hooper […]