## Parallel Reasoning

17 02 2010

February 17, 2010 (Updated October 7, 2010: 11.2 documentation links fixed)

This article is a light-weight exploration of parallel query which is available in the Enterprise Edition of Oracle Database.  There are a lot of great articles on the Internet that discuss parallel query benefits and pitfalls, and Metalink contains a couple of decent articles also.  For example, Metalink Doc ID 263153.1 includes the following bit of advice:

“The parallel hint on the object make sure that we access this object in parallel.”

Neat.  Let’s see if it’s true.  We will experiment with the test table found in this blog article, being sure to collect statistics on all of the columns, and executing the test queries on Oracle Database 11.2.0.1 using my VBS tool for automatically generating execution plans.  Here is the query that will be executed:

SELECT /*+ PARALLEL(2) */
*
FROM
T1
WHERE
C1 BETWEEN 1 AND 10000;

The following execution plan was generated for the query:

SQL_ID  gjcx2fjkwt7xc, child number 0
-------------------------------------
SELECT /*+ PARALLEL(2) */    *  FROM    T1  WHERE    C1 BETWEEN 1 AND
10000

Plan hash value: 2332705540

-----------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |       |       |   343 (100)|          |        |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           | 10000 |  2236K|   343   (0)| 00:00:05 |  OR112 |
|*  2 |   INDEX RANGE SCAN          | SYS_C0018049 | 10000 |       |    20   (0)| 00:00:01 |  OR112 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C1">=1 AND "C1"<=10000)

Note
-----
- Degree of Parallelism is 1 because of hint

That’s a strange note at the end of the execution plan.  Maybe Doc ID 263153.1 does not apply to Oracle 11.2.0.1, because the query did not execute in parallel?  Let’s try again:

SELECT /*+ PARALLEL(3) */
*
FROM
T1
WHERE
C1 BETWEEN 1 AND 10000;

SQL_ID  45mgp20qp9rfu, child number 0
-------------------------------------
SELECT /*+ PARALLEL(3) */    *  FROM    T1  WHERE    C1 BETWEEN 1 AND
10000

Plan hash value: 2494645258

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   329 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  2236K|   329   (0)| 00:00:04 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  2236K|   329   (0)| 00:00:04 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  2236K|   329   (0)| 00:00:04 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access(:Z>=:Z AND :Z<=:Z)
filter(("C1"<=10000 AND "C1">=1))

Note
-----
- Degree of Parallelism is 3 because of hint

Well, that’s odd.  Doc ID 263153.1 first doesn’t apply to Oracle 11.2.0.1, and then does apply to Oracle 11.2.0.1.  Notice that the calculated total query cost of the first plan is 343 and the calculated total query cost of the second plan is 329.  Time for another test:

SELECT /*+ PARALLEL(4) */
*
FROM
T1
WHERE
C1 BETWEEN 1 AND 10000;

SQL_ID  1m142v1crjs2j, child number 0
-------------------------------------
SELECT /*+ PARALLEL(4) */    *  FROM    T1  WHERE    C1 BETWEEN 1 AND
10000

Plan hash value: 2494645258

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   247 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access(:Z>=:Z AND :Z<=:Z)
filter(("C1"<=10000 AND "C1">=1))

Note
-----
- Degree of Parallelism is 4 because of hint

The calculated total query cost decreased to 247.  OK, if we keep bumping up the parallel degree, the calculated cost becomes lower, so we might as well get it over with and do this:

SQL_ID  9xccz9nu4hrqx, child number 0
-------------------------------------
SELECT /*+ PARALLEL(1000) */    *  FROM    T1  WHERE    C1 BETWEEN 1
AND 10000

Plan hash value: 2494645258

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |     2 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  2236K|     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  2236K|     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  2236K|     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access(:Z>=:Z AND :Z<=:Z)
filter(("C1"<=10000 AND "C1">=1))

Note
-----
- Degree of Parallelism is 1000 because of hint

There, we now have a cost of 2 – the query should now finish instantaneously.  :-)  But why didn’t the query excute in parallel when PARALLEL(2) was specified in the hint?  A search of the Internet provides the answer:

Hints on Hinting (page 2)

“The hint parallel(t1, N) does not tell Oracle to run the query in parallel at degree N – it tells the optimizer that when it is calculating the cost of a tablescan on table t1 it should cost for parallel degree N, and if that plan then happens to be the cheapest of all the plans to consider that that’s the plan to use.”

Hints – Again

“But the hint we supplied is only relevant to the tablescan – and we can see it being used in the arithmetic above: the cost_io is the serial I/O cost of the tablescan, the resp_io is the ‘parallel response’ time for the query which is (cost_io / (0.9 * degree)).”

Calculated costs of the access paths matter… so that’s why the first query did not execute in parallel.

For those who decide to rebuild indexes for whatever reason, and decide to do so in parallel, you might be setting yourself up for headaches:

A Gotcha with Parallel Index Builds, Parallel Degree and Query Plans

I haven’t enabled parallelism for my table! How can Oracle go parallel without my consent?  Parallel index (re)build will persistently set the index parallel degree in data dictionary to the value used during build!

Oracle 11.2.0.1 introduced a couple of changes to parallel query, and a couple of those changes are listed below:

“Beginning with Oracle Database 11g Release 2, the PARALLEL and NO_PARALLEL hints are statement-level hints and supersede the earlier object-level hints: PARALLEL_INDEX, NO_PARALLEL_INDEX, and previously specified PARALLEL and NO_PARALLEL hints. For PARALLEL, if you specify integer, then that degree of parallelism will be used for the statement. If you omit integer, then the database computes the degree of parallelism. All the access paths that can use parallelism will use the specified or computed degree of parallelism.”

“When the parameter PARALLEL_DEGREE_POLICY is set to AUTO, Oracle Database automatically decides if a statement should execute in parallel or not and what DOP it should use. Oracle Database also determines if the statement can be executed immediately or if it is queued until more system resources are available. Finally, it decides if the statement can take advantage of the aggregated cluster memory or not.

The following example illustrates computing the DOP the statement should use:

SELECT /*+ parallel(auto) */ ename, dname FROM emp e, dept d
WHERE e.deptno=d.deptno;

The following example forces the statement to use Oracle Database 11g, Release 1 (11.1) behavior:

SELECT /*+ parallel(manual) */ ename, dname FROM emp e, dept d WHERE e.deptno=d.deptno;"

Also from the documentation:

“Oracle ignores parallel hints on temporary tables. Refer to CREATE TABLE and Oracle Database Concepts for more information on parallel execution.”

As I mentioned at the beginning of this article, I only intend to provide a light-weight exploration of parallel query, so I will stop here… or better yet after providing a couple of potentially helpful Metalink article references that you can review using your shiny new Apple iPad:

• Doc ID 203238.1 “Using Parallel Execution”
• Doc ID 826893.1 “Invalid Hint in 10g Can Cause Other Hints To Be Ignored, Including Parallel Hints”
• Doc ID 199272.1 “Why didn’t my parallel query use the expected number of slaves?”
• Doc ID 233754.1 “Why doesn’t my query run in parallel?”
• Doc ID 267330.1 “SQL statements that run in parallel with NO_PARALLEL hints”
• Doc ID 196938.1 “Why did my query go parallel?”
• Doc ID 752816.1 “Parallel Update On Table Does Not Work If Statistics Have Been Calculated”
• Doc ID 237287.1 “How To Verify Parallel Execution is running”
• Doc ID 457857.1 “Script to monitor parallel queries”
• Doc ID 280939.1 “Checklist for Performance Problems with Parallel Execution”
• Doc ID 46234.1 “Interpreting Explain plan”
• Doc ID 260845.1 “Old and new Syntax for setting Degree of Parallelism”
• Doc ID 235400.1 “Disable Parallel Execution on Session/System Level”
• Doc ID 763419.1 “Neither Table DOP nor parallel hint is saved in the Stored Outline hints”
• Doc ID 6474009.8 “Bug 6474009 – On ASSM tablespaces Scans using Parallel Query can be slower than in 9.2″
• Doc ID 6748058.8 “Bug 6748058 – A parallel query executes serially due to correlation variable”
• Doc ID 6402957.8 “Bug 6402957 – Suboptimal parallel query performance for Top-n rank queries.”
• Doc ID 873392.1 “Ora-04030 with high PGA allocation for ‘kxs-heap-p’ under session heap”
• Doc ID 844538.1 “Wrong Results For ROWNUM During Parallel Execution on RAC”
• Doc ID 755975.1 “Parallel Query With Virtual Private Database Returns Wrong Result or Fails With ORA-600[kzrtgpp - kxfxslavesql]“

Be on the lookout for hard hitting parallel query articles by other authors (such as Doug Burns), and try not to abuse this feature.  I will leave you with this paraphrase from the book “Practical Oracle 8i”

Parallel query’s objective is to maximize resource utilization in order to reduce query execution time.  Oracle may select a high CPU consumption plan using a full tablescan, rather than a lower CPU consuming index range scan – this situation may cause performance problems in an OLTP system.  Parallel query is a data crunching, non-scalable, batch processor.

### 7 responses

17 02 2010

Very Very Clean details. Thanks Charles for all your work and passing knowledge to all.

18 02 2010

Taral,

Thanks for the compliment.

There was a bit of humor embedded in this article, in case anyone missed it. Just to clarify:
* Don’t run SQL statements in parallel(1000) just to drive the cost of the query toward 0. There were a couple of recent comments on this blog pointing back to an asktom article that partially explains why.
* Don’t try viewing Oracle’s support documents on a shiny new Apple iPad – the iPad does not support Adobe Flash, and well, the new Oracle’s support site is Flashy.

Hopefully, this article provides access to a couple of helpful resources, and might even prompt someone else to write something that is well beyond a light-weight exploration of recent parallel query enhancements.

18 02 2010

>and well, the new Oracle’s support site is Flashy.
There’s also http://supporthtml.oracle.com, without flash.

18 02 2010

Thanks for posting the link to the HTML version of the Oracle support site – now if Oracle would just redirect metalink.oracle.com to http://supporthtml.oracle.com I could probably remember how to access the non-Flash version of the website.

18 03 2010

[...] 28-Resource compilation for parallel querying Charles Hooper-Parallel Reasoning [...]

5 06 2011

Hi Charles,

Currently bogged with a issue in parallel query tuning.
Prelude – Server – CMT T5120. 32 Cpu’s. The Batch jobs when transferred from old server runs much slow. CMT servers encourage for parallel processing in batch jobs. So when we parallelized 2 queries, it was running much improved time (nearly 1/2 the time taken). But when we started parallelizing the rest, it degraded the performance of the rest.

SQL> sho parameter parallel_

NAME                                 		TYPE        VALUE
------------------------------------ 		----------- ------------------------------
fast_start_parallel_rollback         	string      LOW
parallel_automatic_tuning            	boolean     FALSE
parallel_execution_message_size      	integer     2152
parallel_instance_group              		string
parallel_max_servers                		 integer     40
parallel_min_percent                		 integer     0
parallel_min_servers                 		integer     0
parallel_server                      		boolean     FALSE
parallel_server_instances            		integer     1
recovery_parallelism                 		integer     0
6 06 2011

Sidh,

I might be stating the obvious here, but parallel query is not a magic fix for all types of performance problems. Consider these two situations:
1. You have 32 CPUs (or cores), your query is not using parallel execution yet it is using 100% of one CPU (or core), the overall CPU utilization average is 50%, and the IO subsystem is not heavily utilized.
2. You have 32 CPUs (or cores), your query is not using parallel execution yet it is using 100% of one CPU (or core), the overall CPU utilization average is 50%, and the IO subsystem is nearly 100% utilized (either maximum number of I/Os per second or maximum throughput).

How might the outcome be different in both of the situations if parallel query is enabled?

It is quite possible that the first two queries that you helped with parallel query were in fact CPU bound (the single CPU utilized was the bottleneck), while the remaining queries were I/O bound (the I/O subsystem was the bottleneck). In either case, it is best not to guess. Instead, capture a 10046 trace at level 8 or 12 for the batch job and determine *why* the other queries execute more slowly when parallelized. You might also capture delta values of V\$OSSTAT periodically during the execution of the batch job so that you can see, for instance, if you now have 200 parallel slaves fighting for the 32 CPUs, or if you now have 200 parallel slaves all waiting for I/O.

You might take a look at a couple of other articles by a couple of OakTable Network members:
http://oracle-randolf.blogspot.com/2011/04/things-worth-to-mention-and-remember.html (a three part series)
http://oracledoug.com/serendipity/index.php?/plugin/tag/Parallel (blog category)
http://structureddata.org/category/oracle/parallel-execution/ (blog category)
http://jonathanlewis.wordpress.com/category/oracle/parallel-execution/ (blog category)