Explain Plan – Which Plan is Better

29 01 2010

January 29, 2010

A recent post appeared in the OTN forums that indirectly asked the question: which execution plan is better?  The execution plans follow:

The Unhinted Execution Plan:

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT       |             |  1285M|   326G|       |    45M  (1)|178:06:59 |       |       |
|   1 |  LOAD AS SELECT        | E           |       |       |       |            |          |       |       |
|*  2 |   HASH JOIN            |             |  1285M|   326G|  5153M|    45M  (1)|178:06:59 |       |       |
|   3 |    TABLE ACCESS FULL   | D           |   135M|  3607M|       |   254K  (2)| 00:59:17 |       |       |
|*  4 |    HASH JOIN           |             |  1261M|   287G|  2857M|    32M  (1)|124:52:03 |       |       |
|   5 |     TABLE ACCESS FULL  | C           |    76M|  1978M|       |   143K  (2)| 00:33:33 |       |       |
|*  6 |     HASH JOIN          |             |  1241M|   252G|  1727M|    20M  (1)| 78:33:50 |       |       |
|   7 |      TABLE ACCESS FULL | B           |    54M|  1099M|       | 23217   (4)| 00:05:26 |       |       |
|   8 |      PARTITION HASH ALL|             |  1241M|   227G|       |  3452K  (4)| 13:25:29 |     1 |    64 |
|   9 |       TABLE ACCESS FULL| A           |  1241M|   227G|       |  3452K  (4)| 13:25:29 |     1 |    64 |
--------------------------------------------------------------------------------------------------------------

The Hinted Execution Plan that Sets the Cardinality for Table A to 10M Rows:

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT       |             |    10M|  2696M|       |  4578K  (1)| 17:48:26 |       |       |
|   1 |  LOAD AS SELECT        | E           |       |       |       |            |          |       |       |
|*  2 |   HASH JOIN            |             |    10M|  2696M|  2491M|  4578K  (1)| 17:48:26 |       |       |
|*  3 |    HASH JOIN           |             |    10M|  2374M|  2193M|  3996K  (1)| 15:32:36 |       |       |
|*  4 |     HASH JOIN          |             |    10M|  2079M|  1727M|  3636K  (1)| 14:08:30 |       |       |
|   5 |      TABLE ACCESS FULL | B           |    54M|  1099M|       | 23217   (4)| 00:05:26 |       |       |
|   6 |      PARTITION HASH ALL|             |    10M|  1878M|       |  3362K  (1)| 13:04:42 |     1 |    64 |
|   7 |       TABLE ACCESS FULL| A           |    10M|  1878M|       |  3362K  (1)| 13:04:42 |     1 |    64 |
|   8 |     TABLE ACCESS FULL  | C           |    76M|  1978M|       |   143K  (2)| 00:33:33 |       |       |
|   9 |    TABLE ACCESS FULL   | D           |   135M|  3607M|       |   254K  (2)| 00:59:17 |       |       |
--------------------------------------------------------------------------------------------------------------

The Original Poster Stated Both Plans have the Same Predicates:

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(A."ID"="D"."ID")
   3 - access("A"."E_ID"="C"."E_ID")
   4 - access("A"."M_ID"="B"."M_ID")

So, which execution plan is better?  Are the plans the same?  How are they the same, and how do they differ?

While we comtemplate which execution plan is optimal, the OTN thread took a slight detour into a discussion of work areas in Oracle:

“Can you please help understanding workarea.”

A search of the documentation found this page that offered the following definition:

“work area: A private allocation of memory used for sorts, hash joins, and other operations that are memory-intensive. A sort operator uses a work area (the sort area) to perform the in-memory sort of a set of rows. Similarly, a hash-join operator uses a work area (the hash area) to build a hash table from its left input.”

There may be multiple active work areas in a single SQL statement. While not the original purpose of this blog article, the article does show how to see the amount of memory in use for active work areas.  If you have a copy of the book “Troubleshooting Oracle Performance“, I highly recommend that you read pages 434 through 439 if you are curious about Oracle work areas.  Those pages describe how hash joins work and provide a detailed description of work areas.

Back to the original question.  While reading the plans, keep in mind that you are only looking at the optimizer’s estimates for the number of rows, time, memory usage, temp tablespace usage, and costs.  If you are attempting to conclude which plan is faster/better based on the estimates in the first plan and an altered plan with a hinted cardinality estimate, you might be setting yourself up for failure.  Note that the first plan has a calculated cost of about 45,000,000 while the second plan has a calculated cost of about 4,578,000.  So obviously, the second plan is more efficient.  Or is it?  With the cardinality hint, the OP has effectively changed the number of rows that the optimizer expects to be returned from table A from roughly 1,241,000,000 to 10,000,000.  Additionally, one should not directly compare the calcualted cost of one execution plan with that of a second execution plan.  You probably should be thinking to yourself at this point: “Have you considered actually testing the performance?

In the OTN thread Timur stated that both plans use the very same join order: B->A->C->D.  Based on my understanding of execution plans, this is a correct statement, even though the plans look a bit different.  Note that the OP was slightly incorrect in stating that the Predicate Information sections for the two plans were identical – the plan ID numbers should have been a bit different.

(Confession: I re-read the section of the book “Troubleshooting Oracle Performance” that discussed hash joins before providing the following response.)  Essentially, the difference between the two plans is which table (or row source) is the build input, and which table is the probe input. The first table (or row source) listed below the words HASH JOIN is the source for the hash table (the optimizer typically tries to select the smaller estimated row source of the two row sources as the source for the hash table). The second table (or row source) is fully scanned, probing the generated hash table in search of a match. By artifically altering the optimizer’s estimated rows to be returned from table A, the OP has flipped which table (or row source) is the build input, and which table (or row source) is the probe input at each hash join – this could significantly increase, or significantly decrease, or have no impact on the amount of time required for the query to execute, the amount of memory used, or the amount of temp space needed.

My suggestion to the OP is to test the performance to see which execution plan is more efficient, rather than guessing. My blog article that is referenced above has SQL statements that may be used to see the number of work areas that are active at any point, as well as the amount of RAM and temp space in use. You could continue to guess about which plan is better, but why guess?