Column Order in a Table – Does it Matter? 2

24 05 2010

May 24, 2010 (Updated May 24, 2010)

(Back to the Previous Post in the Series)

In the previous blog article in this series I created a test case with a table containing 47 columns.  The PL/SQL test precedure experienced 15% to 22% longer execution times when a column with the same definition, just buried 40 columns deeper into the table definition, was referenced by the PL/SQL script.  Could the problem be much worse than 20% – possibly performance problems in the anonymous PL/SQL testing scripts were hiding the true performance impact due to the overhead that did not change from one test to the other (moving the SQL data into a PL/SQL table, for instance).

Today’s blog article provides a demonstration what of happens, performance wise, when very wide tables (having many column) are accessed.  Oracle tables support up to 1,000 columns, however, only 255 columns will fit into a single row piece.  A 1,000 column table, therefore, will use at least 4 row pieces per row inserted into the table.  (I have not yet determined if this is still true if, for instance, the last 800 columns are all NULL, with the rest of the columns sized small enough to fit into a single block).  What performance impact, if any, will appear in the test when accessing a table’s columns that are beyond the first couple of columns?  Will we see the problem identified on page 537 of the Troubleshooting Oracle Performance book?

Creating a test script for a table with 1,000 columns potentially requires a significant amount of typing.  To automate the script writing, an Excel macro compatible (and Visual Basic 6 compatible) script is provided below to help automate the creation of the script:

Dim intFilenum As Integer
Dim strOut As String
Dim i As Integer
Dim j As Integer
Dim k As Integer
Dim intFlag As Integer

intFilenum = FreeFile

Open "C:\ColumnOrder2Script.sql" For Output As #intFilenum

strOut = "CREATE TABLE T10 (" & vbCrLf

For i = 1 To 1000
   strOut = strOut & "  C" & Format(i) & " VARCHAR2(10)," & vbCrLf
Next i

strOut = Left(strOut, Len(strOut) - 3) & ");" & vbCrLf

Print #intFilenum, strOut

Print #intFilenum, "ALTER TABLE T10 CACHE;" & vbCrLf

strOut = "INSERT INTO" & vbCrLf
strOut = strOut & "  T10" & vbCrLf
strOut = strOut & "SELECT" & vbCrLf

For i = 1 To 1000
    intFlag = False
    If i Mod 50 = 0 Then
        intFlag = True
    Else
        Select Case i
            Case 1, 255, 256, 510, 511, 765, 766, 1000
                intFlag = True
        End Select
    End If
    If intFlag = True Then
        strOut = strOut & "  RPAD(CHR(65+MOD(ROWNUM-1,20))||'" & Format(i) & "',10,'A')," & vbCrLf
    Else
        strOut = strOut & "  NULL," & vbCrLf
    End If
Next i

strOut = Left(strOut, Len(strOut) - 3) & vbCrLf
strOut = strOut & "FROM" & vbCrLf
strOut = strOut & "  DUAL" & vbCrLf
strOut = strOut & "CONNECT BY" & vbCrLf
strOut = strOut & "  LEVEL<=1000000;" & vbCrLf

Print #intFilenum, strOut

Print #intFilenum, "COMMIT;" & vbCrLf
Print #intFilenum, "EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T10',CASCADE=>TRUE)"
Print #intFilenum, ""
'Print #intFilenum, "SET AUTOTRACE TRACEONLY STATISTICS"
Print #intFilenum, ""
Print #intFilenum, "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER, LEVEL 1';"
Print #intFilenum, "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'IGNORE';"
Print #intFilenum, "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';" & vbCrLf

For j = 1 To 2
    If j = 2 Then
        Print #intFilenum, "SET TIMING ON"
        Print #intFilenum, "SPOOL COLUMN_ORDER2.TXT"
        Print #intFilenum, "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'COLUMN_ORDER_TEST2';" & vbCrLf
    End If
    For i = 1 To 1000
        intFlag = False
        If i Mod 50 = 0 Then
            intFlag = True
        Else
            Select Case i
                Case 1, 255, 256, 510, 511, 765, 766, 1000
                    intFlag = True
            End Select
        End If
        If intFlag = True Then
            For k = 1 To 10
                If (j = 2) Or (k <= 3) Then
                    'If on the second pass - the one that counts, run SQL 10 times
                    strOut = "SELECT" & vbCrLf
                    strOut = strOut & "  COUNT(C" & Format(i) & ") C" & vbCrLf
                    strOut = strOut & "FROM" & vbCrLf
                    strOut = strOut & "  T10;" & vbCrLf
                    Print #intFilenum, strOut
                End If
            Next k
        End If
    Next i
Next j

Print #intFilenum, "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';"
Print #intFilenum, "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT OFF';"
Print #intFilenum, "SET TIMING OFF"
'Print #intFilenum, "SET AUTOTRACE OFF"

Close #intFilenum

————————————————————–

For those with a Windows client computer that does not have Excel installed, the following script may be saved with a VBS extension and executed directly from Windows to generate the test script.

Dim objFSO
Dim objFile
Dim strOut
Dim i
Dim j
Dim k
Dim intFlag

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.CreateTextFile("C:\ColumnOrder2Script.sql", True)

strOut = "CREATE TABLE T10 (" & vbCrLf

For i = 1 To 1000
   strOut = strOut & "  C" & i & " VARCHAR2(10)," & vbCrLf
Next

strOut = Left(strOut, Len(strOut) - 3) & ");" & vbCrLf

objFile.Write strOut & vbCrLf

objFile.Write "ALTER TABLE T10 CACHE;" & vbCrLf & vbCrLf

strOut = "INSERT INTO" & vbCrLf
strOut = strOut & "  T10" & vbCrLf
strOut = strOut & "SELECT" & vbCrLf

For i = 1 To 1000
    intFlag = False
    If i Mod 50 = 0 Then
        intFlag = True
    Else
        Select Case i
            Case 1, 255, 256, 510, 511, 765, 766, 1000
                intFlag = True
        End Select
    End If
    If intFlag = True Then
        strOut = strOut & "  RPAD(CHR(65+MOD(ROWNUM-1,20))||'" & i & "',10,'A')," & vbCrLf
    Else
        strOut = strOut & "  NULL," & vbCrLf
    End If
Next

strOut = Left(strOut, Len(strOut) - 3) & vbCrLf
strOut = strOut & "FROM" & vbCrLf
strOut = strOut & "  DUAL" & vbCrLf
strOut = strOut & "CONNECT BY" & vbCrLf
strOut = strOut & "  LEVEL<=1000000;" & vbCrLf

objFile.Write strOut & vbCrLf

objFile.Write "COMMIT;" & vbCrLf & vbCrLf
objFile.Write "EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T10',CASCADE=>TRUE)" & vbCrLf
objFile.Write "" & vbCrLf
'objFile.Write "SET AUTOTRACE TRACEONLY STATISTICS" & vbCrLf
objFile.Write "" & vbCrLf
objFile.Write "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT FOREVER, LEVEL 1';" & vbCrLf
objFile.Write "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'IGNORE';" & vbCrLf
objFile.Write "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';" & vbCrLf & vbCrLf

For j = 1 To 2
    If j = 2 Then
        objFile.Write "SET TIMING ON" & vbCrLf
        objFile.Write "SPOOL COLUMN_ORDER2.TXT" & vbCrLf
        objFile.Write "ALTER SESSION SET TRACEFILE_IDENTIFIER = 'COLUMN_ORDER_TEST2';" & vbCrLf & vbCrLf
    End If
    For i = 1 To 1000
        intFlag = False
        If i Mod 50 = 0 Then
            intFlag = True
        Else
            Select Case i
                Case 1, 255, 256, 510, 511, 765, 766, 1000
                    intFlag = True
            End Select
        End If
        If intFlag = True Then
            For k = 1 To 10
                If (j = 2) Or (k <= 3) Then
                    'If on the second pass - the one that counts, run SQL 10 times
                    strOut = "SELECT" & vbCrLf
                    strOut = strOut & "  COUNT(C" & i & ") C" & vbCrLf
                    strOut = strOut & "FROM" & vbCrLf
                    strOut = strOut & "  T10;" & vbCrLf
                    objFile.Write strOut & vbCrLf
                End If
            Next
        End If
    Next
Next

objFile.Write "ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';" & vbCrLf
objFile.Write "ALTER SESSION SET EVENTS '10949 TRACE NAME CONTEXT OFF';" & vbCrLf
objFile.Write "SET TIMING OFF" & vbCrLf
'objFile.Write "SET AUTOTRACE OFF" & vbCrLf

objFile.Close

set objFSO = Nothing
Set objFile = Nothing

ColumnOrder2ScriptCreator.vbs

————————————————————–

The above script might be a little complicated to understand.  What the script does:

  • Create a 1,000 column table with columns defined as VARCHAR2(10) – the columns are named C1, C2, C3, …, C999, C1000.
  • Alters the table to hopefully instruct Oracle to keep the table’s blocks in the buffer cache after those block have been read.
  • Inserts a letter between “A” and “Z” followed by the column position, padded to 10 characters, in every 50th column, as well as the columns that should mark the start column and end column of each row piece (columns 1, 255, 256, 510, 511, 765, 766, 1000).
  • Gathers statistics on the table and any indexes (none in this case).
  • Disables serial direct path read by setting event 10949 (try taking that line out of the script to see how the performance changes).
  • Creates essentially 2 stages of querying the COUNT of every 50th column, in addition to the columns that should represent the end points of each row piece (1, 255, 256, 510, 511, 765, 766, 1000).  The first stage is used to load the blocks into the buffer cache and allow Oracle’s optimizer to settle on the final execution plan (this seems to be necessary on Oracle Database 11.2.0.1), each query is executed three times in this stage.  For the second stage, the trace file identifier (for the 10046 trace) is changed, and each query is listed 10 times to help average the execution time.

Note that the original script included a bug that did not prevent the query output from being suppressed – the SET AUTOTRACE line was incorrectly specified.  I suggest leaving that line commented out in the script generating code, but change the script so that an alias is not assigned to the COUNT() column or assign a unique alias to the column so that the timing that is output on screen may be easily related to the column that is being selected.  The generated script may also be downloaded here: ColumnOrder2Script.sql (strip off the .doc extension and open with a text editor, or run directly with SQL*Plus).

If you add a /*+ Find Me 2 */ comment to each of the SQL statements and change the STATISTICS_LEVEL parameter to ALL, you can use the following to retrieve the actual execution plans and execution statistics:

SET TIMING OFF
SET PAGESIZE 1000
SET LINESIZE 160

SELECT /*+ LEADING(S) */
  T.PLAN_TABLE_OUTPUT
FROM
  (SELECT
    SQL_ID,
    CHILD_NUMBER
  FROM
    V$SQL
  WHERE
    SQL_TEXT LIKE 'SELECT /*+ Find Me 2 */%') S,
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(S.SQL_ID,S.CHILD_NUMBER,'ALLSTATS LAST +COST')) T;

SPOOL OFF

So, what were the results when the script was executed on 64 bit Oracle 11.2.0.1 on the Linux and Windows platforms (with equivalent hardware) using an 8KB block size in an ASSM autoallocate tablespace?

The spool files:

The TKPROF summaries of the 10046 trace files:

In the following summary table, the TKPROF elapsed time totals for the 10 executions of counting each of the non-NULL VARCHAR2(10) columns in the test table is displayed, as well as the elapsed time for that execution divided by the elapsed time for the count of the first column (updated May 24, 2010 to include results from 64 bit Windows running 11.1.0.7 for comparison):

The Factor of C1 column indicates how many times longer the count of that column required compared to the first column.  Counting column 1,000 on Linux required 8.2 times as much time as counting column 1.  On the Windows platform, counting column 1,000 required 7.92 times as much time as counting column C1.  If you review the TKPROF summaries for both platforms you will see an interesting, possibly unexpected, behavior with the way the number of consistent gets increased the further away from the first column our query counted.  The query of column C1 required 1,819,010 consistent gets for 10 executions, while the query of column C1000 required 31,819,010 consistent gets for 10 executions – when you consider that each row required roughly 1300 bytes, that is a significant number of consistent gets for a table whose rows never expanded in size after the initial insert.  The increase in consistent gets is not linear, and neither is the increase in execution time.








Follow

Get every new post delivered to your Inbox.

Join 141 other followers