SQL – Reformatting to Improve Performance 1

4 12 2009

December 4, 2009

(Forward to the Next Post in the Series)

Note: the Oracle cost-based query optimizer is able to automatically transform queries submitted by applications to produce an equivalent SQL statement that will hopefully improve SQL execution performance. The possible transformations are limited by the Oracle release version, the value of the OPTIMIZER_FEATURES_ENABLE parameter (and various other hidden parameters), the potential presence of NULL values, and hints embedded in the SQL statement.

Some time ago the following question appeared in a forum:
http://groups.google.com/group/oracle_dba_experts/browse_thread/thread/9b81c1b5f729813c

There are three tables A B C with 18 Million, 6 M and 8M records respectivly.

select count(1) from
( Select * from A
where
not exists ( select 'x' from B where join on Keys )
and not exists ( select 'x' from C where join on keys )
)

The explain plan shows that it is using all key indexes for all tables and cost is 7366 for table A Index full scan. And other tables have a index Range scan 3 and 4. What could be a reason for this discrepency.

We are using oracle 9i , Linux operating system. For confidentiality I have renamed the tables

Query:

select /*+ parallel */
*
from A    where not exists
    (select /*+ parallel */ 'X'
             from b
            where stud_id = A.stud_id
              and cpnt_typ_id = A.cpnt_typ_id
              and cpnt_id = A.cpnt_id
              and rev_dte = A.rev_dte
              and cmpl_stat_id = A.cmpl_stat_id
              and compl_dte = a.compl_dte)
      and not exists
    (select /*+ parallel */ 'X'
             from c
            where stud_id = A.stud_id
              and cpnt_typ_id =A.cpnt_typ_id
              and cpnt_id = A.cpnt_id
              and rev_dte = A.rev_dte
              and cmpl_stat_id = A.cmpl_stat_id
              and compl_dte = A.compl_dte)

Plan :

SELECT STATEMENT, GOAL = CHOOSE           Cost=7400  Cardinality=    1   Bytes=     62   IO cost=7400
 SORT AGGREGATE                                      Cardinality=    1   Bytes=     62
  FILTER
   INDEX FAST FULL SCAN         PK_A      Cost=7393  Cardinality=46754   Bytes=2898748   IO cost=7393
   INDEX RANGE SCAN             IDX_B_1   Cost=   3  Cardinality=    1   Bytes=     58   IO cost=   3
   TABLE ACCESS BY INDEX ROWID  PA_C      Cost=   4  Cardinality=    1   Bytes=     60   IO cost=   4
    INDEX RANGE SCAN    Object  IX_C      Cost=   3  Cardinality=    1                   IO cost=   3

With the information provided, the best that I was able to offer is the following, with a manually transformed version of the SQL statement from the NOT EXISTS syntax into inline views:

——————————–

The Oracle version in part determines how Oracle will optimize and execute the SQL statement that you submit.  Certain transformations are performed automatically in 9i for NOT EXISTS queries, regardless of whether or not the cost will increase due to the transformation.  The cardinality numbers reported in the explain plan (a DBMS Xplan would be better) do not appear to be consistent with three tables having 18 million, 6 million and 8 million rows.  When collecting statistics, make certain that the DBMS_STATS routine is called with CASCADE=>TRUE specified.

Jonathan Lewis has commented on his blog a couple times about using PARALLEL, which only resulted in even longer execution times.  You might try reformatting the SQL statement into an equivalent form, and removing the PARALLEL hints.  Something like this:

SELECT
  A.*
FROM
  A,
  (SELECT DISTINCT
    STUD_ID,
    CPNT_TYP_ID,
    CPNT_ID,
    REV_DTE,
    CMPL_STAT_ID,
    COMPL_DTE
  FROM
    B) B,
  (SELECT DISTINCT
    STUD_ID,
    CPNT_TYP_ID,
    CPNT_ID,
    REV_DTE,
    CMPL_STAT_ID,
    COMPL_DTE
  FROM
    C) C
WHERE
  A.STUD_ID = B.STUD_ID(+)
  AND A.CPNT_TYP_ID = B.CPNT_TYP_ID(+)
  AND A.CPNT_ID = B.CPNT_ID(+)
  AND A.REV_DTE = B.REV_DTE(+)
  AND A.CMPL_STAT_ID = B.CMPL_STAT_ID(+)
  AND A.COMPL_DTE = B.COMPL_DTE(+)
  AND B.STUD_ID IS NULL
  AND A.STUD_ID = C.STUD_ID(+)
  AND A.CPNT_TYP_ID = C.CPNT_TYP_ID(+)
  AND A.CPNT_ID = C.CPNT_ID(+)
  AND A.REV_DTE = C.REV_DTE(+)
  AND A.CMPL_STAT_ID = C.CMPL_STAT_ID(+)
  AND A.COMPL_DTE = C.COMPL_DTE(+)
  AND C.STUD_ID IS NULL;

Compare the performance of the above with the performance of your original SQL statement using a 10046 trace at level 8.  The above will likely result in 2 sorting operations in 9i, so the SORT_AREA_SIZE parameter may need to be adjusted.

——————————–

Any other ideas?





Transfer a Text File into an Oracle Database using an Excel Macro

2 12 2009

December 2, 2009

The following Excel macro shows how to open a text file as a database using ADO, create a table in Oracle, and then transfer the rows from the text file into the database table using bind variables.  Once the transfer finishes, a worksheet in Excel is populated with the data from the Oracle table.

Private Sub TransferRows()
    'Need to add a reference to Microsoft ActiveX Data Objects 2.8 Library before starting
    Dim i As Integer
    Dim intResult As Integer
    Dim strSQL As String
    Dim strTable As String
    Dim strDBTable As String
    Dim snpData As ADODB.Recordset
    Dim comDataInsert As ADODB.Command

    On Error Resume Next

    strTable = Sheets("ExcelQueryofTextFile").Cells(1, 2).Value
    strTable = Right(strTable, Len(strTable) - InStrRev(strTable, "\"))
    'Set the database table name to the file name without the extension
    strDBTable = UCase(Left(strTable, InStr(strTable, ".") - 1))

    Sheets("ExcelQueryofTextFile").Range("A5:M10006").Delete Shift:=xlUp

    Set snpData = New ADODB.Recordset

    strSQL = "SELECT" & vbCrLf
    strSQL = strSQL & "  [Source IP Address]," & vbCrLf
    strSQL = strSQL & "  [Destination IP Address]," & vbCrLf
    strSQL = strSQL & "  [Time]," & vbCrLf
    strSQL = strSQL & "  [Source Port]," & vbCrLf
    strSQL = strSQL & "  [Destination Port]," & vbCrLf
    strSQL = strSQL & "  [L3 Protocol]," & vbCrLf
    strSQL = strSQL & "  [Application Path]," & vbCrLf
    strSQL = strSQL & "  [Application Description]," & vbCrLf
    strSQL = strSQL & "  [Rule Description]" & vbCrLf
    strSQL = strSQL & "FROM" & vbCrLf
    strSQL = strSQL & "  [" & strTable & "]" & vbCrLf
    strSQL = strSQL & "ORDER BY" & vbCrLf
    strSQL = strSQL & "  [Source IP Address]," & vbCrLf
    strSQL = strSQL & "  [Time]"
    snpData.Open strSQL, dbFile

    If snpData.State = 1 Then
        strSQL = "CREATE TABLE " & strDBTable & "(" & vbCrLf
        strSQL = strSQL & "  SOURCE_IP VARCHAR2(16)," & vbCrLf
        strSQL = strSQL & "  DESTINATION_IP VARCHAR2(16)," & vbCrLf
        strSQL = strSQL & "  ACCESS_TIME DATE," & vbCrLf
        strSQL = strSQL & "  SOURCE_PORT NUMBER(12)," & vbCrLf
        strSQL = strSQL & "  DESTINATION_PORT NUMBER(12)," & vbCrLf
        strSQL = strSQL & "  PROTOCOL VARCHAR2(15)," & vbCrLf
        strSQL = strSQL & "  APPLICATION_PATH VARCHAR2(100)," & vbCrLf
        strSQL = strSQL & "  APPLICATION_DESCRIPTION VARCHAR2(100)," & vbCrLf
        strSQL = strSQL & "  RULE_DESCRIPTION VARCHAR2(30))" & vbCrLf
        dbVMFG.Execute strSQL

        Set comDataInsert = New ADODB.Command
        With comDataInsert
            strSQL = "INSERT INTO " & strDBTable & "(" & vbCrLf
            strSQL = strSQL & "  SOURCE_IP," & vbCrLf
            strSQL = strSQL & "  DESTINATION_IP," & vbCrLf
            strSQL = strSQL & "  ACCESS_TIME," & vbCrLf
            strSQL = strSQL & "  SOURCE_PORT," & vbCrLf
            strSQL = strSQL & "  DESTINATION_PORT," & vbCrLf
            strSQL = strSQL & "  PROTOCOL," & vbCrLf
            strSQL = strSQL & "  APPLICATION_PATH," & vbCrLf
            strSQL = strSQL & "  APPLICATION_DESCRIPTION," & vbCrLf
            strSQL = strSQL & "  RULE_DESCRIPTION)" & vbCrLf
            strSQL = strSQL & "VALUES(" & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?," & vbCrLf
            strSQL = strSQL & "  ?)" & vbCrLf

            .Parameters.Append .CreateParameter("source_ip", adVarChar, adParamInput, 16)
            .Parameters.Append .CreateParameter("destination_ip", adVarChar, adParamInput, 16)
            .Parameters.Append .CreateParameter("access_time", adDate, adParamInput, 8 )
            .Parameters.Append .CreateParameter("source_port", adNumeric, adParamInput, 8 )
            .Parameters.Append .CreateParameter("destination_port", adNumeric, adParamInput, 8 )
            .Parameters.Append .CreateParameter("protocol", adVarChar, adParamInput, 15)
            .Parameters.Append .CreateParameter("application_path", adVarChar, adParamInput, 100)
            .Parameters.Append .CreateParameter("application_description", adVarChar, adParamInput, 100)
            .Parameters.Append .CreateParameter("rule_description", adVarChar, adParamInput, 30)

            'Set up the command properties
            .CommandText = strSQL
            .CommandType = adCmdText
            .CommandTimeout = 30
            .ActiveConnection = dbVMFG
        End With

        If Err = 0 Then
            dbVMFG.BeginTrans
            Do While Not snpData.EOF
                comDataInsert("source_ip") = snpData(0).Value
                comDataInsert("destination_ip") = snpData(1).Value
                comDataInsert("access_time") = CDate(snpData(2).Value)
                comDataInsert("source_port") = Val(snpData(3).Value)
                comDataInsert("destination_port") = Val(snpData(4).Value)
                comDataInsert("protocol") = Left(snpData(5).Value, 15)
                comDataInsert("application_path") = Left(snpData(6).Value, 100)
                comDataInsert("application_description") = Left(snpData(7).Value, 100)
                comDataInsert("rule_description") = Left(snpData(8).Value, 30)
                comDataInsert.Execute

                snpData.MoveNext
            Loop

            snpData.Close
            If Err = 0 Then
                dbVMFG.CommitTrans
            Else
                dbVMFG.RollbackTrans
            End If

            strSQL = "SELECT" & vbCrLf
            strSQL = strSQL & "  SOURCE_IP," & vbCrLf
            strSQL = strSQL & "  DESTINATION_IP," & vbCrLf
            strSQL = strSQL & "  ACCESS_TIME," & vbCrLf
            strSQL = strSQL & "  SOURCE_PORT," & vbCrLf
            strSQL = strSQL & "  DESTINATION_PORT," & vbCrLf
            strSQL = strSQL & "  PROTOCOL," & vbCrLf
            strSQL = strSQL & "  APPLICATION_PATH," & vbCrLf
            strSQL = strSQL & "  APPLICATION_DESCRIPTION," & vbCrLf
            strSQL = strSQL & "  RULE_DESCRIPTION" & vbCrLf
            strSQL = strSQL & "FROM" & vbCrLf
            strSQL = strSQL & "  " & strDBTable & vbCrLf
            strSQL = strSQL & "ORDER BY" & vbCrLf
            strSQL = strSQL & "  SOURCE_IP," & vbCrLf
            strSQL = strSQL & "  ACCESS_TIME"
            snpData.Open strSQL, dbVMFG

            If snpData.State = 1 Then
                'The fast way to place the query results into cells
                For i = 0 To snpData.Fields.Count - 1
                    Sheets("ExcelQueryofTextFile").Cells(5, i + 1).Value = snpData.Fields(i).Name
                Next i
                Sheets("ExcelQueryofTextFile").Range(Sheets("ExcelQueryofTextFile").Cells(5, 1), Sheets("ExcelQueryofTextFile").Cells(5, snpData.Fields.Count)).Font.Bold = True

                Sheets("ExcelQueryofTextFile").Range("A6").CopyFromRecordset snpData

                snpData.Close

            End If
        Else
            intResult = MsgBox("Could not create the table " & strDBTable & " in the database." & vbCrLf & Error(Err), 16, "Excel Demo")
        End If
    End If

    Sheets("ExcelQueryofTextFile").Range("A6").Select
    ActiveWindow.FreezePanes = True

    Set snpData = Nothing
    Set comDataInsert = Nothing
End Sub




Retrieve Data to Excel with a Macro using ADO

2 12 2009

The following Excel macro shows how to connect to an Oracle database using ADO, create a new worksheet in the current work book, and then fill in the returned data using a slow method (not recommended) and a fast method.

'Need to add a reference to Microsoft ActiveX Data Objects 2.8 Library before starting
'*** This section might be in the top portion of the user form, worksheet code, or module:
'Declare a connection object in the general section to hold the connection to the database
Dim dbVMFG As ADODB.Connection
 
'Declare a set of variables to hold the username and password for the database
Dim strUserName As String
Dim strPassword As String
Dim strDatabase As String
'***
 
Dim i as Integer
Dim intResult As Integer
Dim strSQL as String
Dim lngRow as Long
    
On Error Resume Next
 
strDatabase = "MyDB"
 
strUserName = "MyUser"
strPassword = "MyPassword"
 
'Connect to the database
'Oracle connection string
Set dbVMFG = New ADODB.Connection
dbVMFG.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUserName & ";Password=" & strPassword & ";ChunkSize=1000;FetchSize=100;"
 
dbVMFG.ConnectionTimeout = 40
dbVMFG.CursorLocation = adUseClient
dbVMFG.Open
 
If (dbVMFG.State <> 1) Or (Err <> 0) Then
    intResult = MsgBox("Could not connect to the database.  Check your user name and password." & vbCrLf & Error(Err), 16, "Excel Demo")
Else
    strSQL = "SELECT" & VBCrLf
    strSQL = strSQL & "  ID AS RESOURCE_ID," & VBCrLf
    strSQL = strSQL & "  DESCRIPTION" & VBCrLf
    strSQL = strSQL & "FROM" & VBCrLf
    strSQL = strSQL & "  MY_TABLE" & VBCrLf
    strSQL = strSQL & "WHERE" & VBCrLf
    strSQL = strSQL & "  DESCRIPTION LIKE '%10%'" & VBCrLf
    strSQL = strSQL & "ORDER BY" & VBCrLf
    strSQL = strSQL & "  ID"
 
    'Add a new worksheet to the new workbook, add after the last sheet
    ActiveWorkbook.Sheets.Add
    ActiveWorkbook.ActiveSheet.Name = "JustATest"
        
    snpData.Open strSQL, dbVMFG
        
    If snpData.State = 1 Then
        'Slow Method------------------------------------
        lngRow = 0
        'Header Row
        For i = 0 To snpData.Fields.Count – 1
            lngRow = lngRow + 1
            ActiveSheet.Cells(lngRow, i + 1).Value = snpData.Fields(i).Name
            ActiveSheet.Cells(lngRow, i + 1).Font.Bold = True
        Next i
 
        'Detail Rows
        Do While Not snpData.EOF
            lngRow = lngRow + 1
            For i = 0 To snpData.Fields.Count - 1
                ActiveSheet.Cells(lngRow, i + 1).Value = snpData.Fields(i)
            Next i
 
            snpData.MoveNext
        Loop
        'End Slow Method------------------------------------
 
        'Fast Method----------------------------------------
        'Do not use this and the slow method!
        For i = 0 To snpData.Fields.Count - 1
            ActiveSheet.Cells(1, i + 1).Value = snpData.Fields(i).Name
        Next i
        ActiveSheet.Range(ActiveSheet.Cells(1, 1), ActiveSheet.Cells(1, snpData.Fields.Count)).Font.Bold = True
    
        ActiveSheet.Range("A2").CopyFromRecordset snpData
            
        'Auto-fit up to 26 columns
        ActiveSheet.Columns("A:" & Chr(64 + snpData.Fields.Count)).AutoFit
        'End Fast Method----------------------------------------
 
        snpData.Close
    End If
End If




SQL – Methods of Reformatting into Equivalent Forms 5

2 12 2009

December 2, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A couple of years ago someone posted in the comp.databases.oracle.database Usenet group the following challenge:

http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/fbe588a109b3de20

Here’s one of my personal favorites and generally a good interview question too.

Assume two identical tables, one named “A” the other “B” with identical column definitions. Assume that some rows in “A” are duplicated in “B” and some in “B” are duplicated in “A” but each table contains rows unique to that table.

Write a single SELECT statement that will retrieve all of the rows from table “A” that are unique to “A”, all the rows from “B” that are unique to “B” and label each retrieved row with the name of the table in which it was found.

Have fun (and yes there are multiple solutions).

This message thread, like several others, generated suggestions from several people.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

EscVector offered the following solution:

CREATE TABLE A
  (    COL1 CHAR(4),
       COL2 NUMBER,
       COL3 VARCHAR2(10));

 begin
  for x in 1..10
  loop
  insert into a values ('ab'||x, x,'NONUNIQUE');
  end loop;
  end;
  /

 create table B as select * from a;

 begin
  for x in 1..10
  loop
  insert into a values ('a'||x, x,'UNIQUE');
  end loop;
  end;
  /

 begin
  for x in 1..10
  loop
  insert into b values ('b'||x, x,'UNIQUE');
  end loop;
  end;
  /

 commit;

  (select a.col1 ,a.col2 ,a.col3, 'TABA'  from a minus select
b.col1,b.col2,b.col3, 'TABA' from b )
  union
  (select b.col1,b.col2,b.col3 ,'TABB' from b minus select
a.col1,a.col2,a.col3 ,'TABB' from a );

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I offered the following solutions:

CREATE TABLE TABLE_A (
  COL1 VARCHAR2(20),
  COL2 VARCHAR2(20),
  COL3 VARCHAR2(20));

CREATE TABLE TABLE_B (
  COL1 VARCHAR2(20),
  COL2 VARCHAR2(20),
  COL3 VARCHAR2(20));

INSERT INTO TABLE_A VALUES ('TEST1A','TEST1B','TEST1C');
INSERT INTO TABLE_A VALUES ('TEST2A','TEST2B','TEST2C');
INSERT INTO TABLE_A VALUES ('TEST3A','TEST3B','TEST3C');
INSERT INTO TABLE_A VALUES ('TEST4A','TEST4B','TEST4C');
INSERT INTO TABLE_A VALUES ('TEST5A','TEST5B','TEST5C');

INSERT INTO TABLE_B VALUES ('TEST1A','TEST1B','TEST1C');
INSERT INTO TABLE_B VALUES ('TEST2A','TEST1B','TEST1C');
INSERT INTO TABLE_B VALUES ('TEST3A','TEST3B','TEST3C');
INSERT INTO TABLE_B VALUES ('TEST5A','TEST5B','TEST5C');

SELECT DISTINCT
  NVL(A.COL1,B.COL1) COL1,
  NVL(A.COL2,B.COL2) COL2,
  NVL(A.COL3,B.COL3) COL3,
  NVL2(A.COL1,'TABLE A','TABLE B') FROM_TABLE
FROM
  TABLE_A A
FULL OUTER JOIN
  TABLE_B B ON A.COL1=B.COL1 AND A.COL2=B.COL2 AND A.COL3=B.COL3
WHERE
  (A.COL1 IS NULL OR B.COL1 IS NULL)
  OR (A.COL2 IS NULL OR B.COL2 IS NULL)
  OR (A.COL3 IS NULL OR B.COL3 IS NULL);

COL1    COL2    COL3    FROM_TABLE
TEST2A  TEST2B  TEST2C  TABLE A
TEST4A  TEST4B  TEST4C  TABLE A
TEST2A  TEST1B  TEST1C  TABLE B

Extra credit:
SELECT DISTINCT
  NVL(A.COL1,B.COL1) COL1,
  NVL(A.COL2,B.COL2) COL2,
  NVL(A.COL3,B.COL3) COL3,
  NVL2(A.COL1,'TABLE A','TABLE B') FROM_TABLE
FROM
  TABLE_A A
FULL OUTER JOIN
  TABLE_B B ON A.COL1=B.COL1 AND A.COL2=B.COL2 AND A.COL3=B.COL3
WHERE
  UTL_I18N.RAW_TO_CHAR(DBMS_CRYPTO.HASH(UTL_I18N.STRING_TO_RAW(NVL(A.COL1,'1'),
'AL32UTF8'),1)||DBMS_CRYPTO.HASH(UTL_I18N.STRING_TO_RAW(NVL(A.COL2,'1'),
'AL32UTF8'),1)||DBMS_CRYPTO.HASH(UTL_I18N.STRING_TO_RAW(NVL(A.COL3,'1'), 'AL32UTF8'),1),'AL32UTF8')<> UTL_I18N.RAW_TO_CHAR(DBMS_CRYPTO.HASH(UTL_I18N.STRING_TO_RAW(NVL(B.COL1,'1'),
'AL32UTF8'),1)||DBMS_CRYPTO.HASH(UTL_I18N.STRING_TO_RAW(NVL(B.COL2,'1'),
'AL32UTF8'),1)||DBMS_CRYPTO.HASH(UTL_I18N.STRING_TO_RAW(NVL(B.COL3,'1'), 'AL32UTF8'),1),'AL32UTF8');

COL1    COL2    COL3    FROM_TABLE
TEST2A  TEST2B  TEST2C  TABLE A
TEST4A  TEST4B  TEST4C  TABLE A
TEST2A  TEST1B  TEST1C  TABLE B

Here are a couple more solutions, using more than one SELECT in a SQL
statement:
SELECT
  A.COL1,
  A.COL2,
  A.COL3,
  'TABLE A' FROM_TABLE
FROM
  TABLE_A A,
  (SELECT
    A.COL1,
    A.COL2,
    A.COL3
  FROM
    TABLE_A A
  INTERSECT
  SELECT
    B.COL1,
    B.COL2,
    B.COL3
  FROM
    TABLE_B B) M
WHERE
  A.COL1=M.COL1(+)
  AND A.COL2=M.COL2(+)
  AND A.COL3=M.COL3(+)
  AND M.COL1 IS NULL
UNION ALL
SELECT
  B.COL1,
  B.COL2,
  B.COL3,
  'TABLE B' FROM_TABLE
FROM
  TABLE_B B,
  (SELECT
    A.COL1,
    A.COL2,
    A.COL3
  FROM
    TABLE_A A
  INTERSECT
  SELECT
    B.COL1,
    B.COL2,
    B.COL3
  FROM
    TABLE_B B) M
WHERE
  B.COL1=M.COL1(+)
  AND B.COL2=M.COL2(+)
  AND B.COL3=M.COL3(+)
  AND M.COL1 IS NULL;

WITH M AS (
SELECT
  A.COL1,
  A.COL2,
  A.COL3
FROM
  TABLE_A A
INTERSECT
SELECT
  B.COL1,
  B.COL2,
  B.COL3
FROM
  TABLE_B B)
SELECT
  A.COL1,
  A.COL2,
  A.COL3,
  'TABLE A' FROM_TABLE
FROM
  TABLE_A A,
  M
WHERE
  A.COL1=M.COL1(+)
  AND A.COL2=M.COL2(+)
  AND A.COL3=M.COL3(+)
  AND M.COL1 IS NULL
UNION ALL
SELECT
  B.COL1,
  B.COL2,
  B.COL3,
  'TABLE B' FROM_TABLE
FROM
  TABLE_B B,
  M
WHERE
  B.COL1=M.COL1(+)
  AND B.COL2=M.COL2(+)
  AND B.COL3=M.COL3(+)
  AND M.COL1 IS NULL;

SELECT
  A.COL1,
  A.COL2,
  A.COL3,
  'TABLE A' FROM_TABLE
FROM
  TABLE_A A
WHERE
  (A.COL1,A.COL2,A.COL3) NOT IN (
    SELECT DISTINCT
      B.COL1,
      B.COL2,
      B.COL3
    FROM
      TABLE_B B)
UNION ALL
SELECT
  B.COL1,
  B.COL2,
  B.COL3,
  'TABLE B' FROM_TABLE
FROM
  TABLE_B B
WHERE
  (B.COL1,B.COL2,B.COL3) NOT IN (
    SELECT DISTINCT
      A.COL1,
      A.COL2,
      A.COL3
    FROM
      TABLE_A A);

SELECT
  A.COL1,
  A.COL2,
  A.COL3,
  'TABLE A' FROM_TABLE
FROM
  TABLE_A A,
  (SELECT DISTINCT
    B.COL1,
    B.COL2,
    B.COL3
  FROM
    TABLE_B B) B
WHERE
  A.COL1=B.COL1(+)
  AND A.COL2=B.COL2(+)
  AND A.COL3=B.COL3(+)
  AND B.COL3 IS NULL
UNION ALL
SELECT
  B.COL1,
  B.COL2,
  B.COL3,
  'TABLE B' FROM_TABLE
FROM
  TABLE_B B,
  (SELECT DISTINCT
    A.COL1,
    A.COL2,
    A.COL3
  FROM
    TABLE_A A) A
WHERE
  B.COL1=A.COL1(+)
  AND B.COL2=A.COL2(+)
  AND B.COL3=A.COL3(+)
  AND A.COL3 IS NULL;

SELECT
  COL1,
  COL2,
  COL3,
  FROM_TABLE
FROM
(SELECT
  COL1,
  COL2,
  COL3,
  FROM_TABLE,
  COUNT(DISTINCT FROM_TABLE) OVER (PARTITION BY COL1,COL2,COL3)
NUM_TABLES
FROM
(SELECT
  A.COL1,
  A.COL2,
  A.COL3,
  'TABLE A' FROM_TABLE
FROM
  TABLE_A A
UNION ALL
SELECT
  B.COL1,
  B.COL2,
  B.COL3,
  'TABLE B' FROM_TABLE
FROM
  TABLE_B B))
WHERE
  NUM_TABLES=1;




SQL – Methods of Reformatting into Equivalent Forms 4

2 12 2009

December 2, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A couple of years ago someone asked in the comp.databases.oracle.misc Usenet group the following question:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/1ce4e2a5bfa5d86e

I have to limit the number of results without using ROWNUM or something like that. One hint was to use “JOIN”.

I have to select the 10 biggest persons from a table “persons”

id, firstname, lastname, size, age

WITHOUT using ROWNUM or “limitter” like this.

Do you have any idea?

This message thread, like several others, generated suggestions from several people.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I suggested the following:

Method #1:
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  (SELECT
    LAG(ID,1,-1) OVER (ORDER BY HEIGHT DESC) SIGNAL,
    HEIGHT END_HEIGHT,
    LEAD(HEIGHT,9,0) OVER (ORDER BY HEIGHT DESC) START_HEIGHT
  FROM
    T1) R,
  T1
WHERE
  R.SIGNAL=-1
  AND T1.HEIGHT BETWEEN R.START_HEIGHT AND R.END_HEIGHT;

Method #2:
SELECT
  T1.ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  (SELECT
    ID,
    PERCENT_RANK() OVER (ORDER BY T1.HEIGHT DESC) POSITION,
    TR.TOTAL_ROWS
  FROM
    (SELECT
      COUNT(*) TOTAL_ROWS
    FROM
      T1) TR,
    T1) TR,
  T1
WHERE
  TR.POSITION<=(10/TR.TOTAL_ROWS)
  AND TR.ID=T1.ID;

Method #3:
SELECT
  R.ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  (SELECT
    ID,
    ROW_NUMBER() OVER (ORDER BY HEIGHT DESC) POSITION
  FROM
    T1) R,
  T1
WHERE
  R.POSITION<=10
  AND R.ID=T1.ID;
The results of the above methods look something like this:
ID FIRSTNAME  LASTNAME   HEIGHT     AGE
---------------------------------------
 2 SUSAN      SMITH          65      20
 3 DOROTHY    SMITH          62      21
 4 JOHN       SMITH          72      35
 5 DAVID      SMITH          73      34
 7 ROBERT     SMITH          76      45
10 SUSAN      JOHNSON        65.5    20
11 DOROTHY    JOHNSON        62.5    21
12 JOHN       JOHNSON        72.5    35
13 DAVID      JOHNSON        73.5    34
15 ROBERT     JOHNSON        79      45

Method #4:
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  (SELECT
    ID,
    FIRSTNAME,
    LASTNAME,
    HEIGHT,
    AGE,
    (FIRST_VALUE(HEIGHT) OVER (ORDER BY HEIGHT DESC))*HEIGHT MY_PROD,
    MAX_PROD
  FROM
    (SELECT
      MAX(PROD) MAX_PROD
    FROM
      (SELECT
        HEIGHT*LEAD(HEIGHT,9,0) OVER (ORDER BY HEIGHT DESC) PROD
      FROM
        T1)),
    T1)
WHERE
  MY_PROD>=MAX_PROD;

Method #5:
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  T1
MINUS
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  (SELECT
    RANK() OVER (ORDER BY HEIGHT)/COUNT(*) OVER (PARTITION BY 1) ROW_PERCENT,
    (COUNT(*) OVER (PARTITION BY 1)-10)/COUNT(*) OVER (PARTITION BY 1) CUT_OFF,
    ID,
    FIRSTNAME,
    LASTNAME,
    HEIGHT,
    AGE
  FROM
    T1)
WHERE
  ROW_PERCENT<=CUT_OFF;

Method #6:
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  (SELECT
    RANK() OVER (ORDER BY HEIGHT)/COUNT(*) OVER (PARTITION BY 1) ROW_PERCENT,
    (COUNT(*) OVER (PARTITION BY 1)-10)/COUNT(*) OVER (PARTITION BY 1) CUT_OFF,
    ID,
    FIRSTNAME,
    LASTNAME,
    HEIGHT,
    AGE
  FROM
    T1)
WHERE
  ROW_PERCENT>CUT_OFF;

Method #7:
SELECT
  T1.ID,
  T1.FIRSTNAME,
  T1.LASTNAME,
  T1.HEIGHT,
  T1.AGE
FROM
  (SELECT
    MAX(COUNTER) COUNTER
  FROM
    (SELECT
      LEVEL COUNTER
    FROM
      DUAL
    CONNECT BY
      LEVEL<=10)) C,
 (SELECT
    RANK() OVER (ORDER BY HEIGHT DESC) RANKING,
    ID,
    FIRSTNAME,
    LASTNAME,
    HEIGHT,
    AGE
  FROM
    T1) T1
WHERE
  T1.RANKING<=C.COUNTER;

Method #8:
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
 (SELECT
    ID,
    FIRSTNAME,
    LASTNAME,
    HEIGHT,
    AGE
  FROM
    T1
  ORDER BY
    HEIGHT DESC)
WHERE
  ROWNUM<=10;

Method #9:
SELECT
  T1.ID,
  T1.FIRSTNAME,
  T1.LASTNAME,
  T1.HEIGHT,
  T1.AGE
FROM
  (SELECT
    ID,
    FIRSTNAME,
    LASTNAME,
    HEIGHT,
    AGE,
    COUNT(*) OVER (ORDER BY HEIGHT DESC) POSITION
  FROM
    T1) T1,
  (SELECT
    LEVEL COUNTER
  FROM
    DUAL
  CONNECT BY
    LEVEL<=10) C
WHERE
  T1.POSITION=C.COUNTER;

Method #10:
SELECT
  ID,
  FIRSTNAME,
  LASTNAME,
  HEIGHT,
  AGE
FROM
  T1
WHERE
  HEIGHT>
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT<
(SELECT
   MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT<
(SELECT
   MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT <
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT<
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT <
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT<
(SELECT
   MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT <
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT<
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT <
(SELECT
  MAX(HEIGHT)
FROM
  T1
WHERE
  HEIGHT<
(SELECT
   MAX(HEIGHT)
FROM
  T1
)))))))))));

Method #13 (loosely inspired by the above SQL statement):
SELECT
  T1.ID,
  T1.FIRSTNAME,
  T1.LASTNAME,
  T1.HEIGHT,
  T1.AGE
FROM
  T1,
  (SELECT
    T1.HEIGHT,
    COUNT(*) RANKING
  FROM
    T1,
    T1 T2
  WHERE
    T1.HEIGHT<=T2.HEIGHT
  GROUP BY
    T1.HEIGHT
  HAVING
    COUNT(*) BETWEEN 1 AND 10) T2
WHERE
  T1.HEIGHT=T2.HEIGHT;

Method #14 (a slight modification of Method #13):
SELECT
  T1.ID,
  T1.FIRSTNAME,
  T1.LASTNAME,
  T1.HEIGHT,
  T1.AGE
FROM
  T1,
  (SELECT
    T1.HEIGHT,
    COUNT(*) OVER (PARTITION BY 1) TOTAL_COUNT,
    COUNT(*) RANKING
  FROM
    T1,
    T1 T2
  WHERE
    T1.HEIGHT>T2.HEIGHT
  GROUP BY
    T1.HEIGHT) T2
WHERE
  (T2.TOTAL_COUNT-T2.RANKING+1) BETWEEN 1 AND 10
  AND T2.HEIGHT=T1.HEIGHT;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Dieter Noeth suggested the following:

SELECT t1.*
FROM t1 JOIN t1 AS t2 ON t1.HEIGHT <= t2.HEIGHT
GROUP BY t1.ID, t1.FIRSTNAME,t1.LASTNAME, t1.HEIGHT, t1.AGE
HAVING COUNT(*) <= 10

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Gerard H. Pille suggested the following:

select * from T1
 where height >= (
    select MIN(height) from T1 c1
      where 10 > (select count(*) from T1 c2 where c2.height >
c1.height)
    );




SQL – Methods of Reformatting into Equivalent Forms 3

2 12 2009

December 2, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A couple of years ago someone asked in the comp.databases.oracle.misc Usenet group the following question:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/3572ee04308f6a27

How can I get a specific character count in a string (i.e : string is 56222, and I am looking for ‘2’ occurance when i do :
select charcount(‘56222′) should return : 3 )

This message thread, like several others, generated suggestions from several people.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Anurag Varma suggested the following:

ORA92> select length('56222') - length(replace('56222','2')) from dual;

LENGTH('56222')-LENGTH(REPLACE
------------------------------
                             3

10GR2> select length(regexp_replace('56222','[^2]','')) from dual;

LENGTH(REGEXP_REPLACE('56222','[^2]',''))
-----------------------------------------
                                        3

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

DA Morgan suggested the following:

SELECT LENGTH(TRANSLATE('56222', '2013456789', '2')) FROM dual;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I then offered the following:

SELECT
  SUM(SIGN(INSTR('562225622256222','2',1,ROWNUM)))
FROM
  DUAL
CONNECT BY
  LEVEL<20;

SUM(SIGN(INSTR('562225622256222','2',1,ROWNUM)))
-------------
9

SELECT
  SUM(DECODE(SUBSTR('562225622256222',ROWNUM,1),'2',1,0))
FROM
  DUAL
CONNECT BY
  LEVEL<255;

SUM(DECODE(SUBSTR('562225622256222',ROWNUM,1),'2',1,0))
------------------
9

SELECT
  SUM(
    CASE WHEN INSTR('562225622256222','2',1,ROWNUM)>0 THEN
      1
    ELSE
      0
    END
    )
FROM
  DUAL
CONNECT BY
  LEVEL<20;

SELECT
  COUNT(
    CASE WHEN INSTR('562225622256222','2',1,ROWNUM)>0 THEN
      1
    ELSE
      NULL
    END
    )
FROM
  DUAL
CONNECT BY
  LEVEL<20;

SUM(CASEWHENINSTR('562225622256222','2',1,ROWNUM)>0THEN1ELSE0END)
9

COUNT(CASEWHENINSTR('562225622256222','2',1,ROWNUM)>0THEN1ELSENULLEND)
9

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko offered the following to test the performance of each method:

SQL> declare
   2  s number;
   3  c number;
   4  begin
   5
   6          s:=dbms_utility.get_time;
   7          for i in 1..100000 loop
   8          SELECT SUM(DECODE(SUBSTR('562225622256222',ROWNUM,1),'2',1,0))
   9          into c
  10          FROM DUAL CONNECT BY LEVEL<=15;
  11          end loop;
  12          s := dbms_utility.get_time -s;
  13          dbms_output.put_line('SUBSTR/DECODE/CONNECT BY time: '||trunc(s/100));
  14
  15          s:=dbms_utility.get_time;
  16          for i in 1..100000 loop
  17          SELECT  SUM(SIGN(INSTR('562225622256222','2',1,ROWNUM)))
  18          into c
  19          FROM DUAL CONNECT BY LEVEL<=15;
  20          end loop;
  21          s := dbms_utility.get_time -s;
  22          dbms_output.put_line('SIGN/INSTR/CONNECT BY time: '|| trunc(s/100));
  23
  24          s:=dbms_utility.get_time;
  25          for i in 1..100000 loop
  26          select length('562225622256222') - length(replace('562225622256222','2'))
  27          into c
  28          from dual;
  29          end loop;
  30          s := dbms_utility.get_time -s;
  31          dbms_output.put_line('LENGTH/REPLACE time: '|| trunc(s/100));
  32
  33          s:=dbms_utility.get_time;
  34          for i in 1..100000 loop
  35          select length(regexp_replace('562225622256222','[^2]',''))
  36          into c
  37          from dual;
  38          end loop;
  39          s := dbms_utility.get_time -s;
  40          dbms_output.put_line('REGEXP_REPLACE time: '|| trunc(s/100));
  41  end;
  42  /
SUBSTR/DECODE/CONNECT BY time: 17
SIGN/INSTR/CONNECT BY time: 18
LENGTH/REPLACE time: 13
REGEXP_REPLACE time: 13




SQL – Methods of Reformatting into Equivalent Forms 2

2 12 2009

December 2, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A couple of years ago someone asked in the comp.databases.oracle.misc Usenet group the following question:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/576ea61b1a93469b

I need to write a query that returns only the students that have read all books by an author.  I have these tables set up so far….

create table Books (
        BookTitle varchar2(20) PRIMARY KEY,
        author varchar2(20)
);

create table BookCamp (
        MemberName varchar2(20),
        BookTitle varchar2(20),
        CONSTRAINT fk_BookTitle
          FOREIGN KEY (BookTitle)
          REFERENCES Books(BookTitle)
);

insert into Books values ('Psycho', 'Brian');
insert into Books values ('Happy Rotter', 'Rocksteady');
insert into Books values ('Goblet', 'J.K Rowling');
insert into Books values ('Prisoner', 'J.K Rowling');
insert into BookCamp values ('Bob', 'Psycho');
insert into BookCamp values ('Chuck', 'Goblet');
insert into BookCamp values ('Chuck', 'Prisoner');
insert into BookCamp values ('Mike', 'Psycho');
insert into BookCamp values ('Mike', 'Goblet');
insert into BookCamp values ('Mike', 'Prisoner');
insert into BookCamp values ('Mary', 'Goblet');

So basically, if I inputted “J.K Rowling” the names “Chuck” and “Mike” should come up.  If the author is “Brian” then the names “Bob” and “Mike” should come up.  I’ve tried several things like… select membername from BookCamp where BookTitle in(select BookTitle from Books where (author = ‘J.K Rowling’)); but this obviously isn’t quite there….Any Help?

This message thread, like several others, generated suggestions from several people.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko suggested the following:

SQL> SELECT DISTINCT membername
   2  FROM bookcamp bc
   3  WHERE NOT EXISTS (
   4    SELECT NULL
   5    FROM bookcamp bc1
   6    PARTITION BY (membername)
   7    RIGHT OUTER JOIN books b
   8    ON (bc1.booktitle=b.booktitle)
   9    WHERE b.author='J.K Rowling'
  10    AND bc.membername=bc1.membername AND bc1.booktitle IS NULL)
  11  /

MEMBERNAME
--------------------
Chuck
Mike

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I then offered the following:

Maxim has provided a solution that produces the desired list.  Let’s see if we can develop another method to solve this problem.  First, a simple experiment using the analytical version of COUNT:

SELECT
  B.BOOKTITLE,
  B.AUTHOR,
  COUNT(B.BOOKTITLE) OVER (PARTITION BY B.AUTHOR) NUM_BOOKS
FROM
  BOOKS B
WHERE
  B.AUTHOR='J.K Rowling';

BOOKTITLE            AUTHOR                NUM_BOOKS
-------------------- -------------------- ----------
Prisoner             J.K Rowling                   2
Goblet               J.K Rowling                   2

Not too impressive yet, but let’s add in the second table:

SELECT
  BC.MEMBERNAME,
  B.BOOKTITLE,
  B.AUTHOR,
  COUNT(DISTINCT B.BOOKTITLE) OVER (PARTITION BY B.AUTHOR) AUTHOR_NUM_BOOKS,
  COUNT(BC.BOOKTITLE) OVER (PARTITION BY BC.MEMBERNAME, B.AUTHOR) MEMBER_NUM_BOOKS
FROM
  BOOKS B,
  BOOKCAMP BC
WHERE
  B.AUTHOR='J.K Rowling'
  AND B.BOOKTITLE=BC.BOOKTITLE(+);

Note that I changed the original COUNT(B.BOOKTITLE) to COUNT(DISTINCT B.BOOKTITLE) and changed the alias to AUTHOR_NUM_BOOKS:

MEMBERNAME           AUTHOR_NUM_BOOKS MEMBER_NUM_BOOKS
-------------------- ---------------- ----------------
Chuck                               2                2
Mary                                2                1
Mike                                2                2
Mike                                2                2
Chuck                               2                2

Now, we need a way to first eliminate all rows where AUTHOR_NUM_BOOKS is not equal to MEMBER_NUM_BOOKS, and then return a list of names without duplicates.  This can be accomplished by sliding the above SQL statement into an inline view:

SELECT DISTINCT
  MEMBERNAME
FROM
  (SELECT
    BC.MEMBERNAME,
    COUNT(DISTINCT B.BOOKTITLE) OVER (PARTITION BY B.AUTHOR)
AUTHOR_NUM_BOOKS,
    COUNT(BC.BOOKTITLE) OVER (PARTITION BY BC.MEMBERNAME, B.AUTHOR)
MEMBER_NUM_BOOKS
  FROM
    BOOKS B,
    BOOKCAMP BC
  WHERE
    B.AUTHOR='J.K Rowling'
    AND B.BOOKTITLE=BC.BOOKTITLE(+))
WHERE
  AUTHOR_NUM_BOOKS=MEMBER_NUM_BOOKS;

MEMBERNAME
--------------------
Chuck
Mike

Let’s try again, this time without analytical functions.  First, let’s find out how many of the author’s books were read by each membername:

SELECT
  BC.MEMBERNAME,
  B.AUTHOR,
  COUNT(*) MEMBER_NUM_BOOKS
FROM
  BOOKS B,
  BOOKCAMP BC
WHERE
  B.AUTHOR='J.K Rowling'
  AND BC.BOOKTITLE=B.BOOKTITLE
GROUP BY
  BC.MEMBERNAME,
  B.AUTHOR;

MEMBERNAME           AUTHOR               MEMBER_NUM_BOOKS
-------------------- -------------------- ----------------
Mike                 J.K Rowling                         2
Chuck                J.K Rowling                         2
Mary                 J.K Rowling                         1

Now, let’s determine the number of books written by each author:

SELECT
  AUTHOR,
  COUNT(*) AUTHOR_NUM_BOOKS
FROM
  BOOKS
GROUP BY
  AUTHOR;

AUTHOR               AUTHOR_NUM_BOOKS
-------------------- ----------------
Rocksteady                          1
Brian                               1
J.K Rowling                         2

Let’s put each into an inline view and pull out the membernames of interest:

SELECT DISTINCT
  BC.MEMBERNAME
FROM
  (SELECT
    BC.MEMBERNAME,
    B.AUTHOR,
    COUNT(*) MEMBER_NUM_BOOKS
  FROM
    BOOKS B,
    BOOKCAMP BC
  WHERE
    B.AUTHOR='J.K Rowling'
    AND BC.BOOKTITLE=B.BOOKTITLE
  GROUP BY
    BC.MEMBERNAME,
    B.AUTHOR) BC,
  (SELECT
    AUTHOR,
    COUNT(*) AUTHOR_NUM_BOOKS
  FROM
    BOOKS
  GROUP BY
    AUTHOR) B
WHERE
  B.AUTHOR=BC.AUTHOR
  AND B.AUTHOR_NUM_BOOKS=BC.MEMBER_NUM_BOOKS;

MEMBERNAME
--------------------
Chuck
Mike

Let’s try one more time, A simple starting point:

SELECT
  BC.MEMBERNAME,
  COUNT(BC.BOOKTITLE)
FROM
  BOOKS B,
  BOOKCAMP BC
WHERE
  B.AUTHOR='J.K Rowling'
  AND B.BOOKTITLE=BC.BOOKTITLE(+)
GROUP BY
  BC.MEMBERNAME;

MEMBERNAME           COUNT(BC.BOOKTITLE)
-------------------- -------------------
Chuck                                  2
Mary                                   1
Mike                                   2

[The above does not need to be an outer join]
Now, let’s add an inline view to retrieve the total number of books written by the author:

SELECT
  BC.MEMBERNAME,
  COUNT(BC.BOOKTITLE) MEMBER_NUM_BOOKS,
  NB.AUTHOR_NUM_BOOKS
FROM
  BOOKS B,
  BOOKCAMP BC,
  (SELECT
    COUNT(*) AUTHOR_NUM_BOOKS
  FROM
    BOOKS
  WHERE
    AUTHOR='J.K Rowling') NB
WHERE
  B.AUTHOR='J.K Rowling'
  AND B.BOOKTITLE=BC.BOOKTITLE(+)
GROUP BY
  BC.MEMBERNAME,
  NB.AUTHOR_NUM_BOOKS;

MEMBERNAME           MEMBER_NUM_BOOKS AUTHOR_NUM_BOOKS
-------------------- ---------------- ----------------
Chuck                               2                2
Mike                                2                2
Mary                                1                2

The final clean up is accomplished with a HAVING clause:

 SELECT
  BC.MEMBERNAME
FROM
  BOOKS B,
  BOOKCAMP BC,
  (SELECT
    COUNT(*) AUTHOR_NUM_BOOKS
  FROM
    BOOKS
  WHERE
    AUTHOR='J.K Rowling') NB
WHERE
  B.AUTHOR='J.K Rowling'
  AND B.BOOKTITLE=BC.BOOKTITLE(+)
GROUP BY
  BC.MEMBERNAME,
  NB.AUTHOR_NUM_BOOKS
HAVING
  COUNT(BC.BOOKTITLE)=NB.AUTHOR_NUM_BOOKS;

MEMBERNAME
--------------------
Chuck
Mike

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko then suggested the following:

Just for fun, yet another one:

SELECT MEMBERNAME
FROM   (SELECT B.MEMBERNAME,CAST(COLLECT(booktitle) AS
SYS.dbms_debug_vc2coll) BOOKLIST
         FROM   BOOKCAMP B
         GROUP  BY MEMBERNAME) M,
        (SELECT AUTHOR,CAST(COLLECT(booktitle) AS
SYS.dbms_debug_vc2coll) BOOKLIST
         FROM   BOOKS B
         GROUP  BY AUTHOR) A
WHERE  A.BOOKLIST SUBMULTISET OF M.BOOKLIST
AND    AUTHOR = 'J.K Rowling'
/

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I then offered the following:

Let’s see if there is another way – caution, this might be inefficient:
The starting point:

SELECT
  B.AUTHOR,
  B.BOOKTITLE,
  ROW_NUMBER() OVER (PARTITION BY B.AUTHOR ORDER BY B.BOOKTITLE)
BOOK_NUM,
  COUNT(B.BOOKTITLE) OVER (PARTITION BY B.AUTHOR) BOOK_COUNT
FROM
  BOOKS B
WHERE
  AUTHOR='J.K Rowling';

AUTHOR               BOOKTITLE              BOOK_NUM BOOK_COUNT
-------------------- -------------------- ---------- ----------
J.K Rowling          Goblet                        1          2
J.K Rowling          Prisoner                      2          2

Now, let’s put the book list into a comma separated list:

SELECT
  SUBSTR(SYS_CONNECT_BY_PATH(BOOKTITLE,','),2) BOOK_LIST
FROM
  (SELECT
    B.AUTHOR,
    B.BOOKTITLE,
    ROW_NUMBER() OVER (PARTITION BY B.AUTHOR ORDER BY B.BOOKTITLE) BOOK_NUM,
    COUNT(B.BOOKTITLE) OVER (PARTITION BY B.AUTHOR) BOOK_COUNT
  FROM
    BOOKS B
  WHERE
    B.AUTHOR='J.K Rowling')
WHERE
  BOOK_NUM=BOOK_COUNT
CONNECT BY PRIOR
  BOOK_NUM=BOOK_NUM-1
START WITH
  BOOK_NUM=1;

BOOK_LIST
---------------
Goblet,Prisoner

We are now half way done.  Prepare to do the same with the BOOKCAMP
table:

SELECT
  BC.MEMBERNAME,
  BC.BOOKTITLE,
  ROW_NUMBER() OVER (PARTITION BY BC.MEMBERNAME,B.AUTHOR ORDER BY B.BOOKTITLE) BOOK_NUM,
  COUNT(B.BOOKTITLE) OVER (PARTITION BY BC.MEMBERNAME,B.AUTHOR) BOOK_COUNT
FROM
  BOOKS B,
  BOOKCAMP BC
WHERE
  B.AUTHOR='J.K Rowling'
  AND B.BOOKTITLE=BC.BOOKTITLE;

MEMBERNAME           BOOKTITLE              BOOK_NUM BOOK_COUNT
-------------------- -------------------- ---------- ----------
Chuck                Goblet                        1          2
Chuck                Prisoner                      2          2
Mary                 Goblet                        1          1
Mike                 Goblet                        1          2
Mike                 Prisoner                      2          2

Generate a comma separated list for each MEMBERNAME:

SELECT
  MEMBERNAME,
  SUBSTR(SYS_CONNECT_BY_PATH(BOOKTITLE,','),2) BOOK_LIST
FROM
  (SELECT
    BC.MEMBERNAME,
    BC.BOOKTITLE,
    ROW_NUMBER() OVER (PARTITION BY BC.MEMBERNAME,B.AUTHOR ORDER BY B.BOOKTITLE) BOOK_NUM,
    COUNT(B.BOOKTITLE) OVER (PARTITION BY BC.MEMBERNAME,B.AUTHOR) BOOK_COUNT
  FROM
    BOOKS B,
    BOOKCAMP BC
  WHERE
    B.AUTHOR='J.K Rowling'
    AND B.BOOKTITLE=BC.BOOKTITLE)
WHERE
  BOOK_NUM=BOOK_COUNT
CONNECT BY PRIOR
  (MEMBERNAME||TO_CHAR(BOOK_NUM))=(MEMBERNAME||TO_CHAR(BOOK_NUM-1))
START WITH
  BOOK_NUM=1;

MEMBERNAME  BOOK_LIST
--------------------
Chuck       Goblet,Prisoner
Mary        Goblet
Mike        Goblet,Prisoner

Now, let’s put it all together to see where the author book list matches the MEMBERNAME book lists:

SELECT
  BC.MEMBERNAME
FROM
(SELECT
  SUBSTR(SYS_CONNECT_BY_PATH(BOOKTITLE,','),2) BOOK_LIST
FROM
  (SELECT
    B.AUTHOR,
    B.BOOKTITLE,
    ROW_NUMBER() OVER (PARTITION BY B.AUTHOR ORDER BY B.BOOKTITLE) BOOK_NUM,
    COUNT(B.BOOKTITLE) OVER (PARTITION BY B.AUTHOR) BOOK_COUNT
  FROM
    BOOKS B
  WHERE
    B.AUTHOR='J.K Rowling')
WHERE
  BOOK_NUM=BOOK_COUNT
CONNECT BY PRIOR
  BOOK_NUM=BOOK_NUM-1
START WITH
  BOOK_NUM=1) B,
(SELECT
  MEMBERNAME,
  SUBSTR(SYS_CONNECT_BY_PATH(BOOKTITLE,','),2) BOOK_LIST
FROM
  (SELECT
    BC.MEMBERNAME,
    BC.BOOKTITLE,
    ROW_NUMBER() OVER (PARTITION BY BC.MEMBERNAME,B.AUTHOR ORDER BY B.BOOKTITLE) BOOK_NUM,
    COUNT(B.BOOKTITLE) OVER (PARTITION BY BC.MEMBERNAME,B.AUTHOR) BOOK_COUNT
  FROM
    BOOKS B,
    BOOKCAMP BC
  WHERE
    B.AUTHOR='J.K Rowling'
    AND B.BOOKTITLE=BC.BOOKTITLE)
WHERE
  BOOK_NUM=BOOK_COUNT
CONNECT BY PRIOR
  (MEMBERNAME||TO_CHAR(BOOK_NUM))=(MEMBERNAME||TO_CHAR(BOOK_NUM-1))
START WITH
  BOOK_NUM=1) BC
WHERE
  B.BOOK_LIST=BC.BOOK_LIST;

MEMBERNAME
--------------------
Chuck
Mike

Oddly, the above executes much faster than the CAST(COLLECT(booktitle) AS SYS.dbms_debug_vc2coll) solution.  Maybe the dataset size should be increased, and the OP should post the performance results of each method to see how the first two solutions compare with the others.  I think that it would be interesting to see if the CAST(COLLECT(booktitle) AS SYS.dbms_debug_vc2coll) method scales better than the other methods.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Maxim Demenko then suggested the following:

This ties to the problem of set comparisons in sql, which i believe ( i don’t mean multiset operations) can’t be effectively solved in pure sql.
Yet one approach (borrowed from
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1…)

SELECT DISTINCT MEMBERNAME
FROM   (SELECT BC.*,
                B.*,
                SUM(B_RNK) OVER(PARTITION BY MEMBERNAME, AUTHOR) M_RNK
         FROM   BOOKCAMP BC,
                (SELECT B.*, SUM(B_RNK) OVER(PARTITION BY AUTHOR) A_RNK
                 FROM   (SELECT B.*,
                                POWER(2,
                                      DENSE_RANK() OVER(ORDER BY BOOKTITLE) - 1) B_RNK
                         FROM   BOOKS B) B) B
         WHERE  BC.BOOKTITLE = B.BOOKTITLE)
WHERE  AUTHOR = 'J.K Rowling'
AND    A_RNK = M_RNK
/

however, it’ll have its limitations too ( and on really big sets – bigger than 1000 members) – i think, all suggested solutions will not perform very well. For middle sized sets ( where the complete resultsets will fit into pga) – the best performance i saw until now ( for similar tasks) – has the model clause.





SQL – Methods of Reformatting into Equivalent Forms 1

2 12 2009

December 2, 2009

(Forward to the Next Post in the Series)

A couple of years ago someone asked in the comp.databases.oracle.misc Usenet group the following question:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/24827610d4e40a0d

I’ve tried SQL to come up with the result below but am having no luck. Please, can someone help me!!!!!:)  Believe me, any and all help will be greatly appreciated.

What I need is the task_ids that have actv_code.short_name like ‘%FIN’ where actv_code.code = ‘Lead Craft’ and no actv_code.code that equals Outage Code

 
 

task table
task_id  task_code
  1         W123456
  2         07146566
  3         07146567
  4         06230001
  5         06123321
  6         06496334
  7         W642121
  8         05462111

actv_code table
task_id  code                 short_name
   1       Outage Code         R16
   4       Outage Code         R15
   6       Outage Code         R16
   1       Lead Craft          ZFM
   5       Lead Craft          EFIN
   6       Lead Craft          MFIN
   7       Lead Craft          IFIN
   8       Outage Code         R16
   8       Lead Craft          MFIN

Result Set
task_id  task_code
   5        06123321
   7        W642121 

 

This message thread, like several others, generated suggestions from several people. 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Michel Cadot suggested the following:

Just write it as you explain it:

select task_id
from actv_code a
where short_name like '%FIN'
  and code = 'Lead Craft'
  and not exists (select null from actv_code b where b.task_id = a.task_id and b.code = 'Outage Code')
/

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I then offered the following:

Let’s start out with the table definitions and insert statements:

CREATE TABLE TASK(
  TASK_ID NUMBER(10),
  TASK_CODE VARCHAR2(10));
CREATE TABLE ACTV_CODE(
  TASK_ID NUMBER(10),
  CODE VARCHAR2(20),
  SHORT_NAME VARCHAR2(10));
INSERT INTO TASK VALUES (1,'W123456');
INSERT INTO TASK VALUES (2,'07146566');
INSERT INTO TASK VALUES (3,'07146567');
INSERT INTO TASK VALUES (4,'06230001');
INSERT INTO TASK VALUES (5,'06123321');
INSERT INTO TASK VALUES (6,'06496334');
INSERT INTO TASK VALUES (7,'W642121');
INSERT INTO TASK VALUES (8,'05462111');
COMMIT;

INSERT INTO ACTV_CODE VALUES (1,'Outage Code','R16');
INSERT INTO ACTV_CODE VALUES (4,'Outage Code','R15');
INSERT INTO ACTV_CODE VALUES (6,'Outage Code','R16');
INSERT INTO ACTV_CODE VALUES (1,'Lead Craft','ZFM');
INSERT INTO ACTV_CODE VALUES (5,'Lead Craft','EFIN');
INSERT INTO ACTV_CODE VALUES (6,'Lead Craft','MFIN');
INSERT INTO ACTV_CODE VALUES (7,'Lead Craft','IFIN');
INSERT INTO ACTV_CODE VALUES (8,'Outage Code','R16');
INSERT INTO ACTV_CODE VALUES (8,'Lead Craft','MFIN');
COMMIT;

Let’s determine the list TASK_IDs that you definitely do not want:

SELECT DISTINCT
  TASK_ID
FROM
  ACTV_CODE
WHERE
  CODE='Outage Code';

We can use that list to exclude specific rows from the query results like this:

SELECT
  T.TASK_ID,
  T.TASK_CODE
FROM
  TASK T,
  ACTV_CODE AC,
  (SELECT DISTINCT
    TASK_ID
  FROM
    ACTV_CODE
  WHERE
    CODE='Outage Code') ACN
WHERE
  T.TASK_ID=ACN.TASK_ID(+)
  AND ACN.TASK_ID IS NULL
  AND T.TASK_ID=AC.TASK_ID
  AND AC.SHORT_NAME LIKE '%FIN';

We basically created an outer join between the list of records and the list of records that we do not want (T.TASK_ID=ACN.TASK_ID(+)), and then specified that the record should not be in those records that we do not want (AND ACN.TASK_ID IS NULL).

   TASK_ID TASK_CODE
========== ==========
         7 W642121
         5 06123321

You may need to make minor adjustments to the above SQL statement.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Chris L. offered the following:

SELECT * FROM task WHERE task_id IN
(
  SELECT task_id FROM actv_code WHERE short_name LIKE '%FIN'
  AND code='Lead Craft'
  MINUS
  SELECT task_id FROM actv_code WHERE code='Outage Code'
);

Though I’d do something about that “short_name ends with FIN” filter,
and I’d try and generate a table with codes (Lead Craft, Outage Code,
etc) and use the ID’s not the descriptions.





SQL PLAN_HASH_VALUE Changes for the Same SQL Statement

1 12 2009

December 1, 2009

In this OTN thread:
http://forums.oracle.com/forums/thread.jspa?messageID=3697025

A user asked the following question:

How would you explain the following

select snap_id,sql_id,plan_hash_value from
 dba_hist_sqlstat where sql_id='cmrh5mk9854vs'

SNAP_ID SQL_ID        PLAN_HASH_VALUE
13487   cmrh5mk9854vs 3314988613
14471   cmrh5mk9854vs 3452699090

 Two different hash values for the same same sqls

I believe that your real question is why does that happen?  There are several reasons for the change in plan, here is a demonstration on Oracle 10.2.0.2:

CREATE TABLE T1 (
  C1 VARCHAR2(20),
  C2 VARCHAR2(255),
  PRIMARY KEY (C1));

INSERT INTO T1
SELECT
  TO_CHAR(ROWNUM),
  LPAD(CHR(65 + ROUND(ROWNUM/1000)),255,CHR(65 + ROUND(ROWNUM/1000)))
FROM
  DUAL
CONNECT BY
  LEVEL<=1000000;

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE)

The table created by the above contains 1,000,000 rows with an index on column C1. Column C1 is defined as a VARCHAR2, but I placed numbers into the column. Now for the first part of the test, where I request that Oracle display the execution plan for a SQL statement without actually executing the SQL statement:

SET AUTOTRACE TRACEONLY EXPLAIN

VARIABLE N1 NUMBER
VARIABLE N2 NUMBER

SELECT
  C1,
  C2
FROM
  T1
WHERE
  C1 BETWEEN :N1 AND :N2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3390613643
---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |  2500 |   639K|  1001   (1)| 00:00:05 |
|*  1 |  FILTER                      |              |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1           |  2500 |   639K|  1001   (1)| 00:00:05 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0020503 |  4500 |       |    20   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:N1<=:N2)
   3 - access("C1">=:N1 AND "C1"<=:N2)

VARIABLE N1 VARCHAR2(20)
VARIABLE N2 VARCHAR2(20)

EXEC :N1 := '1'
EXEC :N2 := '9999'

SELECT
  C1,
  C2
FROM
  T1
WHERE
  C1 BETWEEN :N1 AND :N2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3390613643

---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |  2500 |   639K|  1001   (1)| 00:00:05 |
|*  1 |  FILTER                      |              |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1           |  2500 |   639K|  1001   (1)| 00:00:05 |
|*  3 |    INDEX RANGE SCAN          | SYS_C0020503 |  4500 |       |    20   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:N1<=:N2)
   3 - access("C1">=:N1 AND "C1"<=:N2)

The first execution plan displayed is incorrect. I have a utility (my Toy Project for Performance Tuning) which retrieves all plans for a SQL_ID using DBMS_XPLAN and also checks V$SQL_SHARED_CURSOR and V$SQL_BIND_METADATA. This is the output:

SQL_ID  cguuvrry2hx1z, child number 0
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   C1 BETWEEN :N1 AND :N2

Plan hash value: 3332582666

------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |  1970 (100)|          |        |
|*  1 |  FILTER            |      |       |       |            |          |        |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 |   639K|  1970   (5)| 00:00:10 |    CHK |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:N1<=:N2)
   2 - filter((TO_NUMBER("C1")>=:N1 AND TO_NUMBER("C1")<=:N2))

Bind Descriptions:
  Pos  MLen  Data Type   Bind Name
    1    22  NUMBER      N1
    2    22  NUMBER      N2

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
BIND_MISMATCH - The bind metadata does not match the existing child cursor.

SQL_ID  cguuvrry2hx1z, child number 1
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   C1 BETWEEN :N1 AND :N2

Plan hash value: 3390613643

------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |       |       |  1001 (100)|          |        |
|*  1 |  FILTER                      |              |       |       |            |          |        |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1           |  2500 |   639K|  1001   (1)| 00:00:05 |    CHK |
|*  3 |    INDEX RANGE SCAN          | SYS_C0020503 |  4500 |       |    20   (0)| 00:00:01 |    CHK |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:N1<=:N2)
   3 - access("C1">=:N1 AND "C1"<=:N2)

Bind Descriptions:
  Pos  MLen  Data Type   Bind Name
    1    32  VARCHAR2    N1
    2    32  VARCHAR2    N2

Notice the Predicate Information and Bind Descriptions for the first execution plan. Child #1 was created due to “The bind metadata does not match the existing child cursor”. Now another test:

ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=1000;

SELECT
  C1,
  C2
FROM
  T1
WHERE
  C1 BETWEEN :N1 AND :N2;

Statistics
--------------------------------------------------------------------------------
1 recursive calls
0 db block gets
102765 consistent gets
36619 physical reads
0 redo size
22387973 bytes sent via SQL*Net to client
733583 bytes received via SQL*Net from client
66661 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
999890 rows processed

This is the output from my utility program which shows another child cursor:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OPTIMIZER_MISMATCH - The optimizer environment does not match the existing child cursor.

SQL_ID  cguuvrry2hx1z, child number 2
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   C1 BETWEEN :N1 AND :N2

Plan hash value: 3332582666

------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |  1964 (100)|          |        |
|*  1 |  FILTER            |      |       |       |            |          |        |
|*  2 |   TABLE ACCESS FULL| T1   |   999K|   249M|  1964   (5)| 00:00:10 |    CHK |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(:N1<=:N2)
   2 - filter(("C1"<=:N2 AND "C1">=:N1))

Bind Descriptions:
  Pos  MLen  Data Type   Bind Name
    1    32  VARCHAR2    N1
    2    32  VARCHAR2    N2

Notice the reason for child #2 being created “The optimizer environment does not match the existing child cursor”.

I then disconnected from the database instance and reconnected to reset the OPTIMIZER_INDEX_COST_ADJ parameter to the system default. One more test, gather statistics forcing current plans to be invalidated (there is usually a delay when statistics are gathered before plans accessing those objects are invalidated):

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE, NO_INVALIDATE=>FALSE)

SET AUTOTRACE TRACEONLY STATISTICS

VARIABLE N1 VARCHAR2(20)
VARIABLE N2 VARCHAR2(20)

EXEC :N1 := '1'
EXEC :N2 := '9999'

SELECT
  C1,
  C2
FROM
  T1
WHERE
  C1 BETWEEN :N1 AND :N2;

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     102765  consistent gets
      35424  physical reads
          0  redo size
   22387973  bytes sent via SQL*Net to client
     733583  bytes received via SQL*Net from client
      66661  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     999890  rows processed

My utility program now shows that there is a single plan, which was affected by bind peeking.

SQL_ID  cguuvrry2hx1z, child number 0
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   C1 BETWEEN :N1 AND :N2

Plan hash value: 3332582666

------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |  1964 (100)|          |        |
|*  1 |  FILTER            |      |       |       |            |          |        |
|*  2 |   TABLE ACCESS FULL| T1   |   999K|   249M|  1964   (5)| 00:00:10 |    CHK |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:N1<=:N2)
   2 - filter(("C1"<=:N2 AND "C1">=:N1))

Bind Descriptions:
  Pos  MLen  Data Type   Bind Name
    1    32  VARCHAR2    N1
    2    32  VARCHAR2    N2
  • I had originally planned to mention the Oracle 11g introduced adaptive cursor sharing, which is another source of different plans for the same SQL statement. So, in summary some of the sources of the plan changing:
    * Parsing a SQL statement with the bind variable values not defined
    * Parsing a SQL statement with the bind variables defined as the wrong type (or specified as a different type)
    * Statistics collection forcing the invalidation of cursors, which forces a new hard parse on the next execution (on 10g and above, statistics collection does not cause an immediate invalidation of cursors accessing objects for which statistics were collected)
    * Changes in the statistics, or even lack of change in the statistics with supplied values in the SQL statement appearing to be increasingly above or below the maximum and minimum values for the columns.
    * Bind peeking, where different bind variable values were specified on each hard parse
    * Different optimizer environment for the session executing the SQL statement
    * Adaptive cursor sharing in 11g




SQL PLAN_HASH_VALUE Change When Literals Change?

1 12 2009

In this OTN thread:
http://forums.oracle.com/forums/thread.jspa?messageID=3697025

A user asked the following question:

Is there any way to identify the sqls with same execution plan, for us the problem is, sql is submitted with parameter hardcoded in where clause, which changes the sqlid and plan hash value, as the plan hash value changes but the infact plan remains same, i want to identify such sqlid?

The PLAN_HASH_VALUE should remain the same if the plan is the same, even if hardcoded values are specified in the WHERE clause. For example, a table T1 with an ID column and a DESCRIPTION column with an index on the ID column – first, four SQL statements are executed with different values specified for ID in the WHERE clause:

SELECT /*+ TEST FIND ME */ 1 FROM T1 WHERE ID=1;
SELECT /*+ TEST FIND ME */ 2 FROM T1 WHERE ID=2;
SELECT /*+ TEST FIND ME */ 3 FROM T1 WHERE ID=3;
SELECT /*+ TEST FIND ME */ 4 FROM T1 WHERE ID=4;

 Now we search the V$SQL view for the SQL_IDs and PLAN_HASH_VALUEs:

SELECT
  SQL_ID,
  CHILD_NUMBER,
  PLAN_HASH_VALUE
FROM
  V$SQL
WHERE
  SQL_TEXT LIKE '% TEST FIND ME %';
 
SQL_ID        CHILD_NUMBER PLAN_HASH_VALUE
------------- ------------ ---------------
fbvgxdbddnush            0      1882569892
5f0415sr6d7qb            0      1882569892
0fpndy65tykxv            0      1882569892
5jvj4jyzyb9jt            0      1882569892
0kjp2dk2vbpra            0       903671040
 
SELECT
  SQL_TEXT
FROM
  V$SQL
WHERE
  SQL_ID='0kjp2dk2vbpra';
 
SQL_TEXT
--------------------------------------------------------------------------------
SELECT   SQL_ID,   CHILD_NUMBER,   PLAN_HASH_VALUE FROM   V$SQL WHERE   SQL_TEXT
 LIKE '% TEST FIND ME %'

The only SQL statement with a different plan hash value is the SQL statement which performed the query on V$SQL. Next, we retrieve the plans for the SQL statements:

SET LINESIZE 150
SET PAGESIZE 2000
SPOOL TEST_MY_PLANS.TXT
 
SELECT /*+ LEADING(S) */
  T.PLAN_TABLE_OUTPUT
FROM
  (SELECT
    SQL_ID,
    CHILD_NUMBER
  FROM
    V$SQL
  WHERE
    SQL_TEXT LIKE '% TEST FIND ME %') S,
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(S.SQL_ID,S.CHILD_NUMBER,'TYPICAL')) T;
 
SPOOL OFF

The spool file included the following:

SQL_ID  fbvgxdbddnush, child number 0
-------------------------------------
SELECT /*+ TEST FIND ME */ 3 FROM T1 WHERE ID=3
 
Plan hash value: 1882569892
 
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |       |       |    24 (100)|          |
|*  1 |  INDEX RANGE SCAN| IND_T1 | 10000 | 40000 |    24   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=3)
 
 
SQL_ID  5f0415sr6d7qb, child number 0
-------------------------------------
SELECT /*+ TEST FIND ME */ 2 FROM T1 WHERE ID=2
 
Plan hash value: 1882569892
 
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |       |       |    24 (100)|          |
|*  1 |  INDEX RANGE SCAN| IND_T1 | 10000 | 40000 |    24   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=2)
 
 
SQL_ID  0fpndy65tykxv, child number 0
-------------------------------------
SELECT /*+ TEST FIND ME */ 4 FROM T1 WHERE ID=4
 
Plan hash value: 1882569892
 
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |       |       |    24 (100)|          |
|*  1 |  INDEX RANGE SCAN| IND_T1 | 10000 | 40000 |    24   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=4)
 
 
SQL_ID  5jvj4jyzyb9jt, child number 0
-------------------------------------
SELECT /*+ TEST FIND ME */ 1 FROM T1 WHERE ID=1
 
Plan hash value: 1882569892
 
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |       |       |    24 (100)|          |
|*  1 |  INDEX RANGE SCAN| IND_T1 | 10000 | 40000 |    24   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=1)
 
 
SQL_ID  0kjp2dk2vbpra, child number 0
-------------------------------------
SELECT   SQL_ID,   CHILD_NUMBER,   PLAN_HASH_VALUE FROM   V$SQL WHERE
SQL_TEXT LIKE '% TEST FIND ME %'
 
Plan hash value: 903671040
 
---------------------------------------------------------------------------
| Id  | Operation        | Name              | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                   |       |       |     1 (100)|
|*  1 |  FIXED TABLE FULL| X$KGLCURSOR_CHILD |     1 |   549 |     0   (0)|
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("KGLNAOBJ" LIKE '% TEST FIND ME %' AND
              "INST_ID"=USERENV('INSTANCE')))

So, the answer is that the SQL PLAN_HASH_VALUE does not change when literals in the WHERE clause change, if the actual execution plan is the same.





SQL Date Ranges with Connect By

1 12 2009

A question appeared in one of the OTN forums:
http://forums.oracle.com/forums/thread.jspa?messageID=3891491

USER_NAME    FROM_DATE  TO_DATE    COMMENTS
--------------------------------------------------------------------------------
Uma Shankar   2-Nov-09   5-Nov-09  Comment1
Veera        11-Nov-09  13-Nov-09  Comment2
Uma Shankar  15-Dec-09  17-Dec-09  Commnet3
Vinod        20-Oct-09  21-Oct-09  Comments4

The above table is the user leave calendar.  Now I need to display the users who are on leave between 01-Nov-2009 to 30-Nov-2009.  The output should look like:

USER_NAME   FROM_DATE COMMENTS
--------------------------------------------------------------------------------
Uma Shankar  2-Nov-09 Comment1
Uma Shankar  3-Nov-09 Comment1
Uma Shankar  4-Nov-09 Comment1
Uma Shankar  5-Nov-09 Comment1
Veera       11-Nov-09 Comment2
Veera       12-Nov-09 Comment2
Veera       13-Nov-09 Comment2

How would one go about finding a solution for the above problem?  Possibly by breaking the problem into smaller steps.

Here is a test set up:

CREATE TABLE T1(
  USERNAME VARCHAR2(30),
  FROM_DATE DATE,
  TO_DATE DATE,
  COMMENTS VARCHAR2(100));

INSERT INTO T1 VALUES ('Uma Shankar', '02-Nov-09','05-Nov-09','Comment1');
INSERT INTO T1 VALUES ('Veera','11-Nov-09','13-Nov-09','Comment2');
INSERT INTO T1 VALUES ('Uma Shankar','15-Dec-09','17-Dec-09','Commnet3');
INSERT INTO T1 VALUES ('Vinod','20-Oct-09','21-Oct-09','Comments4');
INSERT INTO T1 VALUES ('Mo','20-Oct-09','05-NOV-09','Comments4');

COMMIT;

Note that I included one additional row, where the person starts their vacation in the previous month and ends in the month of November.  You could approach the problem like this:
Assume that you would like to list all of the days of a particular month:

SELECT
  TO_DATE('01-NOV-2009','DD-MON-YYYY')+(ROWNUM-1) MONTH_DAY
FROM
  DUAL
CONNECT BY
  LEVEL<=ADD_MONTHS(TO_DATE('01-NOV-2009','DD-MON-YYYY'),1)
           -TO_DATE('01-NOV-2009','DD-MON-YYYY');

Note that the above attempts to calculate the number of days in the month of November – if it is known that the month has a particular number of days, 30 for instance, you could rewrite the CONNECT BY clause like this:

CONNECT BY
  LEVEL<=30

 Now, we need to pick up those rows of interest from the table:

SELECT
  *
FROM
  T1 T
WHERE
  (T.FROM_DATE BETWEEN TO_DATE('01-NOV-2009','DD-MON-YYYY') AND TO_DATE('30-NOV-2009','DD-MON-YYYY')
    OR T.TO_DATE BETWEEN TO_DATE('01-NOV-2009','DD-MON-YYYY') AND TO_DATE('30-NOV-2009','DD-MON-YYYY'));

USERNAME        FROM_DATE TO_DATE   COMMENTS
--------------- --------- --------- ----------
Uma Shankar     02-NOV-09 05-NOV-09 Comment1
Veera           11-NOV-09 13-NOV-09 Comment2
Mo              20-OCT-09 05-NOV-09 Comments4

If we then join the two resultsets, we have the following query:

SELECT
  *
FROM
  T1 T,
  (SELECT
    TO_DATE('01-NOV-2009','DD-MON-YYYY')+(ROWNUM-1) MONTH_DAY
  FROM
    DUAL
  CONNECT BY
    LEVEL<=ADD_MONTHS(TO_DATE('01-NOV-2009','DD-MON-YYYY'),1)-TO_DATE('01-NOV-2009','DD-MON-YYYY')) V
WHERE
  (T.FROM_DATE BETWEEN TO_DATE('01-NOV-2009','DD-MON-YYYY') AND TO_DATE('30-NOV-2009','DD-MON-YYYY')
    OR T.TO_DATE BETWEEN TO_DATE('01-NOV-2009','DD-MON-YYYY') AND TO_DATE('30-NOV-2009','DD-MON-YYYY'))
  AND V.MONTH_DAY BETWEEN T.FROM_DATE AND T.TO_DATE
ORDER BY
  USERNAME,
  MONTH_DAY;

USERNAME        FROM_DATE TO_DATE   COMMENTS   MONTH_DAY
--------------- --------- --------- ---------- ---------
Mo              20-OCT-09 05-NOV-09 Comments4  01-NOV-09
Mo              20-OCT-09 05-NOV-09 Comments4  02-NOV-09
Mo              20-OCT-09 05-NOV-09 Comments4  03-NOV-09
Mo              20-OCT-09 05-NOV-09 Comments4  04-NOV-09
Mo              20-OCT-09 05-NOV-09 Comments4  05-NOV-09
Uma Shankar     02-NOV-09 05-NOV-09 Comment1   02-NOV-09
Uma Shankar     02-NOV-09 05-NOV-09 Comment1   03-NOV-09
Uma Shankar     02-NOV-09 05-NOV-09 Comment1   04-NOV-09
Uma Shankar     02-NOV-09 05-NOV-09 Comment1   05-NOV-09
Veera           11-NOV-09 13-NOV-09 Comment2   11-NOV-09
Veera           11-NOV-09 13-NOV-09 Comment2   12-NOV-09
Veera           11-NOV-09 13-NOV-09 Comment2   13-NOV-09




What Happens when 1=2 is Included in a WHERE Clause?

1 12 2009

The following appeared in a Usenet post:

Surprisingly, the optimizer sometimes seems to do a bad job about such constant clauses. As in “where 1 = 2″ resulting in a full table scan…

The above statement is actually correct (OK, partially).  However, it might be a good idea to check the DBMS_XPLAN output and/or the output of a 10046 trace and/or the output of SET AUTOTRACE TRACEONLY STATISTICS.  For example, here is a test on Oracle 11.1.0.7 with a 100,000,000 row table with a primary key column in a freshly bounced database:

SET AUTOTRACE TRACEONLY EXPLAIN
SELECT 
  * 
FROM 
  T1 
WHERE 
  1=2;

Execution Plan 
---------------------------------------------------------- 
Plan hash value: 3332582666 
 
--------------------------------------------------------------------------- 
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT   |      |     1 |    53 |       0 (0)|          | 
|*  1 |  FILTER            |      |       |       |            |          | 
|   2 |   TABLE ACCESS FULL| T1   |   100M|  5054M|   221K  (1)| 00:44:14 | 
--------------------------------------------------------------------------- 
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
   1 - filter(NULL IS NOT NULL)

The plan indicates a full table scan, and the filter predicate indicates?
Continuing:

SET AUTOTRACE TRACEONLY STATISTICS 
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'My_Trace'; 
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8'; 
 
SELECT 
  * 
FROM 
  T1 
WHERE 
  1=2; 
 
Statistics 
---------------------------------------------------------- 
   1  recursive calls 
   0  db block gets 
   0  consistent gets 
   0  physical reads 
   0  redo size 
 301  bytes sent via SQL*Net to client 
 349  bytes received via SQL*Net from client 
   1  SQL*Net roundtrips to/from client 
   0  sorts (memory) 
   0  sorts (disk) 
   0  rows processed 
 
 
SELECT SYSDATE FROM DUAL;

A full table scan on a 100,000,000 row table which performed 0 consistent gets, interesting….

Now, a check of the 10046 extended SQL trace file:

===================== 
PARSING IN CURSOR #7 len=32 dep=0 uid=56 oct=3 lid=56 tim=327318181 hv=2373026659 ad='2775adb30' sqlid='077d50q6r30v3' 
SELECT 
  * 
FROM 
  T1 
WHERE 
  1=2 
END OF STMT 
PARSE #7:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=3332582666,tim=327318181 
EXEC #7:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3332582666,tim=327318181 
WAIT #7: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=527 tim=327322648 
FETCH #7:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3332582666,tim=327318181 
STAT #7 id=1 cnt=0 pid=0 pos=1 obj=0 op='FILTER  (cr=0 pr=0 pw=0 time=0 us)' 
STAT #7 id=2 cnt=0 pid=1 pos=1 obj=68961 op='TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 time=0 us cost=221088 size=5300000000 card=100000000)' 
WAIT #7: nam='SQL*Net message from client' ela= 11075 driver id=1413697536 #bytes=1 p3=0 obj#=527 tim=327343489

The STAT lines in the 10046 trace also confirm that Oracle did not even bother to execute the full table scan.
Just for confirmation that rows do exist in the table:

SELECT 
  COUNT(*) 
FROM 
  T1; 
 
Statistics 
---------------------------------------------------------- 
     1  recursive calls 
     0  db block gets 
813324  consistent gets 
813317  physical reads 
     0  redo size 
   342  bytes sent via SQL*Net to client 
   360  bytes received via SQL*Net from client 
     2  SQL*Net roundtrips to/from client 
     0  sorts (memory) 
     0  sorts (disk) 
     1  rows processed 




10046 Extended SQL Trace Interpretation

1 12 2009

December 1, 2009

(Forward to the Next Post in the Series)

10046 extended SQL trace files provide a significant level of detail specific to a single session.  Extended SQL traces are enabled by setting event 10046 to level 4 (bind variables), level 8 (wait events), or level 12 (bind variables and wait events).  There are several methods that may be used to set event 10046, partially dependent on the Oracle release version, and whether or not the event may be set interactively within the session:

  ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
-or-
  EXEC SYS.DBMS_SYSTEM.SET_EV(SID,SERIAL#,10046,12,'');
-or-
  EXEC SYS.DBMS_SUPPORT.START_TRACE_IN_SESSION(SID,SERIAL#,WAITS=>TRUE,BINDS=>FALSE);
-or-
  EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(SESSION_ID=>SID,SERIAL_NUM=>SERIAL#,WAITS=>TRUE,BINDS=>TRUE)
-or-
  EXEC SYS.DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION(SID,SERIAL#,TRUE);

The SID and SERIAL# may be retrieved from V$SESSION for the session of interest.  The SYS.DBMS_SYSTEM.SET_EV method is unsupported, but it works fine on older versions of Oracle where the other methods do not exist.  Not all of the above methods work in older Oracle releases.  Oracle will create the trace files in the server’s udump directory (or the trace directory with release 11.1.0.6 and later).  A logon trigger may be used to enable a 10046 extended SQL trace, possibly enabling a trace for all sessions that start the same program, or for a specific user.  10046 traces may also be enabled with the SQL*Plus ORADEBUG functionality.

Before executing any of the above commands (especially if enabled using an ALTER SYSTEM method), determine how to disable a 10046 trace, depending on how it was started:

  ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
-or-
  EXEC SYS.DBMS_SYSTEM.SET_EV(SID,SERIAL#,10046,0,'');
-or-
  EXEC SYS.DBMS_SUPPORT.STOP_TRACE_IN_SESSION(SID,SERIAL#);
-or-
  EXEC SYS.DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION(SID,SERIAL#,FALSE);

A short summary of several keywords found in a 10046 extended SQL trace file:
len= the number of characters in the SQL statement
dep= tells the application/trigger depth at which the SQL statement was executed. dep=0 indicates that it was executed by the client application. dep=1 indicates that the SQL statement was executed by a trigger, the Oracle optimizer, or a space management call. dep=2 indicates that the SQL statement was called from a trigger, dep=3 indicates that the SQL statement was called from a trigger that was called from a trigger.
tim= is a time stamp measured in 1/1,000,000 of a second that may be used for tracking the progress in the 10046 trace file – take the delta value of the most recent tim=, subtract the first tim=, divide the result by 1,000,000 – equals the number of seconds into the trace file.
c= CPU seconds – amount of CPU resources required at that point in the execution. On 8.1.7.4 and earlier, divide by 100 to obtain the number of seconds. After 8.1.7.4, divide by 1,000,000 to obtain the number of seconds.
e= Elapsed seconds – number of seconds as measured by a high precision clock at that point in the execution. This may be significantly different from CPU seconds. On 8.1.7.4 and earlier, divide by 100 to obtain the number of seconds. After 8.1.7.4, divide by 1,000,000 to obtain the number of seconds.
p= Physical blocks read from disk.
cr= Consistent reads
cu= Consistent reads in current mode
mis= Number of shared pool misses at that stage of execution.
r= Number of rows returned by the call
og= Optimizer goal – 1=ALL_ROWS, 2=FIRST_ROWS (and FIRST_ROWS_n), 3=RULE, 4=CHOOSE
XCTEND rlbk=0 indicates that the client issued a COMMIT, rd_only=1 indicates that no information changed in the database as a result of the COMMIT
ela= is the elapsed time in 1/100 of a second on 8.1.7.4 and earlier, and in 1/1,000,000 of a second on versions after 8.1.7.4.
p1= meaning depends on the wait event
p2= meaning depends on the wait event
p3= meaning depends on the wait event

STAT lines – the row source execution plan:
id= line identifier
cnt= number of rows returned or processed
pid= parent operation identifier – used to determine indentation of the plan
obj= object id of the object referenced at that stage of the plan

Putting the pieces together, it is possible to do the following analysis of a raw 10046 extended SQL trace file:

PARSING IN CURSOR #12 len=170 dep=0 uid=31 oct=3 lid=31 tim=768609360 hv=205969412 ad='a64cba88'
SELECT C1 , SUBSTR(C2,1,20) C2  FROM T7 WHERE   
  EXISTS ( 
    SELECT 
      1 
    FROM 
      T8 
    WHERE 
      T8.C1=T7.C1)
  AND T7.C1 BETWEEN :1 AND :2
END OF STMT
PARSE #12:c=0,e=29,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=768609356
BINDS #12:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000010 frm=00 csi=178 siz=48 off=0
  kxsbbbfp=151423d0  bln=22  avl=02  flg=05
  value=1
 Bind#1
  oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000010 frm=00 csi=178 siz=0 off=24
  kxsbbbfp=151423e8  bln=22  avl=03  flg=01
  value=81000
EXEC #12:c=0,e=807,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=768611110
WAIT #12: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=114176 tim=768611226
WAIT #12: nam='db file sequential read' ela= 184 file#=4 block#=1779171 blocks=1 obj#=114178 tim=768611561
WAIT #12: nam='db file scattered read' ela= 213 file#=4 block#=1779172 blocks=5 obj#=114178 tim=768612046
WAIT #12: nam='db file sequential read' ela= 103 file#=4 block#=1138380 blocks=1 obj#=114177 tim=768612399
WAIT #12: nam='db file sequential read' ela= 87 file#=4 block#=1754469 blocks=1 obj#=114177 tim=768612648
WAIT #12: nam='db file sequential read' ela= 88 file#=4 block#=1138382 blocks=1 obj#=114177 tim=768612865
WAIT #12: nam='db file sequential read' ela= 81 file#=4 block#=1138374 blocks=1 obj#=114176 tim=768613103
FETCH #12:c=0,e=2195,p=10,cr=51,cu=0,mis=0,r=14,dep=0,og=1,tim=768613513
WAIT #12: nam='SQL*Net message from client' ela= 3432 driver id=1413697536 #bytes=1 p3=0 obj#=114176 tim=768617150
STAT #12 id=1 cnt=14 pid=0 pos=1 obj=0 op='FILTER  (cr=51 pr=10 pw=0 time=2170 us)'
STAT #12 id=2 cnt=14 pid=1 pos=1 obj=0 op='NESTED LOOPS  (cr=51 pr=10 pw=0 time=2137 us)'
STAT #12 id=3 cnt=14 pid=2 pos=1 obj=0 op='SORT UNIQUE (cr=7 pr=6 pw=0 time=984 us)'
STAT #12 id=4 cnt=14 pid=3 pos=1 obj=114178 op='TABLE ACCESS FULL T8 (cr=7 pr=6 pw=0 time=900 us)'
STAT #12 id=5 cnt=14 pid=2 pos=2 obj=114176 op='TABLE ACCESS BY INDEX ROWID T7 (cr=44 pr=4 pw=0 time=1131 us)'
STAT #12 id=6 cnt=14 pid=5 pos=1 obj=114177 op='INDEX UNIQUE SCAN SYS_C0020535 (cr=30 pr=3 pw=0 time=820 us)'

A simple walk through of the trace file, mapping the trace file to a timeline (this is output from my Toy Project for Performance Tuning):

Cursor 12   Ver 1   Parse at 0.000000
SELECT C1 , SUBSTR(C2,1,20) C2  FROM T7 WHERE   
  EXISTS ( 
    SELECT 
      1 
    FROM 
      T8 
    WHERE 
      T8.C1=T7.C1)
  AND T7.C1 BETWEEN :1 AND :2

Cursor 12   Ver 1   Parse at -0.000004 (Parse to Parse -0.000004),CPU Time 0.000000,
            Elapsed Time 0.000029,Rows Affected 0,Blks from Buff 0,Blks from Disk 0,Goal=ALL_ROWS
Bind Variables:BINDS #12:  -0.000004
   Bind#0
    oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
    oacflg=00 fl2=1000010 frm=00 csi=178 siz=48 off=0
    kxsbbbfp=151423d0  bln=22  avl=02  flg=05
    value=1
   Bind#1
    oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
    oacflg=00 fl2=1000010 frm=00 csi=178 siz=0 off=24
    kxsbbbfp=151423e8  bln=22  avl=03  flg=01
    value=81000

Cursor 12   Ver 1 Execute at 0.001750 (Parse to Exec  0.001750),CPU Time 0.000000,
            Elapsed Time 0.000807,Rows Affected 0,Blks from Buff 0,Blks from Disk 0,
Cursor 12   Ver 1            0.001866  0.000004   SQL*Net message to client
Cursor 12   Ver 1            0.002201  0.000184   db file sequential read
Cursor 12   Ver 1            0.002686  0.000213   db file scattered read
Cursor 12   Ver 1            0.003039  0.000103   db file sequential read
Cursor 12   Ver 1            0.003288  0.000087   db file sequential read
Cursor 12   Ver 1            0.003505  0.000088   db file sequential read
Cursor 12   Ver 1            0.003743  0.000081   db file sequential read
Cursor 12   Ver 1   Fetch at 0.004153 (Parse to Fetch 0.004153),CPU Time 0.000000,
            Elapsed Time 0.002195,Rows Retrievd 14,Blks from Buff 51,Blks from Disk 10
Cursor 12   Ver 1            0.007790  0.003432   SQL*Net message from client

Sometimes mapping the trace file to a timeline is not the best approach, but understanding the role of a specific SQL statement is more critical to understanding the cause of a performance problem (this is output from my Toy Project for Performance Tuning):

Similar SQL Statements in Group: 1
|PARSEs 1|CPU S 0.000000|CLOCK S 0.000029|ROWs  0|PHY RD BLKs  0|CON RD BLKs (Mem)  0|CUR RD BLKs (Mem) 0|SHARED POOL MISs 0|
|EXECs  1|CPU S 0.000000|CLOCK S 0.000807|ROWs  0|PHY RD BLKs  0|CON RD BLKs (Mem)  0|CUR RD BLKs (Mem) 0|SHARED POOL MISs 0|
|FETCHs 1|CPU S 0.000000|CLOCK S 0.002195|ROWs 14|PHY RD BLKs 10|CON RD BLKs (Mem) 51|CUR RD BLKs (Mem) 0|SHARED POOL MISs 0|

Cursor 12   Ver 1   Parse at 0.000000  Similar Cnt 1
|PARSEs 1|CPU S 0.000000|CLOCK S 0.000029|ROWs  0|PHY RD BLKs  0|CON RD BLKs (Mem)  0|CUR RD BLKs (Mem) 0|SHARED POOL MISs 0|
|EXECs  1|CPU S 0.000000|CLOCK S 0.000807|ROWs  0|PHY RD BLKs  0|CON RD BLKs (Mem)  0|CUR RD BLKs (Mem) 0|SHARED POOL MISs 0|
|FETCHs 1|CPU S 0.000000|CLOCK S 0.002195|ROWs 14|PHY RD BLKs 10|CON RD BLKs (Mem) 51|CUR RD BLKs (Mem) 0|SHARED POOL MISs 0|
  CLOCK S 98.67%
|++++++++++++++++++++|

SELECT C1 , SUBSTR(C2,1,20) C2  FROM T7 WHERE   
  EXISTS ( 
    SELECT 
      1 
    FROM 
      T8 
    WHERE 
      T8.C1=T7.C1)
  AND T7.C1 BETWEEN :1 AND :2

      (Rows 14)   FILTER  (cr=51 pr=10 pw=0 time=2170 us)
      (Rows 14)    NESTED LOOPS  (cr=51 pr=10 pw=0 time=2137 us)
      (Rows 14)     SORT UNIQUE (cr=7 pr=6 pw=0 time=984 us)
      (Rows 14)      TABLE ACCESS FULL T8 (cr=7 pr=6 pw=0 time=900 us)
      (Rows 14)     TABLE ACCESS BY INDEX ROWID T7 (cr=44 pr=4 pw=0 time=1131 us)
      (Rows 14)      INDEX UNIQUE SCAN SYS_C0020535 (cr=30 pr=3 pw=0 time=820 us)

Bind Variables:BINDS #12:  -0.000004
   Bind#0
    oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
    oacflg=00 fl2=1000010 frm=00 csi=178 siz=48 off=0
    kxsbbbfp=151423d0  bln=22  avl=02  flg=05
    value=1
   Bind#1
    oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
    oacflg=00 fl2=1000010 frm=00 csi=178 siz=0 off=24
    kxsbbbfp=151423e8  bln=22  avl=03  flg=01
    value=81000

Of course, if the STATISTICS_LEVEL parameter is set to ALL, or a GATHER_PLAN_STATISTICS hint is provided, the following will retrieve the actual execution plan for the test query, if executed immediately after the test query:

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'))

The output of the above for this SQL statement follows:

SQL_ID  bkhdgq864dq04, child number 0
-------------------------------------
SELECT C1 , SUBSTR(C2,1,20) C2  FROM T7 WHERE       EXISTS (       SELECT         1       FROM         T8       WHERE        
T8.C1=T7.C1)    AND T7.C1 BETWEEN :1 AND :2

Plan hash value: 1996673128

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------
|*  1 |  FILTER                       |              |      1 |        |     14 |00:00:00.01 |      51 |     10 |       |       |          |
|   2 |   NESTED LOOPS                |              |      1 |      1 |     14 |00:00:00.01 |      51 |     10 |       |       |          |
|   3 |    SORT UNIQUE                |              |      1 |      1 |     14 |00:00:00.01 |       7 |      6 |  9216 |  9216 | 8192  (0)|
|*  4 |     TABLE ACCESS FULL         | T8           |      1 |      1 |     14 |00:00:00.01 |       7 |      6 |       |       |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| T7           |     14 |      1 |     14 |00:00:00.01 |      44 |      4 |       |       |          |
|*  6 |     INDEX UNIQUE SCAN         | SYS_C0020535 |     14 |      1 |     14 |00:00:00.01 |      30 |      3 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:1<=:2)
   4 - filter(("T8"."C1">=:1 AND "T8"."C1"<=:2))
   6 - access("T8"."C1"="T7"."C1")
       filter(("T7"."C1"<=:2 AND "T7"."C1">=:1))

The raw 10046 Extended SQL trace offers several advantages over the DBMS_XPLAN output if the P1, P2, and P3 values from the wait events are examined closely:

nam='db file scattered read' ela= 213 file#=4 block#=1779172 blocks=5 obj#=114178 tim=768612046

In the above, a multi-block read from disk requiring 0.000213 seconds of absolute file number 4, starting at block 1779172, and extending for 5 blocks was performed.  The blocks belong to object ID 114178 and happened at the trace file’s sequential time of 768612046, which may be referenced to the sequential time of other lines in the trace file.

Much more information about 10046 extended SQL traces is provided in the the chapters I co-wrote for the “Expert Oracle Practices: Oracle Database Administration from the Oak Table” book.








Follow

Get every new post delivered to your Inbox.

Join 139 other followers