Finding the Next Primary Key Value, a Pre-fixed Solution


January 20, 2012

A request for assistance came in from an ERP mailing list.  The original poster (OP) is running an unspecified version of Oracle Database 9i, and is in need of a solution to generate new part numbers with prefixed characters that describe the type of part, followed by a sequential number that is unique to the prefix.  The prefixes might be PAINT, BAR, BEARING, DRILL, etc.  Sample part numbers might include BAR0599, PAINT012, BEARING012345, etc.

When I saw the request, my first thought was to create a sequence for each prefix, similar to the following:

CREATE SEQUENCE PART_PAINT_ID START WITH 13 NOCACHE;
CREATE SEQUENCE PART_BAR_ID START WITH 600 NOCACHE;
CREATE SEQUENCE PART_BEARING_ID START WITH 12346 NOCACHE;
CREATE SEQUENCE PART_DRILL_ID START WITH 999 NOCACHE;
...

Once the above sequences are created, we could then find the next part number with a SQL statement similar to the following:

SELECT
  'PAINT'||PART_PAINT_ID.NEXTVAL NEXT_PART_ID
FROM
  DUAL;

NEXT_PART_ID
------------
PAINT13 

I suspected that there was a catch – for some reason the sample part numbers included a 0 before the sequence number, and I assumed that there could be a variable number of 0 digits before that sequence number for the different prefixes.  To fix the above, we might try working with the LPAD function to add leading zeros to the sequence number:

SELECT
  'PAINT'||LPAD(TO_CHAR(PART_PAINT_ID.NEXTVAL), 3, '0') NEXT_PART_ID
FROM
  DUAL;

NEXT_PAR
--------
PAINT014 

Need more leading zeros?  No problem, just adjust the number in the LPAD function:

SELECT
  'PAINT'||LPAD(TO_CHAR(PART_PAINT_ID.NEXTVAL), 6, '0') NEXT_PART_ID
FROM
  DUAL;

NEXT_PART_I
-----------
PAINT000015 

Need fewer zeros?

SELECT
  'PAINT'||LPAD(TO_CHAR(PART_PAINT_ID.NEXTVAL), 1, '0') NEXT_PART_ID
FROM
  DUAL;

NEXT_P
------
PAINT1 

The above shows the first of several potential problems with this approach: when the requested length is shorter than the number, LPAD silently truncates the value, so the sequence value 16 was output as PAINT1.  What else could go wrong?  What if a smart person decides that he does not need this “crutch” solution and creates 30 part numbers using the method of best guess or cheat sheet in the side drawer?  Because these are primary key values, the smart person might cause a number of problems that might not be detected for some time… until the Oracle sequence reaches one of the unexpected sequence numbers that were already used.
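To make the LPAD behavior easier to see, here is a minimal Python sketch.  The lpad helper is a hypothetical stand-in written for illustration, not an Oracle API; it assumes single-byte characters and the documented LPAD behavior of returning only the leftmost characters when the value is longer than the requested length.

```python
def lpad(text: str, width: int, pad: str = "0") -> str:
    """Mimic Oracle's LPAD for single-byte text (illustrative helper only)."""
    if len(text) >= width:
        # The value is already too long: keep only the leftmost characters
        return text[:width]
    # Repeat the pad string and cut it to the number of missing characters
    missing = width - len(text)
    return (pad * missing)[:missing] + text


# Same sequence values and lengths as the SQL statements above
for value, width in [(14, 3), (15, 6), (16, 1)]:
    print("PAINT" + lpad(str(value), width))
```

The last call drops the 6 from 16, which is the same loss of information that produced PAINT1 above.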

In addition to the suggestion of Oracle sequences, I offered a couple of other suggestions.  The ERP system offers a macro language that is a lot like VBScript.  The OP could create a VBScript macro that builds an HTML web page in real time, or possibly pull the next sequence number from an ASP (or similar) web page.  As an example of an HTML web page built in real time, this is an example that I created roughly three years ago:

Dim objIE
Dim objShell
Dim strHTML
Dim intFlag

On Error Resume Next

Set objShell = CreateObject("WScript.Shell")
strHTML = strHTML & "<form name=""Visual"">" & vbCrLf
strHTML = strHTML & "<input type=hidden id=txtOK value="" "">" & vbCrLf

strHTML = strHTML & "<table>" & vbCrLf
strHTML = strHTML & "<tr><td>Component<td><select size=""1"" id=""cboComponent"" name=""cboComponent"" style=""width:150"">" & vbCrLf
strHTML = strHTML & "<option value=""ACTIVATOR"">ACTIVATOR</option>" & vbCrLf
strHTML = strHTML & "<option value=""ACCELERATOR"">ACCELERATOR</option>" & vbCrLf
strHTML = strHTML & "<option value=""CATALYST"">CATALYST</option>" & vbCrLf
strHTML = strHTML & "<option value=""EPOXY PRIMER"">EPOXY PRIMER</option>" & vbCrLf
strHTML = strHTML & "<option value=""PRIMER"">PRIMER</option>" & vbCrLf
strHTML = strHTML & "<option value=""REDUCER"">REDUCER</option>" & vbCrLf
strHTML = strHTML & "<option value=""TOP COAT"">TOP COAT</option>" & vbCrLf
strHTML = strHTML & "</select></td></tr>" & vbCrLf

strHTML = strHTML & "<tr><td>Vendor<td><select size=""1"" id=""cboVendor"" name=""cboVendor"" style=""width:150"">" & vbCrLf
strHTML = strHTML & "<option value=""DUPONT"">DUPONT</option>" & vbCrLf
strHTML = strHTML & "<option value=""LILLY"">LILLY</option>" & vbCrLf
strHTML = strHTML & "<option value=""NILES CHEMICAL"">NILES CHEMICAL</option>" & vbCrLf
strHTML = strHTML & "<option value=""MANITOWOC"">MANITOWOC</option>" & vbCrLf
strHTML = strHTML & "<option value=""MAUTZ"">MAUTZ</option>" & vbCrLf
strHTML = strHTML & "<option value=""PAINTS AND SOLVENTS"">PAINTS AND SOLVENTS</option>" & vbCrLf
strHTML = strHTML & "<option value=""SHEBOYGAN"">SHEBOYGAN</option>" & vbCrLf
strHTML = strHTML & "</select></td></tr>" & vbCrLf

strHTML = strHTML & "<tr><td>Type<td><select size=""1"" id=""cboType"" name=""cboType"" style=""width:150"">" & vbCrLf
strHTML = strHTML & "<option value=""FLAT"">FLAT</option>" & vbCrLf
strHTML = strHTML & "<option value=""GLOSS"">GLOSS</option>" & vbCrLf
strHTML = strHTML & "<option value=""MED. GLOSS"">MED. GLOSS</option>" & vbCrLf
strHTML = strHTML & "</select></td></tr>" & vbCrLf

strHTML = strHTML & "<tr><td>Color<td><select size=""1"" id=""cboColor"" name=""cboColor"" style=""width:150"">" & vbCrLf
strHTML = strHTML & "<option value=""RED"">RED</option>" & vbCrLf
strHTML = strHTML & "<option value=""YELLOW"">YELLOW</option>" & vbCrLf
strHTML = strHTML & "<option value=""GREEN"">GREEN</option>" & vbCrLf
strHTML = strHTML & "<option value=""BLUE"">BLUE</option>" & vbCrLf
strHTML = strHTML & "<option value=""BLACK"">BLACK</option>" & vbCrLf
strHTML = strHTML & "<option value=""WHITE"">WHITE</option>" & vbCrLf
strHTML = strHTML & "<option value=""GRAY"">GRAY</option>" & vbCrLf
strHTML = strHTML & "</select></td></tr>" & vbCrLf
strHTML = strHTML & "</table>" & vbCrLf

strHTML = strHTML & "<p><center><input type=button value=""OK"" id=cmdOK onclick=""document.getElementById('txtOK').value='OK';""></center>" & vbCrLf
strHTML = strHTML & "</form>" & vbCrLf

Set objIE = CreateObject("InternetExplorer.Application")
objIE.Navigate "about:blank"
objIE.Document.Body.InnerHTML = strHTML
objIE.Document.Title="Get Part Info"
objIE.Left = 0
objIE.Top = 0
objIE.Width = 400
objIE.Height = 400
objIE.Statusbar = False
objIE.Menubar = False
objIE.Toolbar = False

objIE.Visible = True

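'Wait for the page to finish loading.  WScript.Shell does not expose a Sleep
'method, so the call below raises an error that On Error Resume Next hides.
Do While objIE.Busy <> False
    objShell.Sleep 200
Loop

'Discard any suppressed error so that the loop below does not exit early
Err.Clear
intFlag = 0

'Loop until the OK button is clicked or the window is closed
Do While intFlag = 0
    If Err <> 0 Then
        intFlag = -1
    End If
    If objIE Is Nothing Then
        'User closed IE
        intFlag = -1
    Else
        If objIE.Document.All.txtOK.Value <> " " Then
            intFlag = 1
        End If
    End If
    'objShell.Sleep 250 'Throws an error because WScript.Shell has no Sleep method
Loop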

If intFlag = 1 Then
    'Copy in the values from the web page
    USER_1 = objIE.Document.Body.All.cboComponent.Value
    USER_2 = objIE.Document.Body.All.cboVendor.Value
    USER_3 = objIE.Document.Body.All.cboType.Value
    USER_4 = objIE.Document.Body.All.cboColor.Value

    objIE.Quit
End If

Set objIE = Nothing
Set objShell = Nothing
 

The result of the above VBScript is an interactive window titled “Get Part Info” that offers Component, Vendor, Type, and Color drop-down lists and an OK button.

Another option that I suggested to the OP is to use Excel to keep track of the last sequence number for each prefix – and use an Excel dialog displayed from a VBScript macro.  Roughly three years ago I created a sample macro with the following code:

Dim objExcel
Dim objForm
Dim objShell

On Error Resume Next

Set objExcel = CreateObject("Excel.Application")

'with the help of custom program, set a 1 second delay, then force the window to the top
Set objShell = CreateObject("WScript.Shell")
objShell.Run("C:\BringToTop.exe " & Chr(34) & "Paint Naming" & Chr(34) & " 2")
Set objShell = Nothing

With objExcel
    .Workbooks.Open "C:\ExcelMacroDialog.xls"

    If .Sheets("CalculateArea").Cells(1, 1).Value <> "" Then
        ID = .Sheets("CalculateArea").Cells(1, 1).Value
        DESCRIPTION = .Sheets("CalculateArea").Cells(2, 1).Value
        PRODUCT_CODE = .Sheets("CalculateArea").Cells(3, 1).Value
        COMMODITY_CODE = .Sheets("CalculateArea").Cells(4, 1).Value
        USER_5 = .Sheets("CalculateArea").Cells(5, 1).Value
        PURCHASED = True
        FABRICATED = False
    End If
End With

objExcel.DisplayAlerts = False
objExcel.ActiveWorkbook.Saved = True
objExcel.Quit
Set objExcel = Nothing  

The above macro is quite short, because most of the work is performed in the Excel spreadsheet.  The resulting Excel dialog appeared similar to this:

My first suggestion to the OP, not knowing the full scope of the problem, was to try coding a VBScript macro similar to the following:

Dim strPartID
Dim strPartIDNew
Dim strNumberOld
Dim strNumberNew
Dim i

strPartID = PART_ID
strNumberOld = ""

'strPartID = "PAINT0599"  'Remove this line after testing
strPartID = "PAINT0089"  'Remove this line after testing

For i = Len(strPartID) to 1 Step -1
  If (Mid(strPartID, i, 1) >= "0") And (Mid(strPartID, i, 1) <= "9") Then
    'Still inside the trailing number, keep collecting digits
    strNumberOld = Mid(strPartID, i, 1) & strNumberOld
  Else
    'Reached the last character of the prefix
    If strNumberOld <> "" Then
      'CLng avoids the overflow that CInt raises above 32,767
      strNumberNew = CStr(CLng(strNumberOld) + 1)

      strPartIDNew = Left(strPartID, i)
      If Len(strNumberOld) > Len(strNumberNew) Then
        'Add Padding 0s
        strPartIDNew = strPartIDNew & String((Len(strNumberOld) - Len(strNumberNew)), "0")
      End If

      strPartIDNew = strPartIDNew & strNumberNew
    End If

    Exit For
  End If
Next

If strPartIDNew <> "" Then
  Msgbox "The New Part ID is " & strPartIDNew
Else
  Msgbox "Not a Valid Starting Point: " & strPartID
End If

The intention of the above macro is to locate the number 89 in the supplied strPartID variable, recognize that a 4 digit serial number is expected, and output:

The New Part ID is PAINT0090 
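For readers who do not work in VBScript, the same idea can be sketched in Python.  This is only an illustration of the logic described above, not the OP's or the ERP system's code; the function name next_part_id is made up for the example, and rejecting an ID that lacks either a prefix or a trailing number is an assumption about the desired behavior.

```python
def next_part_id(part_id: str) -> str:
    """Increment the trailing digits of part_id, keeping at least the original width."""
    # Walk backward to find where the trailing run of digits starts
    split = len(part_id)
    while split > 0 and part_id[split - 1] in "0123456789":
        split -= 1
    prefix, digits = part_id[:split], part_id[split:]
    if not prefix or not digits:
        raise ValueError(f"Not a valid starting point: {part_id!r}")
    # zfill restores the leading zeros; the number may grow wider on rollover
    return prefix + str(int(digits) + 1).zfill(len(digits))


print(next_part_id("PAINT0089"))
print(next_part_id("PAINT0599"))
```

Like the macro, the sketch only knows about the one part number it is handed, so it cannot tell whether a higher number already exists.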

Nice, but that is not what the OP needs.  The highest currently sequenced number will not be provided – that value must be looked up in the database.  So close…

Let’s try a different approach, starting by creating a sample table with three sequences of part numbers with different prefixes:

CREATE TABLE T1 (
  ID VARCHAR2(30),
  DESCRIPTION VARCHAR2(40),
  PRIMARY KEY(ID));

INSERT INTO
  T1
SELECT
  'PAINT'||LPAD(TO_CHAR(ROWNUM), 6, '0') ID,
  'TESTING ' || TO_CHAR(ROWNUM) DESCRIPTION
FROM
  DUAL
CONNECT BY
  LEVEL<=35000;

INSERT INTO
  T1
SELECT
  'BAR'||LPAD(TO_CHAR(ROWNUM), 6, '0') ID,
  'TESTING ' || TO_CHAR(ROWNUM) DESCRIPTION
FROM
  DUAL
CONNECT BY
  LEVEL<=45000;

INSERT INTO
  T1
SELECT
  'BEARING'||LPAD(TO_CHAR(ROWNUM), 6, '0') ID,
  'TESTING ' || TO_CHAR(ROWNUM) DESCRIPTION
FROM
  DUAL
CONNECT BY
  LEVEL<=888;

COMMIT; 

Let’s find the next sequence number for the BAR prefix:

SELECT
  MAX(TO_NUMBER(SUBSTR(ID,4))) + 1 NN
FROM
  T1
WHERE
  ID BETWEEN 'BAR0' AND 'BAR99999999';

        NN
----------
     45001 

As long as ALL of the characters after the BAR keyword prefix are numbers, the above would tell us that the next number with BAR as the prefix is 45001.  On Oracle Database 10.1 and higher it would be a good idea to add an additional predicate to the WHERE clause that uses regular expressions to avoid potential problems where some unrelated ID column values start with the letters BAR, a number character, and then at some position to the right contain a letter character (that condition would cause the above SQL statement to fail).
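The effect of such an additional predicate can be sketched outside the database.  The Python below is an illustration under stated assumptions: the pattern (prefix followed by nothing but digits) is one plausible choice, the sample IDs are invented, and returning 1 when no ID matches is an assumption.  In Oracle Database 10.1 and higher, a REGEXP_LIKE condition could play the equivalent role.

```python
import re


def next_number(ids, prefix):
    """Return the highest numeric suffix + 1 among IDs shaped exactly like prefix + digits."""
    pattern = re.compile(re.escape(prefix) + r"([0-9]+)")
    # fullmatch rejects IDs with anything other than digits after the prefix
    numbers = [int(m.group(1)) for m in map(pattern.fullmatch, ids) if m]
    return max(numbers) + 1 if numbers else 1


# Invented sample data: BAR1X7 falls inside the BETWEEN range but is not numeric
ids = ["BAR000001", "BAR045000", "BAR1X7", "BARREL0001", "PAINT000123"]
print(next_number(ids, "BAR"))
```

Here BAR1X7 is the kind of value that would make TO_NUMBER fail, and the pattern filters it out before the conversion is attempted.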

Building onto the above SQL statement, we could just retrieve the next part number in the sequence from the database, when provided any existing prefixed part number as the starting point:

SELECT
  'BAR' || LPAD(TO_CHAR(MAX(TO_NUMBER(SUBSTR(ID,4))) + 1), 6, '0') NEXT_PART_ID
FROM
  T1
WHERE
  ID BETWEEN 'BAR0' AND 'BAR99999999';

NEXT_PART
---------
BAR045001 

We are able to take the above SQL statement and incorporate it into a VBScript macro to find the next prefixed sequential number for the primary key column:

Dim strPartID
Dim strPartIDNew
Dim strNumberOld
Dim strNumberNew
Dim i
Dim strUserName
Dim strPassword
Dim strDatabase
Dim strSQL
Dim dbDatabase
Dim snpData

On Error Resume Next

strUsername = "MyUserID"
strPassword = "MyPassword"
strDatabase = "MyDatabase"

Set snpData = CreateObject("ADODB.Recordset")
Set dbDatabase = CreateObject("ADODB.Connection")

dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUserName & ";Password=" & strPassword & ";"
dbDatabase.Open

strNumberOld = ""

'strPartID = PART_ID
strPartID = "BEARING0599"  'Remove when finished testing
'strPartID = "BAR0089"  'Remove when finished testing
'strPartID = "PAINT0089"  'Remove when finished testing

For i = Len(strPartID) to 1 Step -1
  If (Mid(strPartID, i, 1) >= "0") And (Mid(strPartID, i, 1) <= "9") Then
    strNumberOld = Mid(strPartID, i, 1) & strNumberOld
  Else
    strPartIDNew = Left(strPartID, i)

    strSQL = "SELECT" & VBCrLf
    strSQL = strSQL & "  '" & strPartIDNew & "' || LPAD(TO_CHAR(MAX(TO_NUMBER(SUBSTR(ID," & (i+1) &"))) + 1), 6, '0') NEXT_PART_ID" & VBCrLf
    strSQL = strSQL & "FROM" & VBCrLf
    strSQL = strSQL & "  T1" & VBCrLf
    strSQL = strSQL & "WHERE" & VBCrLf
    strSQL = strSQL & "  ID BETWEEN '" & strPartIDNew & "0' AND '" & strPartIDNew & "99999999'"

    snpData.Open strSQL, dbDatabase

    If snpData.State = 1 Then
      If Not(snpData.EOF) Then
        strPartIDNew = snpData("next_part_id")
      End If
      snpData.Close
    End If
    Exit For
  End If
Next

If strPartIDNew <> "" Then
  Msgbox "The New Part ID is " & strPartIDNew
Else
  Msgbox "Not a Valid Starting Point: " & strPartID
End If

dbDatabase.Close
Set snpData = Nothing
Set dbDatabase = Nothing 
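The macro above builds its SQL statement by concatenating the prefix into the text.  As a hedged alternative sketch, the same lookup can pass the prefix as bind values.  The Python below uses an in-memory SQLite database purely as a stand-in for Oracle, with far fewer rows than the T1 test table; the function name, the CAST syntax, and the fallback to 1 for an unused prefix are assumptions of the example.

```python
import sqlite3


def next_part_id_from_db(conn, prefix, width=6):
    """Ask the database for the highest suffix under prefix and build the next ID."""
    row = conn.execute(
        "SELECT MAX(CAST(SUBSTR(id, ?) AS INTEGER)) FROM t1 "
        "WHERE id BETWEEN ? AND ?",
        (len(prefix) + 1, prefix + "0", prefix + "99999999"),
    ).fetchone()
    highest = row[0] or 0  # MAX returns NULL when no rows match
    return prefix + str(highest + 1).zfill(width)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id TEXT PRIMARY KEY, description TEXT)")
rows = [
    (f"{prefix}{n:06d}", f"TESTING {n}")
    for prefix, top in (("PAINT", 350), ("BAR", 450), ("BEARING", 88))
    for n in range(1, top + 1)
]
conn.executemany("INSERT INTO t1 (id, description) VALUES (?, ?)", rows)

print(next_part_id_from_db(conn, "BAR"))
```

Only a single aggregated row travels back to the client, and the prefix never becomes part of the SQL text.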

The OP put together a parallel solution that also used a VBScript macro.  The macro sent a SQL statement very similar to the following to the database:

SELECT
  ID
FROM
  T1
WHERE
  ID LIKE 'BAR%'; 

In the macro code the OP parsed each of the returned ID values to determine the highest sequence number, added 1 to that value, padded the new highest sequence number with “0” digits, and output the result.  Most likely due to curiosity, the OP asked why I did not simply use his VBScript macro as part of my proposed solution.  What reasons do you think that I gave to the OP?





What Number Immediately Follows 3,999? Oracle Database Refuses to Answer


January 16, 2012

I put together a test case to demonstrate how the physical reads autotrace statistic could exceed the consistent gets autotrace statistic if a single-pass or multi-pass workarea execution were performed during the execution of the SQL statement.  If you are interested, you can see the test case in this recent OTN thread.  If I recall correctly, index pre-fetching could also result in a similar situation where the physical reads autotrace statistic could exceed the consistent gets autotrace statistic.

The result in the OTN test case left me a little concerned.  It might not be well known, but it is possible to instruct Oracle Database to format normal base 10 numbers as Roman numerals with the RN format specification.  For example:

SELECT
  TO_CHAR(1, 'RN') ROMAN_NUMBER
FROM
  DUAL;

ROMAN_NUMBER
---------------
              I

SELECT
  TO_CHAR(12, 'RN') ROMAN_NUMBER
FROM
  DUAL;

ROMAN_NUMBER
---------------
            XII

SELECT
  TO_CHAR(123, 'RN') ROMAN_NUMBER
FROM
  DUAL;

ROMAN_NUMBER
---------------
         CXXIII

SELECT
  TO_CHAR(1234, 'RN') ROMAN_NUMBER
FROM
  DUAL;

ROMAN_NUMBER
---------------
       MCCXXXIV 

I personally think that is a neat feature.  In the OTN test case I created a table that would hopefully contain the first 9,999,999 Roman numbers:

CREATE TABLE T1 (
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 VARCHAR2(30) NOT NULL,
  C4 VARCHAR2(200));

INSERT INTO T1
SELECT
  MOD(ROWNUM-1, 90) * 4 C1,
  ROWNUM - 1 C2,
  TO_CHAR(ROWNUM - 1, 'RN') C3,
  LPAD('A',200,'A') C4
FROM
  (SELECT
     ROWNUM RN
   FROM
     DUAL
   CONNECT BY
     LEVEL<=1000000),
  (SELECT
     ROWNUM RN
   FROM
     DUAL
   CONNECT BY
     LEVEL<=10);

COMMIT;

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1') 

As part of the OTN test case, I thought that I would demonstrate a non-optimal work area execution by determining how many unique Roman numbers made it into the T1 test table:

SELECT
  COUNT(DISTINCT C3)
FROM
  T1;

COUNT(DISTINCTC3)
-----------------
             4000 

Just 4,000?  There are 10,000,000 rows in that test table.  So, maybe we should ask Oracle Database what number immediately follows 3,999 by querying the table:

SELECT
  C2,
  C3
FROM
  T1
WHERE
  C2 BETWEEN 3950 AND 4050
ORDER BY
  C2;

        C2 C3
---------- ------------------------------
      3950          MMMCML
      3951         MMMCMLI
      3952        MMMCMLII
      3953       MMMCMLIII
      3954        MMMCMLIV
      3955         MMMCMLV
      3956        MMMCMLVI
      3957       MMMCMLVII
      3958      MMMCMLVIII
      3959        MMMCMLIX
      3960         MMMCMLX
      3961        MMMCMLXI
      3962       MMMCMLXII
      3963      MMMCMLXIII
      3964       MMMCMLXIV
      3965        MMMCMLXV
      3966       MMMCMLXVI
      3967      MMMCMLXVII
      3968     MMMCMLXVIII
      3969       MMMCMLXIX
      3970        MMMCMLXX
      3971       MMMCMLXXI
      3972      MMMCMLXXII
      3973     MMMCMLXXIII
      3974      MMMCMLXXIV
      3975       MMMCMLXXV
      3976      MMMCMLXXVI
      3977     MMMCMLXXVII
      3978    MMMCMLXXVIII
      3979      MMMCMLXXIX
      3980       MMMCMLXXX
      3981      MMMCMLXXXI
      3982     MMMCMLXXXII
      3983    MMMCMLXXXIII
      3984     MMMCMLXXXIV
      3985      MMMCMLXXXV
      3986     MMMCMLXXXVI
      3987    MMMCMLXXXVII
      3988   MMMCMLXXXVIII
      3989     MMMCMLXXXIX
      3990         MMMCMXC
      3991        MMMCMXCI
      3992       MMMCMXCII
      3993      MMMCMXCIII
      3994       MMMCMXCIV
      3995        MMMCMXCV
      3996       MMMCMXCVI
      3997      MMMCMXCVII
      3998     MMMCMXCVIII
      3999       MMMCMXCIX
      4000 ###############
      4001 ###############
      4002 ###############
      4003 ###############
      4004 ###############
      4005 ###############
      4006 ###############
      4007 ###############
      4008 ###############
      4009 ###############
      4010 ###############
      4011 ###############
      4012 ###############
      4013 ###############
      4014 ###############
      4015 ###############
      4016 ###############
      4017 ###############
      4018 ###############
      4019 ###############
      4020 ###############
      4021 ###############
      4022 ###############
      4023 ###############
      4024 ###############
      4025 ###############
      4026 ###############
      4027 ###############
      4028 ###############
      4029 ###############
      4030 ###############
      4031 ###############
      4032 ###############
      4033 ###############
      4034 ###############
      4035 ###############
      4036 ###############
      4037 ###############
      4038 ###############
      4039 ###############
      4040 ###############
      4041 ###############
      4042 ###############
      4043 ###############
      4044 ###############
      4045 ###############
      4046 ###############
      4047 ###############
      4048 ###############
      4049 ###############
      4050 ############### 

You heard it here first, in Roman times 4,000 is equivalent to infinity.  An obvious extension to this rule is that in Roman times the value of PI had exactly 3999 digits to the right of the decimal point.   :-)
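The count of 4,000 distinct values can be reproduced with a small Python sketch.  It assumes that the RN format accepts only the integers 1 through 3999 and returns a string of 15 # characters otherwise, which matches the output above; the to_roman function is written for illustration and is not Oracle's implementation.

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
         (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n: int) -> str:
    """Roughly mimic TO_CHAR(n, 'RN'): 15 wide, right-justified, # characters on overflow."""
    if not 1 <= n <= 3999:
        return "#" * 15
    parts = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts).rjust(15)


# 0 and everything from 4000 upward collapse into the same overflow string
distinct = {to_roman(n) for n in range(0, 10_000)}
print(len(distinct))
```

The 3,999 valid numerals plus the single overflow string account for the 4,000 distinct C3 values.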





What is the Meaning of the %CPU Column in an Explain Plan? 2


January 1, 2012


Nearly two years ago I posted the following execution plan in a blog article and asked whether or not there was anything strange about the %CPU column:

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   247 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  2236K|   247   (1)| 00:00:03 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access(:Z>=:Z AND :Z<=:Z)
       filter(("C1"<=10000 AND "C1">=1)) 

For nearly two years no one mentioned that it is strange that the row in the execution plan with ID 0 can have a Cost of 247 with a %CPU of 100, while that operation’s child operations can have the same Cost of 247 with a %CPU of 1.  I had long forgotten about the challenge that I posed in the blog article, until someone mentioned the execution plan in an OTN forum thread and asked about that particular oddity.

In the earlier article I demonstrated querying the PLAN_TABLE after using EXPLAIN PLAN FOR to determine how the %CPU column is calculated.  Unfortunately, I did not perform that step two years ago for the SQL statement that was used to generate the above execution plan, so the challenge remains.

As best as I am able to determine, the following is the table creation script:

DROP TABLE T1 PURGE;

CREATE TABLE T1 (
  C1 NUMBER NOT NULL,
  C2 NUMBER NOT NULL,
  C3 NUMBER NOT NULL,
  C4 NUMBER NOT NULL,
  C5 VARCHAR2(30) NOT NULL,
  C6 VARCHAR2(30) NOT NULL,
  FILLER VARCHAR2(200),
  PRIMARY KEY (C1));

INSERT INTO T1
SELECT
  ROWNUM,
  ROWNUM,
  TRUNC(ROWNUM/100+1),
  TRUNC(ROWNUM/100+1),
  CHR(65+TRUNC(ROWNUM/10000))||TRUNC(ROWNUM/100+1),
  CHR(65+TRUNC(ROWNUM/10000))||TRUNC(ROWNUM/100+1),
  LPAD('A',200,'A')
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT;

CREATE INDEX IND_T1_C3 ON T1(C3);
CREATE INDEX IND_T1_C4 ON T1(C4);
CREATE INDEX IND_T1_C5 ON T1(C5);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR COLUMNS SIZE 254 C2, C4, C6') 

I originally created the execution plan on an unpatched version of Oracle Database 11.2.0.1 running on 64 bit Linux.  The following is the SQL statement that was used to generate the execution plan:

SELECT /*+ PARALLEL(4) */
  *
FROM
  T1
WHERE
  C1 BETWEEN 1 AND 10000; 

Unfortunately, the above SQL statement generates a bit of a different looking execution plan when not hinted to perform a full table scan.  In an Oracle Database 11.2.0.2 database, the following execution plan appeared (although a similar one also appears for 11.2.0.1):

Plan hash value: 2275811211

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |   250 | 39750 |    17   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1           |   250 | 39750 |    17   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | SYS_C0041509 |   450 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1">=1 AND "C1"<=10000)

Note
-----
   - Degree of Parallelism is 1 because of hint 

The execution plan did not employ parallel execution even though it was hinted, because the calculated cost for the index access was less than the calculated cost for the parallel full table scan.  Note also that the cardinality estimate is incorrect  – this query will actually return 10,000 rows (for a side challenge, how is it possible that the index is projected to return 450 rows, when the parent operation is expected to return just 250 rows without a filter predicate applied at the parent operation?).

Let’s add a couple of hints to fix the cardinality issue (note that the CARDINALITY hint is undocumented) and force the parallel table scan:

DELETE FROM PLAN_TABLE;

EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(4) FULL(T1) CARDINALITY(T1 10000) */
  *
FROM
  T1
WHERE
  C1 BETWEEN 1 AND 10000;

SELECT
  ID,
  COST,
  IO_COST,
  COST-IO_COST DIFF,
  CEIL(DECODE(COST,0,0,(COST-IO_COST)/COST)*100) PER_CPU,
  CPU_COST
FROM
  PLAN_TABLE;

SET LINESIZE 120
SET PAGESIZE 1000

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY); 

On the unpatched 11.2.0.1 database, the above script produced the following output:

ID       COST    IO_COST       DIFF    PER_CPU   CPU_COST
-- ---------- ---------- ---------- ---------- ----------
 0        247        246          1          1    5958333
 1
 2        247        246          1          1    5958333
 3        247        246          1          1    5958333
 4        247        246          1          1    5958333

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2494645258

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          | 10000 |  1552K|   247   (1)| 00:00:03 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("C1">=1 AND "C1"<=10000)

Note
-----
   - Degree of Parallelism is 4 because of hint 

The above execution plan is similar to the execution plan at the start of this article (even the costs are identical), yet not exactly the same.  The Predicate Information section of the execution plan is missing access(:Z>=:Z AND :Z<=:Z), and the %CPU column shows a value of 1 from bottom to top, rather than 100 for the top row in the plan.  The output from the query of PLAN_TABLE exactly matches the above output from the DBMS_XPLAN.DISPLAY function.  So, what happened?  How did I generate the execution plan that appears at the start of this blog article?

You might be curious – are we able to force the first line in the execution plan to show 100 in the %CPU column?  Let’s try an experiment where we manually change the IO_COST value for the row in the PLAN_TABLE where the ID column value is equal to 0:

UPDATE
  PLAN_TABLE
SET
  IO_COST=0
WHERE
  ID=0;

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY); 

With the above change, this is the execution plan that I obtained:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2494645258

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          | 10000 |  1552K|   247 (100)| 00:00:03 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("C1">=1 AND "C1"<=10000)

Note
-----
   - Degree of Parallelism is 4 because of hint 

The first line in the above execution plan now shows 100 in the %CPU column.  If a 0 were present in the IO_COST column of that row in the PLAN_TABLE when I generated the execution plan at the start of this article, that might be one explanation for the strange output.  Note, however, that the Predicate Information section still does not match.  So, did I fake the execution plan through manipulation of PLAN_TABLE, or did I actually execute the SQL statement and pull the actual execution plan from memory?  Let’s try the latter:

SELECT /*+ PARALLEL(4) FULL(T1) CARDINALITY(T1 10000) */
  *
FROM
  T1
WHERE
  C1 BETWEEN 1 AND 10000;

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL')); 

When I executed the above, this is the execution plan that appeared:

SQL_ID  4r3zujtat35jb, child number 0
-------------------------------------
SELECT /*+ PARALLEL(4) FULL(T1) CARDINALITY(T1 10000) */   * FROM   T1
WHERE   C1 BETWEEN 1 AND 10000

Plan hash value: 2494645258

--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   247 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       | 10000 |  1552K|   247   (1)| 00:00:03 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access(:Z>=:Z AND :Z<=:Z)
       filter(("C1">=1 AND "C1"<=10000))

Note
-----
   - Degree of Parallelism is 4 because of hint 

Notice in the above execution plan that the %CPU column matches the values in the execution plan that appears at the start of this blog article.  Also note that the missing access(:Z>=:Z AND :Z<=:Z) now appears in the Predicate Information section.  Another possible case of explain plan lies?  I wonder what the point of the :Z >= :Z AND :Z <= :Z access predicate is.  That is the equivalent of :Z BETWEEN :Z AND :Z, or 1 BETWEEN 1 AND 1 if you prefer (assuming that the value of the :Z bind variable is not NULL).

Now that I have reproduced the execution plan at the start of this blog article (ignoring the predicted Bytes column), why did the DBMS_XPLAN function display a value of 100 in the %CPU column of the first line in the execution plan?  Is it caused by a bug in DBMS_XPLAN, a bug in the 11.2.0.1 optimizer, or something else?

Recall earlier in this blog article that I tried an experiment of changing the IO_COST for a row in PLAN_TABLE to a value of 0, and that change resulted in a value of 100 appearing in the %CPU column – does that mean that the IO_COST for the first row in the execution plan is 0?  In an execution plan, the costs shown for parent operations include the costs shown for child operations…

In this case we cannot query the PLAN_TABLE, but we are able to query V$SQL_PLAN for the same type of information using the SQL_ID and Plan hash value that appeared in the above DBMS_XPLAN output:

SELECT
  ID,
  COST,
  IO_COST,
  COST-IO_COST DIFF,
  CPU_COST
FROM
  V$SQL_PLAN
WHERE
  SQL_ID='4r3zujtat35jb'
  AND PLAN_HASH_VALUE=2494645258
ORDER BY
  ID;

        ID       COST    IO_COST       DIFF   CPU_COST
---------- ---------- ---------- ---------- ----------
         0        247
         1
         2        247        246          1    5958333
         3        247        246          1    5958333
         4        247        246          1    5958333 

A NULL value for the IO_COST in execution plan line ID 0 (the first line in the execution plan)…

In the previous article, I proposed that the %CPU column is calculated with the following formula:

CEIL(DECODE(COST,0,0,(COST-IO_COST)/COST)*100) PER_CPU 

If the above were true, then a NULL would appear in the %CPU column for the first row.  It appears that I need to adjust the formula slightly to account for the possibility of a NULL value in the IO_COST column and handle it as if the value were 0 (who says that NULL is not equal to 0?).

CEIL(DECODE(COST,0,0,(COST-NVL(IO_COST,0))/COST)*100) PER_CPU
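
As a quick check, the adjusted formula can be applied directly to the V$SQL_PLAN rows shown above.  The following is a minimal sketch that reuses the SQL_ID and plan hash value from this article; PER_CPU is simply the column alias used in the formula, not a column of the view:

```sql
-- Apply the adjusted %CPU formula to each line of the cached execution plan.
-- NVL treats a NULL IO_COST (plan line ID 0) as 0, so that line computes to 100.
SELECT
  ID,
  COST,
  IO_COST,
  CEIL(DECODE(COST,0,0,(COST-NVL(IO_COST,0))/COST)*100) PER_CPU
FROM
  V$SQL_PLAN
WHERE
  SQL_ID='4r3zujtat35jb'
  AND PLAN_HASH_VALUE=2494645258
ORDER BY
  ID;
```

For plan line ID 0 the arithmetic is CEIL((247-0)/247*100) = 100, and for plan line IDs 2 through 4 the arithmetic is CEIL((247-246)/247*100) = CEIL(0.40...) = 1, which agrees with the %CPU values in the DBMS_XPLAN output.  Plan line ID 1 has a NULL COST, so the expression evaluates to NULL for that line.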

Later in the OTN thread the OP expressed concern about two SQL statements and their execution plans, where “the cost per se of a single execution is low, but since the query is executed many number of times, the cpu consumption is hitting a high.”  Feel free to read the OTN thread to see my comments related to this concern.





BIN$ Index Found in the Execution Plan – Digging through the Recycle Bin

November 28, 2011

There are a few articles that can be found on the Internet that describe the cause of indexes with names similar to BIN$ld5VAtb88PngQAB/AQF8hg==$0 and BIN$PESygWW5R0WhbOaDugxqwQ==$0 appearing in execution plans.  As is likely known by readers, the Oracle Database documentation describes that these object names are associated with the recycle bin that was introduced in Oracle Database 10.1.  When an object is dropped (but not purged), it is placed into the recycle bin with a name that begins with BIN$ and ends with ==$ followed by a number (the version, which in brief testing seems to always be 0).

I have answered this question a couple of times in the past in various Oracle Database forums, including a recent OTN thread: what is the significance of having an index named, for instance, BIN$ld5VAtb88PngQAB/AQF8hg==$0 in an execution plan?  Does that mean that Oracle’s query optimizer has chosen to use a dropped index?  No.  The simple answer is that the table to which the index belongs was dropped and then flashed back to before the drop.  When this happens, the table name is restored to its original name, but the names of the associated indexes are not restored.

A quick test case to demonstrate.  First, we will create a table with an index, and then collect statistics:

DROP TABLE T1 PURGE;

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  RPAD('A',255,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

COMMIT;

CREATE INDEX IND_T1_C1 ON T1(C1);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE) 

Let’s put together a simple SQL statement that will hopefully use the index, and then confirm that the index was in fact used by displaying the actual execution plan used for the SQL statement:

SELECT
  C1,
  SUBSTR(C2,1,2)
FROM
  T1
WHERE
  C1<=5;

 C1 SU
--- --
  1 AA
  2 AA
  3 AA
  4 AA
  5 AA

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,NULL));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  g0kkvxqg3v145, child number 0
-------------------------------------
SELECT   C1,   SUBSTR(C2,1,2) FROM   T1 WHERE   C1<=5

Plan hash value: 683303157

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |     5 |  1300 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T1_C1 |     5 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"<=5) 

The above plan shows that the IND_T1_C1 index was used for this particular SQL statement.

Next, we will purge the recycle bin (make certain that there is nothing useful in the recycle bin first), drop the index, and see if it can still be used in an execution plan:

PURGE RECYCLEBIN;

DROP INDEX IND_T1_C1;

SELECT
  C1,
  SUBSTR(C2,1,2)
FROM
  T1
WHERE
  C1<=5;

 C1 SU
--- --
  1 AA
  2 AA
  3 AA
  4 AA
  5 AA

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,NULL));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
SQL_ID  g0kkvxqg3v145, child number 0
-------------------------------------
SELECT   C1,   SUBSTR(C2,1,2) FROM   T1 WHERE   C1<=5

Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   137 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |     5 |  1300 |   137   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C1"<=5)

/* SHOW RECYCLEBIN  should be roughly equivalent to the following SQL statement */
SELECT
  OBJECT_NAME,
  ORIGINAL_NAME,
  TYPE,
  CREATETIME
FROM
  RECYCLEBIN
ORDER BY
  TYPE,
  ORIGINAL_NAME,
  OBJECT_NAME;

no rows selected 

From the above, it is apparent that an index that is dropped will not appear in an execution plan.  The dropped index is not in the recycle bin either.

Let’s recreate the index:

CREATE INDEX IND_T1_C1 ON T1(C1);

Then, using SQL statements similar to those at the start of this article (without the PURGE clause in the DROP TABLE statement), we will drop and recreate the table:

DROP TABLE T1;

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  RPAD('A',255,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

COMMIT;

CREATE INDEX IND_T1_C1 ON T1(C1);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE) 

Next, let’s take a look at the recycle bin (note that we could simply execute SHOW RECYCLEBIN rather than execute the SQL statement, but the index would not be listed using that method):

SELECT
  OBJECT_NAME,
  ORIGINAL_NAME,
  TYPE,
  CREATETIME
FROM
  RECYCLEBIN
ORDER BY
  TYPE,
  ORIGINAL_NAME,
  OBJECT_NAME,
  CREATETIME;

OBJECT_NAME                    ORIGINAL_N TYPE  CREATETIME
------------------------------ ---------- ----- -------------------
BIN$cU4bWUSaSu2PUYdJvOq+hA==$0 IND_T1_C1  INDEX 2011-11-28:07:21:30
BIN$2smXLnTGTSqcBa8SJucvtg==$0 T1         TABLE 2011-11-28:07:21:30 

The above shows that we now have one table and its index in the recycle bin.  Let’s repeat the drop and recreate:

DROP TABLE T1;

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  RPAD('A',255,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

COMMIT;

CREATE INDEX IND_T1_C1 ON T1(C1);

EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE) 

Now let’s take a look at the recycle bin’s contents again:

SELECT
  OBJECT_NAME,
  ORIGINAL_NAME,
  TYPE,
  CREATETIME
FROM
  RECYCLEBIN
ORDER BY
  TYPE,
  ORIGINAL_NAME,
  OBJECT_NAME,
  CREATETIME;

OBJECT_NAME                    ORIGINAL_N TYPE  CREATETIME
------------------------------ ---------- ----- -------------------
BIN$cU4bWUSaSu2PUYdJvOq+hA==$0 IND_T1_C1  INDEX 2011-11-28:07:21:30
BIN$/40oC3RJSNiLmEESZ7VNEw==$0 IND_T1_C1  INDEX 2011-11-28:07:21:48
BIN$2smXLnTGTSqcBa8SJucvtg==$0 T1         TABLE 2011-11-28:07:21:30
BIN$nYId4wdGRf6IgpSXSDb4Kw==$0 T1         TABLE 2011-11-28:07:21:48 

The above output now shows that there are two tables and their associated indexes in the recycle bin.  Let’s recover one of those tables and its index:

FLASHBACK TABLE T1 TO BEFORE DROP;

Flashback complete. 

A quick check of the recycle bin shows that the most recently dropped table and its associated index are no longer in the recycle bin, but the older version of table T1 and its index are still in the recycle bin:

SELECT
  OBJECT_NAME,
  ORIGINAL_NAME,
  TYPE,
  CREATETIME
FROM
  RECYCLEBIN
ORDER BY
  TYPE,
  ORIGINAL_NAME,
  OBJECT_NAME,
  CREATETIME; 

OBJECT_NAME                    ORIGINAL_N TYPE  CREATETIME
------------------------------ ---------- ----- -------------------
BIN$cU4bWUSaSu2PUYdJvOq+hA==$0 IND_T1_C1  INDEX 2011-11-28:07:21:30
BIN$2smXLnTGTSqcBa8SJucvtg==$0 T1         TABLE 2011-11-28:07:21:30

Let’s re-execute the SQL statement that queries table T1:

SELECT
  C1,
  SUBSTR(C2,1,2)
FROM
  T1
WHERE
  C1<=5;

 C1 SU
--- --
  1 AA
  2 AA
  3 AA
  4 AA
  5 AA 

So, after recovering the table from the recycle bin, we are able to query the table.  Let’s take a look at the execution plan for this query:

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,NULL));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------
SQL_ID  g0kkvxqg3v145, child number 0
-------------------------------------
SELECT   C1,   SUBSTR(C2,1,2) FROM   T1 WHERE   C1<=5

Plan hash value: 1395723482

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                                |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1                             |     5 |  1300 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BIN$/40oC3RJSNiLmEESZ7VNEw==$0 |     5 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"<=5) 

Notice in the above execution plan, the index name of BIN$/40oC3RJSNiLmEESZ7VNEw==$0 – that is what the index was named when it was sent to the recycle bin.  Let’s fix the odd BIN$ name and re-execute the query:

ALTER INDEX "BIN$/40oC3RJSNiLmEESZ7VNEw==$0" RENAME TO IND_T1_C1;

Index altered. 

SELECT
  C1,
  SUBSTR(C2,1,2)
FROM
  T1
WHERE
  C1<=5;

 C1 SU
--- --
  1 AA
  2 AA
  3 AA
  4 AA
  5 AA

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,NULL));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  g0kkvxqg3v145, child number 0
-------------------------------------
SELECT   C1,   SUBSTR(C2,1,2) FROM   T1 WHERE   C1<=5

Plan hash value: 683303157

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |     5 |  1300 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T1_C1 |     5 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"<=5)

So, the above shows how a BIN$ named index might appear in an execution plan, and how to fix the name.
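
If flashed back tables are a possibility in a schema, it may be worth checking for indexes that kept their recycle bin names.  The following is a rough sketch, assuming Oracle Database 10.1 or later and assuming that the DROPPED column of USER_INDEXES separates indexes that are still in the recycle bin (YES) from live indexes (NO):

```sql
-- List live indexes in the current schema that still carry a recycle bin name.
-- DROPPED='NO' is intended to exclude indexes that are still in the recycle bin.
SELECT
  TABLE_NAME,
  INDEX_NAME
FROM
  USER_INDEXES
WHERE
  INDEX_NAME LIKE 'BIN$%'
  AND DROPPED='NO'
ORDER BY
  TABLE_NAME,
  INDEX_NAME;
```

As the earlier RECYCLEBIN output shows, the index entry disappears from RECYCLEBIN once its table is flashed back, so the new name for each ALTER INDEX ... RENAME TO statement must be supplied manually (or captured from the ORIGINAL_NAME column before the FLASHBACK TABLE command is executed).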

We still have one table and its index in the recycle bin.  Let’s take a quick look at that table:

SELECT
  C1,
  SUBSTR(C2,1,2)
FROM
  "BIN$2smXLnTGTSqcBa8SJucvtg==$0"
WHERE
  C1<=5;

 C1 SU
--- --
  1 AA
  2 AA
  3 AA
  4 AA
  5 AA 

We are able to query the table that is in the recycle bin, as long as we enclose the table name (OBJECT_NAME in the query of RECYCLEBIN) in double quotation marks (").  Let’s take a look at the execution plan for the previous SQL statement:

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,NULL));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------
SQL_ID  a2a2vcsbtw5ac, child number 0
-------------------------------------
SELECT   C1,   SUBSTR(C2,1,2) FROM   "BIN$2smXLnTGTSqcBa8SJucvtg==$0"
WHERE   C1<=5

Plan hash value: 3681245720

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                                |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIN$2smXLnTGTSqcBa8SJucvtg==$0 |     5 |  1300 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BIN$cU4bWUSaSu2PUYdJvOq+hA==$0 |     5 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"<=5) 

Now we have both a BIN$ prefixed table and index in the execution plan.

Let’s recover the old version of the T1 table (renaming it when it is recovered) and rename its associated recovered index:

FLASHBACK TABLE T1 TO BEFORE DROP RENAME TO T1_OLD;

Flashback complete.

ALTER INDEX "BIN$cU4bWUSaSu2PUYdJvOq+hA==$0" RENAME TO IND_T1_OLD_C1;

Index altered. 

Let’s query the recovered table (now called T1_OLD) and check the execution plan:

SELECT
  C1,
  SUBSTR(C2,1,2)
FROM
  T1_OLD
WHERE
  C1<=5;

 C1 SU
--- --
  1 AA
  2 AA
  3 AA
  4 AA
  5 AA

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,NULL));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
SQL_ID  bv1ty7jq2hc5g, child number 0
-------------------------------------
SELECT   C1,   SUBSTR(C2,1,2) FROM   T1_OLD WHERE   C1<=5

Plan hash value: 3358254750

---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1_OLD        |     5 |  1300 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T1_OLD_C1 |     5 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"<=5) 

Now let’s make certain that there is nothing in the recycle bin:

SELECT
  OBJECT_NAME,
  ORIGINAL_NAME,
  TYPE,
  CREATETIME
FROM
  RECYCLEBIN
ORDER BY
  TYPE,
  ORIGINAL_NAME,
  OBJECT_NAME,
  CREATETIME;

no rows selected 

The experimental tables and indexes are not in the recycle bin.  Let’s drop the experimental tables that we recovered from the recycle bin, this time skipping the recycle bin:

DROP TABLE T1 PURGE;
DROP TABLE T1_OLD PURGE; 

Just to confirm that the tables and their associated indexes are not in the recycle bin:

SELECT
  OBJECT_NAME,
  ORIGINAL_NAME,
  TYPE,
  CREATETIME
FROM
  RECYCLEBIN
ORDER BY
  TYPE,
  ORIGINAL_NAME,
  OBJECT_NAME,
  CREATETIME;

no rows selected 

-----

Hopefully, by now everyone is aware of those BIN$ prefixed object names that might appear in execution plans, what causes the names to be created, and how to fix the names.





Why Isn’t My Index Used… When USER2 Executes this Query?

November 23, 2011

I previously wrote a couple of articles that mention reasons why an index might not be used for a particular query, including an article that was formatted as a True or False quiz with several reference articles.  A few days ago I saw an OTN thread that caught my curiosity, where the original poster (OP) claimed that the optimizer simply will not use an index to access a table when any user other than the schema owner or the SYS user executes a particular query.

Why is the OP attempting to execute the SQL statement as the SYS user?  The SYS user is special.  As mentioned in my review of the book “Practical Oracle 8i”, I wrote the following paraphrase into my notes as I read the book:

Oracle 8i introduces row level security, which uses a PL/SQL function to apply an additional WHERE clause predicate to a table – row level security does not apply to the SYS user. It is important to use CONSISTENT=Y when exporting partitioned tables. When CONSISTENT=N is specified, the export of each partition in a table is treated as a separate transaction, and may be exported at a different SCN number (incremented when any session commits). When tables are exported which contain nested tables, the two physical segments are exported in separate transactions, potentially resulting in inconsistent data during the import if the export was performed with the default CONSISTENT=N.

Is the above paraphrase from this 10-year-old book a clue?  Maybe it is a problem related to secure view merging because the SQL statement uses the index when the schema owner executes the SQL statement (there is a very good example of this type of problem found in the book “Troubleshooting Oracle Performance”).  Maybe it is a problem where the public synonym for the table actually points to a view or an entirely different table – the execution plan for the non-schema owner did show a VIEW operation, while the execution plan for the schema owner did not show the VIEW operation.  Maybe it is a problem where the optimizer parameters are adjusted differently for different users – in such a case we might need to dig into the V$SYS_OPTIMIZER_ENV, V$SES_OPTIMIZER_ENV, and V$SQL_OPTIMIZER_ENV views.
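
For the last of those possibilities, a comparison of the optimizer environment of two connected sessions might look something like the following sketch.  The &SID1 and &SID2 SQL*Plus substitution variables are placeholders (not values from the OTN thread) for the SID of a session connected as the schema owner and the SID of a session connected as the other user:

```sql
-- Show optimizer parameters whose values differ between two sessions.
-- &SID1 and &SID2 are hypothetical placeholders; look up the actual SIDs in V$SESSION.
SELECT
  E1.NAME,
  E1.VALUE SID1_VALUE,
  E2.VALUE SID2_VALUE
FROM
  V$SES_OPTIMIZER_ENV E1,
  V$SES_OPTIMIZER_ENV E2
WHERE
  E1.SID=&SID1
  AND E2.SID=&SID2
  AND E1.ID=E2.ID
  AND NVL(E1.VALUE,'*NULL*')<>NVL(E2.VALUE,'*NULL*')
ORDER BY
  E1.NAME;
```

If the query returns no rows, the two sessions probably share the same optimizer settings, and one of the other explanations becomes more likely.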

Maybe taking a look at the DBMS_XPLAN output would help.  Why does the Predicate Information section of the execution plan show the following only for the non-schema owner?

7 - filter(("SEAL_FLAG" IS NULL OR "SEAL_FLAG"<>'Y'))
9 - filter(("SEAL_FLAG" IS NULL OR "SEAL_FLAG"<>'Y'))
11 - filter(("SEAL_FLAG"<>'Y' OR "SEAL_FLAG" IS NULL))
13 - filter(("SEAL_FLAG"<>'Y' OR "SEAL_FLAG" IS NULL))
19 - filter(("SEAL_FLAG"<>'Y' OR "SEAL_FLAG" IS NULL)) 

A significant clue?  If those predicates were also found in the DBMS_XPLAN generated output for the schema owner (and the SYS user), I would probably conclude that the optimizer generated those additional predicates from defined column constraints, and that a review of a 10053 trace file might help determine what caused those predicates to be automatically created.  However, those predicates did not appear in the execution plan that was generated for the schema owner.  It might be time to start checking the V$VPD_POLICY view for this particular SQL_ID, for example (a completely unrelated test case output):

SELECT
  *
FROM
  V$VPD_POLICY
WHERE
  SQL_ID='6hqw5p9d8g8wf';

ADDRESS          PARADDR            SQL_HASH SQL_ID        CHILD_NUMBER OBJECT_OWNER OBJECT_NAME                    POLICY_GROUP                   POLICY                 POLICY_FUNCTION_OWNER          PREDICATE
---------------- ---------------- ---------- ------------- ------------ ------------ ------------------------------ ------------------------------ ---------------------- ------------------------------ ------------------------------------------------------------------------------------
000007FFB7701608 000007FFB7743350 1518838670 6hqw5p9d8g8wf            0 TESTUSER     T12                            SYS_DEFAULT                    T_SEC                  TESTUSER                       ID < 10 

Maybe we should also check some of the other virtual private database (VPD) related views including ALL_POLICIES (once again from a completely unrelated test case):

SELECT
  *
FROM
  ALL_POLICIES;

OBJECT_OWNER                   OBJECT_NAME                    POLICY_GROUP                  POLICY_NAME                    PF_OWNER                       PACKAGE                       FUNCTION                       SEL INS UPD DEL IDX CHK ENA STA POLICY_TYPE              LON
------------------------------ ------------------------------ ----------------------------- ------------------------------ ------------------------------ ----------------------------- ------------------------------ --- --- --- --- --- --- --- --- ------------------------ ---
TESTUSER                       T12                            SYS_DEFAULT                   T_SEC                          TESTUSER                       S                                                            YES YES YES YES NO  NO  YES NO  DYNAMIC                  NO 

There are known performance problems related to the use of VPD, some of which are Oracle Database version dependent, and some of which have been corrected in recent versions.  Maybe a quick check of one of the following articles would help, if the OP finds that VPD is in fact in use (the second article provides step by step directions for investigation):

  • Metalink (MOS) Doc ID 728292.1 “Known Performance Issues When Using TDE and Indexes on the Encrypted Columns”
  • Metalink (MOS) Doc ID 967042.1 “How to Investigate Query Performance Regressions Caused by VPD (FGAC) Predicates?”

Take a look at the OTN thread.  Any other suggestions for the OP?





Select For Update – In What Order are the Rows Locked?

November 21, 2011

A recent thread in the comp.databases.oracle.server usenet group asked whether or not a SELECT FOR UPDATE statement locks rows in the order specified by the ORDER BY clause.  Why might this be an important question?  Possibly if the SKIP LOCKED clause is implemented in the SELECT FOR UPDATE statement?  Possibly if a procedure is hanging, and it is important to know at what point an enqueue happened?  Any other reasons?

Without testing, I would have suggested that the rows are locked as the rows are read from the table blocks, and not after the ORDER BY clause alters the order of the rows.  Why?  Oracle Database is fundamentally lazy, or put another way, fundamentally efficient – in general it does not perform unnecessary work.  Locking the rows after applying the ORDER BY clause would require a second visit to the table blocks (possibly having to visit each block multiple times to lock different rows in the same block) in the order specified by the ORDER BY clause.  Such an approach could be incredibly inefficient and also error prone (what happens if a row was locked by a second session while the first session was sorting the rows per the ORDER BY clause?).

We could guess, but why guess when we are able to easily test the theory?  Let’s create a simple table with 10,000 rows:

DROP TABLE T1 PURGE;

CREATE TABLE T1 AS
SELECT
  ROWNUM C1,
  RPAD('A',255,'A') C2
FROM
  DUAL
CONNECT BY
  LEVEL<=10000;

COMMIT; 

Now we need 2 sessions (I will call them Session 1 and Session 2).  We will execute the same SELECT FOR UPDATE statement in both sessions, with Session 1 first selecting the table rows in ascending order and then Session 2 selecting the table rows in descending order.  If the ORDER BY clause determines the order in which the rows are locked, the row with a C1 value of 10,000 should be identified as the row that caused the enqueue because that is the first row that should be returned to Session 2.

In Session 1:

SELECT
  C1,
  C2
FROM
  T1
WHERE
  MOD(C1,100)=0
ORDER BY
  C1
FOR UPDATE; 

In Session 2:

SELECT
  C1,
  C2
FROM
  T1
WHERE
  MOD(C1,100)=0
ORDER BY
  C1 DESC
FOR UPDATE; 

(Session 2 is hung)

Let’s try to identify the row that Session 2 is waiting to lock.  There are a couple of ways to accomplish this task (on Oracle Database 10.1 and higher the join to the V$SESSION_WAIT view is unnecessary because that information is found in V$SESSION).
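
As an illustration of that parenthetical note, on Oracle Database 10.1 and higher a query along the following lines should return the same information without the join to V$SESSION_WAIT.  This is only a sketch; the join to DBA_OBJECTS is my addition (it requires access to that view), included to report the DATA_OBJECT_ID next to the OBJECT_ID:

```sql
-- Sessions waiting on a row lock, using only V$SESSION (10.1 and higher).
-- DBA_OBJECTS supplies the DATA_OBJECT_ID of the object identified by ROW_WAIT_OBJ#.
SELECT
  S.SID,
  S.EVENT,
  S.ROW_WAIT_OBJ# OBJ#,
  O.DATA_OBJECT_ID,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#
FROM
  V$SESSION S,
  DBA_OBJECTS O
WHERE
  S.USERNAME IS NOT NULL
  AND S.EVENT='enq: TX - row lock contention'
  AND S.ROW_WAIT_OBJ#=O.OBJECT_ID;
```

The remainder of this article uses the form of the query that includes the join to V$SESSION_WAIT.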

In Session 1:

COLUMN EVENT FORMAT A20
COLUMN OBJ# FORMAT 999999
COLUMN BLOCK# FORMAT 999999
COLUMN ROW# FORMAT 99

SELECT
  SW.EVENT,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

EVENT                   OBJ#      FILE#  BLOCK# ROW#
-------------------- ------- ---------- ------- ----
enq: TX - row lock c   71913          4    4262   18
ontention 

We have the object ID (71913 – technically we need the DATA_OBJECT_ID, not the OBJECT_ID for the DBMS_ROWID.ROWID_CREATE call, but the two values will be identical in this test case), absolute file number (4), block (4262), and row (18) in the block that caused the enqueue.  Let’s select that row from the table:

SELECT
  C1
FROM
  T1
WHERE
  ROWID=DBMS_ROWID.ROWID_CREATE(1, 71913, 4, 4262, 18);

 C1
---
100 

The row with a C1 value of 100 is the first row that was returned by Session 1 (it will be the last row requested to be returned by Session 2), but it is also the first row that matched the WHERE clause predicates for Session 2’s SQL statement as the blocks were read (you could confirm the order in which the blocks are read by flushing the buffer cache before the select, setting the DB_FILE_MULTIBLOCK_READ_COUNT parameter at the session level to 1, enabling a 10046 trace, and then executing the SELECT FOR UPDATE statement).
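
The confirmation steps listed in the parentheses above might be scripted roughly as follows.  This is a sketch of a separate experiment, not something that was executed as part of this test case: it assumes that no other session holds locks on the rows (otherwise the session will hang as before), the TRACEFILE_IDENTIFIER value is an arbitrary label, and flushing the buffer cache should only be attempted on a test system:

```sql
-- Force physical, single-block reads so that the trace file shows the block read order.
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SESSION SET DB_FILE_MULTIBLOCK_READ_COUNT=1;

-- Enable a 10046 trace that includes wait events (level 8).
ALTER SESSION SET TRACEFILE_IDENTIFIER='FOR_UPDATE_ORDER';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';

SELECT
  C1,
  C2
FROM
  T1
WHERE
  MOD(C1,100)=0
ORDER BY
  C1 DESC
FOR UPDATE;

-- Disable the trace and release the row locks.
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT OFF';
ROLLBACK;
```

The physical read wait events in the resulting trace file should then list the file and block numbers in the order in which the blocks were read.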

Let’s retrieve the execution plan for Session 2 to determine why the original poster (OP) might have thought that the rows were locked after the ORDER BY is applied (note that the SQL_ID and SQL_CHILD_NUMBER columns only exist in V$SESSION as of Oracle Database 10.1, so technically the join to V$SESSION_WAIT is unnecessary; however, for consistency with the previous SQL statement that determined the locked row, the join to the V$SESSION_WAIT view is included):

SELECT
  S.SQL_ID,
  S.SQL_CHILD_NUMBER
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

SQL_ID        SQL_CHILD_NUMBER
------------- ----------------
2dnpymtj0rc1r                0 

Now retrieving the execution plan:

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR('2dnpymtj0rc1r',0,'TYPICAL'));

SQL_ID  2dnpymtj0rc1r, child number 0
-------------------------------------
SELECT   C1,   C2 FROM   T1 WHERE   MOD(C1,100)=0 ORDER BY   C1 DESC
FOR UPDATE

Plan hash value: 3991553210

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |    33 (100)|          |
|   1 |  FOR UPDATE         |      |       |       |            |          |
|   2 |   SORT ORDER BY     |      |   123 | 17466 |    33   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   123 | 17466 |    32   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(MOD("C1",100)=0)

Note
-----
   - dynamic sampling used for this statement (level=2) 

As mentioned by the OP, the execution plan appears to be slightly misleading – unless of course you remember that locking the rows after sorting the rows based on the ORDER BY clause would require revisiting the rows in the table blocks.  The situation could be different in this test case if there were an index on column C1.  In such a case the index could be read in descending order, thus making it appear that Oracle attempted to lock the rows in the order specified by the ORDER BY clause.

Let’s slightly adapt the original test case to test the second theory, that it could appear that the rows are locked in the order specified by the ORDER BY clause.

In Session 1:

ROLLBACK;

In Session 2:

ROLLBACK;

Now we will create an index on column C1.

In Session 1:

CREATE INDEX IND_T1_C1 ON T1(C1);

Now the revised test begins.

In Session 1:

SELECT /*+ INDEX(T1) */
  C1,
  C2
FROM
  T1
WHERE
  C1<=100
ORDER BY
  C1
FOR UPDATE;

In Session 2:

SELECT /*+ INDEX(T1) */
  C1,
  C2
FROM
  T1
WHERE
  C1<=100
ORDER BY
  C1 DESC
FOR UPDATE;

(Session 2 is hung)

Let’s try to identify the row that Session 2 is waiting to lock.

In Session 1:

COLUMN EVENT FORMAT A20
COLUMN OBJ# FORMAT 999999
COLUMN BLOCK# FORMAT 999999
COLUMN ROW# FORMAT 99

SELECT
  SW.EVENT,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

EVENT                   OBJ#      FILE#  BLOCK# ROW#
-------------------- ------- ---------- ------- ----
enq: TX - row lock c   71913          4    4259    0
ontention 

Notice in the above output, now block number 4259 is identified, while in the earlier test script block number 4262 was identified by the above SQL statement.  We have the object ID (we actually need the DATA_OBJECT_ID, but the values will be the same in this test case), the absolute file number, the block number, and the row number in the block (0 in this case).  Let’s select that row from the table:

SELECT
  C1
FROM
  T1
WHERE
  ROWID=DBMS_ROWID.ROWID_CREATE(1, 71913, 4, 4259, 0);

 C1
---
 1 

The row with a C1 value of 1 is the first row that was returned by Session 1 (it will be the last row requested to be returned by Session 2), but it is also the first row that matched the WHERE clause predicates for Session 2’s SQL statement as the blocks were read.  This is the same result as we saw before – maybe it does not matter whether an index access path is used to avoid the sort operation that would be otherwise required to satisfy the ORDER BY clause – is the second theory false?  Before making that determination, let’s take a look at the execution plan for Session 2’s SQL statement:

SELECT
  S.SQL_ID,
  S.SQL_CHILD_NUMBER
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

SQL_ID        SQL_CHILD_NUMBER
------------- ----------------
3yz7pu7rw5cw0                0

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR('3yz7pu7rw5cw0',0,'TYPICAL'));

Plan hash value: 3432103074

-------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |       |       |     7 (100)|          |
|   1 |  FOR UPDATE                   |           |       |       |            |          |
|   2 |   SORT ORDER BY               |           |   100 | 14200 |     7  (15)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1        |   100 | 14200 |     6   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IND_T1_C1 |   100 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("C1"<=100)

Note
-----
   - dynamic sampling used for this statement (level=2)

The index that we created was definitely used, but notice that there is still a SORT ORDER BY operation in the execution plan.  The rows in the index were read in ascending order, not descending order!  Let’s try again using an INDEX_DESC hint in the SQL statement for Session 2.

In Session 1:

ROLLBACK;

In Session 2:

ROLLBACK;

In Session 1:

SELECT /*+ INDEX(T1) */
  C1,
  C2
FROM
  T1
WHERE
  C1<=100
ORDER BY
  C1
FOR UPDATE;

In Session 2:

SELECT /*+ INDEX_DESC(T1) */
  C1,
  C2
FROM
  T1
WHERE
  C1<=100
ORDER BY
  C1 DESC
FOR UPDATE;

(Session 2 is hung)

Let’s determine the row that Session 2 is waiting to lock.

In Session 1:

SELECT
  SW.EVENT,
  S.ROW_WAIT_OBJ# OBJ#,
  S.ROW_WAIT_FILE# FILE#,
  S.ROW_WAIT_BLOCK# BLOCK#,
  S.ROW_WAIT_ROW# ROW#
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

EVENT                   OBJ#      FILE#  BLOCK# ROW#
-------------------- ------- ---------- ------- ----
enq: TX - row lock c   71913          4    4262   18
ontention

SELECT
  C1
FROM
  T1
WHERE
  ROWID=DBMS_ROWID.ROWID_CREATE(1, 71913, 4, 4262, 18);

 C1
---
100

The above output now shows that Session 2 is attempting to lock the first row (with a C1 value of 100) that it intends to return.  Has the execution plan changed?

SELECT
  S.SQL_ID,
  S.SQL_CHILD_NUMBER
FROM
  V$SESSION_WAIT SW,
  V$SESSION S
WHERE
  S.USERNAME IS NOT NULL
  AND SW.SID=S.SID
  AND SW.EVENT NOT LIKE '%SQL*Net%'
  AND SW.EVENT NOT IN ('Streams AQ: waiting for messages in the queue', 'wait for unread message on broadcast channel');

SQL_ID        SQL_CHILD_NUMBER
------------- ----------------
806mtjxk7k1dv                0

SELECT
  *
FROM
  TABLE(DBMS_XPLAN.DISPLAY_CURSOR('806mtjxk7k1dv',0,'TYPICAL'));

Plan hash value: 3814195162

--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |       |       |     6 (100)|          |
|   1 |  FOR UPDATE                    |           |       |       |            |          |
|   2 |   BUFFER SORT                  |           |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID | T1        |   100 | 14200 |     6   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| IND_T1_C1 |   100 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("C1"<=100)

Note
-----
   - dynamic sampling used for this statement (level=2) 

Notice that the above execution plan no longer contains a SORT ORDER BY operation (it was replaced by a BUFFER SORT operation), and that the INDEX RANGE SCAN operation was replaced by an INDEX RANGE SCAN DESCENDING operation.  Because the index range scan is now performed in the same order as specified by the ORDER BY clause, the rows are locked in that same order – that is, after all, the order in which the rows were touched.
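
As a cross-check of the two row lookups in this test case, the ROWID values can also be decomposed in the opposite direction.  The following is only a sketch, using the DBMS_ROWID functions against the T1 test table; the DATA_OBJ# alias is an illustrative name, and the numbers returned will vary from one database to the next:

```sql
-- Show where the rows at each end of the C1<=100 range are physically stored,
-- for comparison with the ROW_WAIT_BLOCK# and ROW_WAIT_ROW# values in V$SESSION
SELECT
  C1,
  DBMS_ROWID.ROWID_OBJECT(ROWID) DATA_OBJ#,
  DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) FILE#,
  DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) BLOCK#,
  DBMS_ROWID.ROWID_ROW_NUMBER(ROWID) ROW#
FROM
  T1
WHERE
  C1 IN (1,100)
ORDER BY
  C1;
```

If the C1 values are unique in T1, the two rows should report the same block and row numbers that were seen while Session 2 was hung (block 4259, row 0 for the C1 value of 1, and block 4262, row 18 for the C1 value of 100).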

Any other ideas for a demonstration of the order in which rows are locked when a SELECT FOR UPDATE is used?





Matching the Expected Output – Analytic RANK, ROW_NUMBER, DENSE_RANK, or Something Different?

November 16, 2011

I noticed an interesting request on the comp.databases.oracle.misc usenet group, and I thought that I would make an attempt at solving the request.  The original poster (OP) stated that he had a table with data similar to the following:

TX   ID   DEPT   LOCATION   LOAD
1    99    A        NY       12
2    99    A        LA       10
3    99    B        LA       05
4    77    B        LA       15
5    77    C        NY       12
6    77    D        LA       11  

He would like to obtain the following output:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11  

The rankings are to be determined as follows:

DEPT_RANK for ID 99 is 1 for A because sum(LOAD) = 22 is the max
LOC_RANK for ID 99 is 1 for LA because sum(LOAD) = 15 is the max

At first glance, that request seems to be reasonably easy to accomplish.  Let’s start by creating a table with the sample data (ideally, the OP should have provided the DDL and DML to create and populate this table):

DROP TABLE T1 PURGE;

CREATE TABLE T1 (
  TX NUMBER,
  ID NUMBER,
  DEPT VARCHAR2(1),
  LOCATION VARCHAR2(2),
  LOAD NUMBER);

INSERT INTO T1 VALUES (1,99,'A','NY',12);
INSERT INTO T1 VALUES (2,99,'A','LA',10);
INSERT INTO T1 VALUES (3,99,'B','LA',05);
INSERT INTO T1 VALUES (4,77,'B','LA',15);
INSERT INTO T1 VALUES (5,77,'C','NY',12);
INSERT INTO T1 VALUES (6,77,'D','LA',11); 

The first step, if we were to think about creating the solution in logical steps, is to find a way to calculate the SUM values that were mentioned by the OP.  So, as a starting point, we might try this:

SELECT
  TX,
  ID,
  DEPT,
  SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
  LOCATION,
  SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
  LOAD
FROM
  T1
ORDER BY
  TX;

 TX  ID D SUM_LOAD_ID LO SUM_LOAD_LOCATION  LOAD
--- --- - ----------- -- ----------------- -----
  1  99 A          22 NY                12    12
  2  99 A          22 LA                15    10
  3  99 B           5 LA                15     5
  4  77 B          15 LA                26    15
  5  77 C          12 NY                12    12
  6  77 D          11 LA                26    11 

If I am understanding the OP’s request correctly, the above is a good starting point (even though the alias for the first analytic function could have been better selected; SUM_LOAD_DEPT would describe the value more accurately than SUM_LOAD_ID).

We are then able to take the above SQL statement and push it into an inline view to hopefully produce the output that is expected by the OP (note that the PARTITION BY clause used to calculate the LOC_RANK column differs from the PARTITION BY clause of the inline view’s analytic function that supplies the SUM_LOAD_LOCATION value):

SELECT
  TX,
  ID,
  DEPT,
  RANK() OVER (PARTITION BY ID,DEPT ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  RANK() OVER (PARTITION BY ID,DEPT ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          2    12
  2  99 A          1 LA          1    10
  3  99 B          1 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          1 NY          1    12
  6  77 D          1 LA          1    11 

Let’s compare the above output with what the OP requested:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11

Close for the LOC_RANK column (only the row with a TX value of 5 differs), but not even close for the DEPT_RANK column.  The problem is that when calculating the rank columns in the above output, we should PARTITION only on the ID column, not on the ID column and some other column, as was the case when we used the SUM analytic function.

Let’s fix the PARTITION clause and try again:

SELECT
  TX,
  ID,
  DEPT,
  RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          3    12
  2  99 A          1 LA          1    10
  3  99 B          3 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          2 NY          3    12
  6  77 D          3 LA          1    11 

Let’s compare the above output with what the OP requested:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11

Notice in the above that when two rows have the same SUM_LOAD_ value, the displayed rank is correct, but that repeated rank value then causes a rank value to be skipped (compare the DEPT_RANK column value on row 3, the LOC_RANK column value on row 1, and the LOC_RANK column value on row 5).

Now what?  The ROW_NUMBER function could be used to produce sequential rank numbers without gaps, for example:

SELECT
  TX,
  ID,
  DEPT,
  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          3    12
  2  99 A          2 LA          2    10
  3  99 B          3 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          2 NY          3    12
  6  77 D          3 LA          2    11 

The above output does not match the output requested by the OP, since the OP’s requested output specifies that rows with equal values should show the same rank value.

One more try using the DENSE_RANK analytic function:

SELECT
  TX,
  ID,
  DEPT,
  DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
  LOCATION,
  DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC) LOC_RANK,
  LOAD
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
    LOCATION,
    SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
    LOAD
  FROM
    T1)
ORDER BY
  TX;

 TX  ID D  DEPT_RANK LO   LOC_RANK  LOAD
--- --- - ---------- -- ---------- -----
  1  99 A          1 NY          2    12
  2  99 A          1 LA          1    10
  3  99 B          2 LA          1     5
  4  77 B          1 LA          1    15
  5  77 C          2 NY          2    12
  6  77 D          3 LA          1    11 

Let’s compare the above output with what the OP requested:

TX   ID   DEPT   DEPT_RANK   LOCATION   LOC_RANK   LOAD
1    99    A         1          NY         2        12
2    99    A         1          LA         1        10
3    99    B         2          LA         1        05
4    77    B         1          LA         1        15
5    77    C         2          NY         2        12
6    77    D         3          LA         1        11

As best as I am able to determine, the above SQL statement will satisfy the OP’s request.
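
To see the three analytic functions side by side, the following sketch ranks the departments with all three functions in a single pass.  The SUM_LOAD_DEPT, RNK, DENSE_RNK, and ROW_NUM aliases are illustrative names only, and TX was added to the ROW_NUMBER ordering simply to make the tie-breaking repeatable:

```sql
SELECT
  TX,
  ID,
  DEPT,
  SUM_LOAD_DEPT,
  -- RANK: tied rows share a value, and the following value is skipped
  RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_DEPT DESC) RNK,
  -- DENSE_RANK: tied rows share a value, with no gap afterward
  DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_DEPT DESC) DENSE_RNK,
  -- ROW_NUMBER: every row receives a distinct value, even when tied
  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SUM_LOAD_DEPT DESC, TX) ROW_NUM
FROM
  (SELECT
    TX,
    ID,
    DEPT,
    SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_DEPT
  FROM
    T1)
ORDER BY
  TX;
```

With the six sample rows, the rows with TX values of 1 and 2 (both with a department total of 22) should show 1 and 1 for RNK and DENSE_RNK, but 1 and 2 for ROW_NUM.  The row with a TX value of 3 should show 3 for RNK, 2 for DENSE_RNK, and 3 for ROW_NUM.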

-----

Part 2 of the Challenge

If the OP has the following SQL statement:

SELECT
  LOAD_YEAR,
  ORG_UNIT_CODE,
  RANK () OVER (PARTITION BY LOAD_YEAR, CLASSIF_CODE ORDER BY SUM (FTE_DAYS) DESC) ORG_RANK,
  CLASSIF_CODE,
  RANK () OVER (PARTITION BY LOAD_YEAR, ORG_UNIT_CODE ORDER BY SUM (FTE_DAYS) DESC) CLASSIF_RANK,
  SUM (FTE_DAYS) FTE
FROM
  STAFF_LOAD
GROUP BY
  LOAD_YEAR,
  ORG_UNIT_CODE,
  CLASSIF_CODE;  

And the above SQL statement produces the following output:

YEAR ORG_UNIT_CODE ORG_RANK CLASSIF_CODE CLASSIF_RANK FTE
2010 A46                  1 HEW3                    1  59
2010 A42                  2 HEW3                    1  13
2010 A42                  1 HEW4                    1  13
2010 A46                  2 HEW4                    2  12

And the OP wants the output to look like this:

YEAR ORG_UNIT_CODE ORG_RANK CLASSIF_CODE CLASSIF_RANK FTE
2010 A46                  1 HEW3                    1  59
2010 A42                  2 HEW3                    1  13
2010 A42                  2 HEW4                    2  13
2010 A46                  1 HEW4                    2  12  

Write the DDL and DML statements to create the source table and populate it with the non-aggregated original data, and then produce the output requested by the OP (without looking at the updated usenet thread).







