March 18, 2010
(Forward to the Next Post in the Series)
This example is based on a demonstration that I gave during a presentation last year. I did not go into great detail how the code worked, but I demonstrated that a carefully constructed Cartesian join is helpful and efficient for solutions to certain types of problems. Assume that you have a table named APPLICATION_LIST that lists all of the modules belonging to an application, another table named USER_LIST that lists each Oracle username that has access to the application, and a third table named USER_PROGRAM_PERMISSION that lists each username that is denied access to one of the application modules. The table construction may seem a little odd, but this is based on an actual example found in a commercial product. The goal is to produce a cross-tab style report that shows all users’ permissions to all of the application modules, and have that cross-tab report appear in Excel. The table definitions for our test tables look like this:
CREATE TABLE APPLICATION_LIST( PROGRAM_ID VARCHAR2(30), MENU_STRING VARCHAR2(30), PRIMARY KEY (PROGRAM_ID)); CREATE TABLE USER_LIST( NAME VARCHAR2(30), TYPE NUMBER, PRIMARY KEY(NAME)); CREATE TABLE USER_PROGRAM_PERMISSION( USER_ID VARCHAR2(30), PROGRAM_ID VARCHAR2(30), PERMISSION CHAR(1), PROGRAM_COMPONENT VARCHAR(20), PRIMARY KEY(USER_ID,PROGRAM_ID));
We will populate the test tables with the following script:
INSERT INTO APPLICATION_LIST SELECT DBMS_RANDOM.STRING('Z',10) PROGRAM_ID, DBMS_RANDOM.STRING('A',20) MENU_STRING FROM DUAL CONNECT BY LEVEL<=100; INSERT INTO USER_LIST SELECT 'USER'||TO_CHAR(ROWNUM) USER_ID, 1 TYPE FROM DUAL CONNECT BY LEVEL<=300; INSERT INTO USER_PROGRAM_PERMISSION SELECT USER_ID, PROGRAM_COMPONENT, PERMISSION, 'PROGRAM' FROM (SELECT UL.NAME USER_ID, AL.PROGRAM_ID PROGRAM_COMPONENT, 'N' PERMISSION FROM USER_LIST UL, APPLICATION_LIST AL ORDER BY DBMS_RANDOM.VALUE) WHERE ROWNUM<=27000; COMMIT; EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'APPLICATION_LIST',CASCADE=>TRUE) EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'USER_LIST',CASCADE=>TRUE) EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'USER_PROGRAM_PERMISSION',CASCADE=>TRUE)
The first VBS script will not use a Cartesian Merge join – instead it will retrieve a list of all users and all application modules, and then probe the USER_PROGRAM_PERMISSION table once for each USER_ID. (IntentionalCartesian1-NoCartesian.VBS – save as IntentionalCartesian1-NoCartesian.VBS)
Const adVarChar = 200 Const adCmdText = 1 Const adCmdStoredProc = 4 Const adParamInput = 1 Dim i Dim j Dim strSQL Dim strLastColumn Dim strProgramName Dim strUsername Dim strPassword Dim strDatabase Dim strPermission Dim snpData Dim comData Dim dbDatabase Dim objExcel Set dbDatabase = CreateObject("ADODB.Connection") Set snpData = CreateObject("ADODB.Recordset") Set comData = CreateObject("ADODB.Command") On Error Resume Next strUsername = "MyUsername" strPassword = "MyPassword" strDatabase = "MyDB" dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";" dbDatabase.Open 'Should verify that the connection attempt was successful, but I will leave that for someone else to code Set snpData = CreateObject("adodb.recordset") 'Create an Excel connection Set objExcel = CreateObject("Excel.Application") With objExcel .Workbooks.Add .ActiveWorkbook.Sheets.Add .ActiveSheet.Name = "Application Permissions" 'Remove the three default worksheets For i = 1 To .ActiveWorkbook.Sheets.Count If .ActiveWorkbook.Sheets(i).Name = "Sheet1" Then .Sheets("Sheet1").Select .ActiveWindow.SelectedSheets.Delete End If If .ActiveWorkbook.Sheets(i).Name = "Sheet2" Then .Sheets("Sheet2").Select .ActiveWindow.SelectedSheets.Delete End If If .ActiveWorkbook.Sheets(i).Name = "Sheet3" Then .Sheets("Sheet3").Select .ActiveWindow.SelectedSheets.Delete End If Next .Visible = True End With strSQL = "SELECT" & vbCrLf strSQL = strSQL & " PROGRAM_ID," & vbCrLf strSQL = strSQL & " MENU_STRING" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " APPLICATION_LIST" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " PROGRAM_ID NOT IN ('.SEPARATOR')" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " MENU_STRING" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then strProgramName = snpData.GetRows(30000) snpData.Close End If 'Set the number of elements in the strPermission array to match the number of application module names ReDim strPermission(UBound(strProgramName, 2)) 'Copy the module names into Excel For j = 0 To UBound(strPermission) strPermission(j) = strProgramName(1, j) ' & " - " & strProgramName(0, j) Next With objExcel .Application.ScreenUpdating = False .ActiveSheet.Range(.ActiveSheet.Cells(1, 2), .ActiveSheet.Cells(1, 1 + UBound(strProgramName, 2))) = strPermission End With 'Retrieve the list of users strSQL = "SELECT" & vbCrLf strSQL = strSQL & " NAME" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " USER_LIST" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " TYPE=1" & vbCrLf strSQL = strSQL & " AND NAME NOT LIKE '%#'" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " NAME" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then strUsername = snpData.GetRows(30000) snpData.Close End If 'Set the SQL statement to use to retrieve permissions, ? is a bind variable placeholder strSQL = "SELECT" & vbCrLf strSQL = strSQL & " PROGRAM_ID," & vbCrLf strSQL = strSQL & " PERMISSION" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " USER_PROGRAM_PERMISSION" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " PROGRAM_COMPONENT='PROGRAM'" & vbCrLf strSQL = strSQL & " AND USER_ID= ?" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " PROGRAM_ID" With comData 'Set up the command properties .CommandText = strSQL .CommandType = adCmdText .CommandTimeout = 30 .ActiveConnection = dbDatabase .Parameters.Append .CreateParameter("user_id", adVarChar, adParamInput, 30, "") End With 'Loop through each user For i = 0 To UBound(strUsername, 2) 'Reset the permissions for the next user For j = 0 To UBound(strPermission) strPermission(j) = "Y" Next comData("user_id") = strUsername(0, i) Set snpData = comData.Execute If snpData.State = 1 Then Do While Not (snpData.EOF) For j = 0 To UBound(strProgramName, 2) If strProgramName(0, j) = snpData("program_id") Then strPermission(j) = snpData("permission") Exit For End If Next snpData.MoveNext Loop snpData.Close End If With objExcel .ActiveSheet.Cells(i + 2, 1) = strUsername(0, i) .ActiveSheet.Range(.ActiveSheet.Cells(i + 2, 2), .ActiveSheet.Cells(i + 2, 1 + UBound(strProgramName, 2))) = strPermission End With Next 'Convert the number of columns into letter notation strLastColumn = Chr(64 + Int((UBound(strProgramName, 2)) / 26)) & Chr(64 + ((UBound(strProgramName, 2)) Mod 26 + 1)) 'Final cleanup With objExcel .ActiveSheet.Range(.ActiveSheet.Cells(1, 2), .ActiveSheet.Cells(1, 1 + UBound(strProgramName, 2))).Orientation = 90 .ActiveSheet.Columns("A:" & strLastColumn).AutoFit .Application.ScreenUpdating = True .ActiveSheet.Range("B2").Select .ActiveWindow.FreezePanes = True End With dbDatabase.Close Set snpData = Nothing Set comData = Nothing Set objExcel = Nothing
If you ignore the fact that the above script redefines the meaning of the strUsername variable, the script works. The problem with the script is that it repeatedly sends queries to the database, and probably should be optimized to remove the repeated queries (the number of repeated communication to the database server could have been much worse). Let’s take a look at version 2 of the script (IntentionalCartesian2-NoCartesian.VBS – save as IntentionalCartesian2-NoCartesian.VBS)
Dim i Dim j Dim strSQL Dim strLastColumn Dim strProgramName Dim strEmployeename Dim strUsername Dim strPassword Dim strDatabase Dim strPermission Dim snpData Dim dbDatabase Dim objExcel Set dbDatabase = CreateObject("ADODB.Connection") Set snpData = CreateObject("ADODB.Recordset") On Error Resume Next strUsername = "MyUsername" strPassword = "MyPassword" strDatabase = "MyDB" dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";" dbDatabase.Open 'Should verify that the connection attempt was successful, but I will leave that for someone else to code Set snpData = CreateObject("adodb.recordset") 'Create an Excel connection Set objExcel = CreateObject("Excel.Application") With objExcel .Workbooks.Add .ActiveWorkbook.Sheets.Add .ActiveSheet.Name = "Visual Permissions" 'Remove the three default worksheets For i = 1 To .ActiveWorkbook.Sheets.Count If .ActiveWorkbook.Sheets(i).Name = "Sheet1" Then .Sheets("Sheet1").Select .ActiveWindow.SelectedSheets.Delete End If If .ActiveWorkbook.Sheets(i).Name = "Sheet2" Then .Sheets("Sheet2").Select .ActiveWindow.SelectedSheets.Delete End If If .ActiveWorkbook.Sheets(i).Name = "Sheet3" Then .Sheets("Sheet3").Select .ActiveWindow.SelectedSheets.Delete End If Next .Visible = True End With strSQL = "SELECT" & vbCrLf strSQL = strSQL & " PROGRAM_ID," & vbCrLf strSQL = strSQL & " MENU_STRING" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " APPLICATION_LIST" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " PROGRAM_ID NOT IN ('.SEPARATOR')" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " MENU_STRING" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then strProgramName = snpData.GetRows(30000) snpData.Close End If 'Set the number of elements in the strPermission array to match the number of application module names ReDim strPermission(UBound(strProgramName, 2)) 'Copy the module names into Excel For j = 0 To UBound(strPermission) strPermission(j) = strProgramName(1, j) ' & " - " & strProgramName(0, j) Next With objExcel .Application.ScreenUpdating = False .ActiveSheet.Range(.ActiveSheet.Cells(1, 2), .ActiveSheet.Cells(1, 1 + UBound(strProgramName, 2))) = strPermission End With 'Retrieve the list of users strSQL = "SELECT" & vbCrLf strSQL = strSQL & " NAME" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " USER_LIST" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " TYPE=1" & vbCrLf strSQL = strSQL & " AND NAME NOT LIKE '%#'" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " NAME" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then strEmployeename = snpData.GetRows(30000) snpData.Close End If 'Set the SQL statement to use to retrieve permissions, ? is a bind variable placeholder strSQL = "SELECT" & vbCrLf strSQL = strSQL & " USER_ID," & vbCrLf strSQL = strSQL & " PROGRAM_ID," & vbCrLf strSQL = strSQL & " PERMISSION" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " USER_PROGRAM_PERMISSION" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " PROGRAM_COMPONENT='PROGRAM'" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " USER_ID," & vbCrLf strSQL = strSQL & " PROGRAM_ID" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then strAllPermission = snpData.GetRows(600000) snpData.Close End If strLastUser = "" intLastPermission = 0 'Loop through all users For i = 0 To UBound(strEmployeename, 2) 'Reset the permissions for the next user For j = 0 To UBound(strPermission) strPermission(j) = "Y" Next For j = intLastPermission To UBound(strAllPermission, 2) If strAllPermission(0, j) = strEmployeename(0, i) Then 'Examine the permissions for this user For k = 0 To UBound(strProgramName, 2) If strProgramName(0, k) = strAllPermission(1, j) Then strPermission(k) = strAllPermission(2, j) Exit For End If Next End If 'Record the loop position so that we do not start at 0 for the next user intLastPermission = j If strAllPermission(0, j) > strEmployeename(0, i) Then 'We have passed the last permission for this user, exit the For loop Exit For End If Next With objExcel .ActiveSheet.Cells(i + 2, 1) = strEmployeename(0, i) .ActiveSheet.Range(.ActiveSheet.Cells(i + 2, 2), .ActiveSheet.Cells(i + 2, 1 + UBound(strProgramName, 2))) = strPermission End With Next 'Convert the number of columns into letter notation strLastColumn = Chr(64 + Int((UBound(strProgramName, 2)) / 26)) & Chr(64 + ((UBound(strProgramName, 2)) Mod 26 + 1)) 'Final cleanup With objExcel .ActiveSheet.Range(.ActiveSheet.Cells(1, 2), .ActiveSheet.Cells(1, 1 + UBound(strProgramName, 2))).Orientation = 90 .ActiveSheet.Columns("A:" & strLastColumn).AutoFit .Application.ScreenUpdating = True .ActiveSheet.Range("B2").Select .ActiveWindow.FreezePanes = True End With dbDatabase.Close Set snpData = Nothing Set dbDatabase = Nothing Set objExcel = Nothing
While the second version of the script is better than the first, we are still sending three SQL statements to the server. We can improve that with a Cartesian join. Let’s take a look at version 3 of the script (IntentionalCartesian3-Cartesian.VBS – save as IntentionalCartesian3-Cartesian.VBS)
Dim i Dim intUserCount Dim intPermissionCount Dim strLastColumn Dim strLastUser Dim strSQL Dim strPermission(500) Dim strModuleName(500) Dim strUsername Dim strPassword Dim strDatabase Dim snpData Dim dbDatabase Dim objExcel Set dbDatabase = CreateObject("ADODB.Connection") Set snpData = CreateObject("ADODB.Recordset") On Error Resume Next strUsername = "MyUsername" strPassword = "MyPassword" strDatabase = "MyDB" dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";" dbDatabase.Open 'Should verify that the connection attempt was successful, but I will leave that for someone else to code Set snpData = CreateObject("adodb.recordset") 'Create an Excel connection Set objExcel = CreateObject("Excel.Application") With objExcel .Workbooks.Add .ActiveWorkbook.Sheets.Add .ActiveSheet.Name = "Application Permissions" 'Remove the three default worksheets For i = 1 To .ActiveWorkbook.Sheets.Count If .ActiveWorkbook.Sheets(i).Name = "Sheet1" Then .Sheets("Sheet1").Select .ActiveWindow.SelectedSheets.Delete End If If .ActiveWorkbook.Sheets(i).Name = "Sheet2" Then .Sheets("Sheet2").Select .ActiveWindow.SelectedSheets.Delete End If If .ActiveWorkbook.Sheets(i).Name = "Sheet3" Then .Sheets("Sheet3").Select .ActiveWindow.SelectedSheets.Delete End If Next .Visible = True End With strSQL = "SELECT" & vbCrLf strSQL = strSQL & " AUP.NAME USERNAME," & vbCrLf strSQL = strSQL & " AUP.MENU_STRING MODULE," & vbCrLf strSQL = strSQL & " NVL(UGA.PERMISSION,AUP.DEFAULT_PERMISSION) PERMISSION" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " (SELECT" & vbCrLf strSQL = strSQL & " UL.NAME NAME," & vbCrLf strSQL = strSQL & " A.PROGRAM_ID," & vbCrLf strSQL = strSQL & " A.MENU_STRING," & vbCrLf strSQL = strSQL & " 'Y' DEFAULT_PERMISSION" & vbCrLf strSQL = strSQL & " FROM" & vbCrLf strSQL = strSQL & " APPLICATION_LIST A," & vbCrLf strSQL = strSQL & " USER_LIST UL" & vbCrLf strSQL = strSQL & " WHERE" & vbCrLf strSQL = strSQL & " A.PROGRAM_ID NOT IN ('.SEPARATOR')" & vbCrLf strSQL = strSQL & " AND UL.NAME NOT LIKE '%#') AUP," & vbCrLf strSQL = strSQL & " (SELECT" & vbCrLf strSQL = strSQL & " USER_ID," & vbCrLf strSQL = strSQL & " PROGRAM_ID," & vbCrLf strSQL = strSQL & " PERMISSION" & vbCrLf strSQL = strSQL & " FROM" & vbCrLf strSQL = strSQL & " USER_PROGRAM_PERMISSION" & vbCrLf strSQL = strSQL & " WHERE" & vbCrLf strSQL = strSQL & " PROGRAM_COMPONENT='PROGRAM') UGA" & vbCrLf strSQL = strSQL & "WHERE" & vbCrLf strSQL = strSQL & " AUP.NAME=UGA.USER_ID(+)" & vbCrLf strSQL = strSQL & " AND AUP.PROGRAM_ID=UGA.PROGRAM_ID(+)" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " AUP.NAME," & vbCrLf strSQL = strSQL & " AUP.MENU_STRING" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then With objExcel .Application.ScreenUpdating = False strLastUser = "" intUserCount = 0 Do While Not snpData.EOF If strLastUser <> snpData("username") Then If strLastUser <> "" Then 'Write out the permissions for the previous user .ActiveSheet.Range(.ActiveSheet.Cells(intUserCount + 1, 1), .ActiveSheet.Cells(intUserCount + 1, 1 + intPermissionCount)) = strPermission End If If intUserCount = 1 Then 'Write out the module names .ActiveSheet.Range(.ActiveSheet.Cells(1, 1), .ActiveSheet.Cells(1, 1 + intPermissionCount)) = strModuleName End If strPermission(0) = snpData("username") intPermissionCount = 0 intUserCount = intUserCount + 1 End If intPermissionCount = intPermissionCount + 1 strPermission(intPermissionCount) = snpData("permission") strLastUser = snpData("username") If intUserCount = 1 Then 'Record the module names strModuleName(intPermissionCount) = snpData("module") End If snpData.MoveNext Loop If strLastUser <> "" Then 'Write out the permissions for the last user .ActiveSheet.Range(.ActiveSheet.Cells(intUserCount + 1, 1), .ActiveSheet.Cells(intUserCount + 1, 1 + intPermissionCount)) = strPermission End If strLastColumn = Chr(64 + Int((intPermissionCount) / 26)) & Chr(64 + ((intPermissionCount) Mod 26 + 1)) .ActiveSheet.Range(.ActiveSheet.Cells(1, 2), .ActiveSheet.Cells(1, 1 + intPermissionCount)).Orientation = 90 .ActiveSheet.Columns("A:" & strLastColumn).AutoFit .Application.ScreenUpdating = True .ActiveWindow.FreezePanes = False .ActiveSheet.Range("B2").Select .ActiveWindow.FreezePanes = True .Application.ScreenUpdating = True End With End If snpData.Close dbDatabase.Close Set snpData = Nothing Set dbDatabase = Nothing Set objExcel = Nothing
Notice in the above that the client-side code is much smaller, and we have collapsed the three SQL statements into a single SQL statement with the help of a Cartesian join between the APPLICATION_LIST and USER_LIST tables. The end result looks like this:
Hmm. So there are some uses for cartesian products!
I kept finding them in queries over the years. They were very slow to finish, produced way too many rows, and strange, inaccurate results.
What was odd was that while it was blantantly obvious to me, my coworkers didn’t see it as a cartesian product. They kept barking up the statistics
and index trees. But actually, those can’t speed them up at all.
So, I made a presentation on cartesian products, and presented it at the Oracle User’s Group. See it here:
http://www.lepinsky.com/cartesian/index.htm
It lists five ways to get CP in queries. If you know of any more, please let me know.
The only two real uses that I came up with for CP were
– inserting more test dummy data quickly using rownum
– stress testing your servers
Hi Roger,
Thanks for stopping by and sharing a link to your presentation.
Yes, there are valid uses for Cartesian products – they are rare. In this example I used one to efficiently find all possible combinations of two row sources.
Other ways Cartesian joins may appear, even if the developer has written the SQL statement correctly:
* The query optimizer using transitive closure to generate additional predicates, and in the process the optimizer automatically removed the join condition between those two tables. Example 1 from the Cost-Based Oracle Fundamentals book Example 2 from that author’s website.
* The optimizer incorrectly calculates the row source cardinality of at least one of the row sources as a very small number, such as 1, and determines that a Cartesian join between two possibly unrelated row sources is safe and efficient. Randolf Geist and I included an example of this behavior in chapter 9 of the Expert Oracle Practices book. In this case, all of the correct join conditions were provided in the query.
I am not sure that I would classify your #3 and #4 items as something that would necessarily cause a Cartesian join. I think that those examples could be considered a legitimate 1 to many type join. This, of course, is just my opinion, and I could be completely wrong.
One more use for Cartesian joins: stress testing the server admins or the database admins. 🙂
[…] Charles Hooper […]
[…] on your own) that leaving out where clauses is a bad thing. Well, leave it to Charles Hooper on his Oracle Notes to prove to us that sometimes, they are actually more efficient for solving certain types of […]