August 5, 2011 (Modified August 7, 2011)
(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)
In part 1 of this series the challenge was to simply reverse the order of digits in the numbers from 1 to 1,000,000 to find that cases where the numbers formed by the reverse ordered digits evenly divided into the original number. In part 2 of this series the challenge required examining all of the numbers between 1000 and 9999, where arranging the digits of the original number into any of 23 valid combinations resulted in a new number that cleanly divided into the original four digit number. There were several different solutions provided to the two challenges, so now it is time to move on to part three of the series.
In part 1 of this blog article series I mentioned playing a game years ago that used letters on the face of dice – the dice were rolled, and then the challenge was to find all words that could be completely spelled using the letters on the top of the dice. I was not very good at the game, so I enlisted the help of a computer. One such dice game is called Boggle, and that game’s name is probably fitting for today’s challenge. Imagine that you played this game and the following letters appeared on the top of the dice:
One of the rules of the game requires that words must be at least 3 letters in length, for example: you say melee eye (I) see elfs file some mail (OK, the word I is too short, but we can have some fun with the words that are found). As you might be able to guess, there are a lot of possible combinations of the 16 letters found on the dice, some of which are valid words. If we just consider the 5 letter, 4 letter, and 3 letter combinations of the dice, there are more than a half million possible combinations (in the following table, multiply the numbers across and add the results for each row) – no wonder I needed the computer’s help with these puzzles.
16  15  14  13  12  = 16! / 11!  
16  15  14  13  = 16! / 12!  
16  15  14  = 16! / 13!  
= 571,200 
To make the full challenge of finding words a little easier, let’s break the challenge into a couple of parts:
Part 1: Consider the 2 x 2 letter arrangement at the left. With the help of Oracle Database, list all of the three letter combinations of those four letters. There will be 4 * 3 * 2 = 24 possible combinations of the letters.
Part 2: Consider the 4 x 4 letter arrangement at the left. With the help of Oracle Database, list all of the four letter combinations of those 16 letters. There will be 16 * 15 * 14 * 13 = 43,680 possible combinations of the letters.
–
–
–
–
–
Part 3: Consider the 4 x 4 letter arrangement above. With the help of Oracle Database, list all of the three, four, five, and six letter combinations of those 16 letters. If you see any seven letter words in the above set of letters, you might as well retrieve those letter combinations also. How many letter combinations do you have in total for part 3?
Part 4: Extra Credit: How many of the letter combinations generated in part 3 above are valid U.S. or U.K. English words? List the words.
Part 5: Extra, Extra Credit: List any words found in the letters at the left that have any connection to Oracle Corporation. Remember that a letter can only be used as many times in a single word as it appears at the left (if you can form a word with three letter A’s that have a connection to Oracle Corp., go for it.).
–
–
–
–
–
–
–
–
–
–
–
–
–
–
Added August 7, 2011:
When I put together this challenge I did not think that it was possible to complete Part 4 Extra Credit using just SQL. I was fairly certain that there were some interesting techniques to retrieve HTML content with the help of PL/SQL, but I had not worked out a solution that utilized that technique. As I write this, Radoslav Golian in the comments section appears to have both a PL/SQL and a SQL solution that uses the dictionary.reference.com website to validate the words (only 6 words to avoid a denial of service type attack on the dictionary.reference.com website). One of the approaches that I considered, but did not develop, is something similar to how Radoslav verified the words, but I would use a VBS script to submit the request and check the result as is demonstrated in these two articles: Submit Input to an ASP Web Page and Retrieve the Result using VBS and Use VBS to Search for Oracle Books using Google’s Book Library.
The solution that I put together for Part 4 Extra Credit started with an Excel macro that I posted in another blog article, which was then converted to PL/SQL. I then transformed the PL/SQL for use in this article, and generated a new Excel macro from the PL/SQL code. The Excel macro (along with the calling code looks like this:
Sub StartBoggle() Call Boggle("ESOIMEFOALEUSAYE", 6, 3) End Sub Sub Boggle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer) Dim i As Integer Dim strCharacter(20) As String Dim intCharacterIndex(20) As Integer Dim intCharacters As Integer Dim intCharactersMax As Integer Dim intCharactersMin As Integer Dim intNumberOfSuppliedCharacters As Integer Dim intAdjustmentPosition As Integer Dim intFlag As Integer Dim strOutput As String Dim strWords(10000) As String Dim intWordCount As Integer Dim intFilenum As Integer intFilenum = FreeFile Open "C:\Words " & strCharacters & ".txt" For Output As #intFilenum If intMaxWordLength = 0 Then intCharactersMax = Len(strCharacters) Else If intMaxWordLength <= Len(strCharacters) Then intCharactersMax = intMaxWordLength Else intCharactersMax = Len(strCharacters) End If End If If intMinWordLength = 0 Then intCharactersMin = 3 Else If intMaxWordLength < intMinWordLength Then intCharactersMin = intCharactersMax Else intCharactersMin = intMinWordLength End If End If intNumberOfSuppliedCharacters = Len(strCharacters) For i = 1 To intNumberOfSuppliedCharacters strCharacter(i) = Mid(strCharacters, i, 1) Next i intCharacters = intCharactersMin  1 intWordCount = 0 Do While intCharacters < intCharactersMax intCharacters = intCharacters + 1 intAdjustmentPosition = 1 For i = 1 To intCharacters intCharacterIndex(i) = i Next i Do While intAdjustmentPosition > 0 intFlag = 0 For i = 1 To intAdjustmentPosition  1 If intCharacterIndex(i) = intCharacterIndex(intAdjustmentPosition) Then ' Found a duplicate index position in the other values to the left intFlag = 1 Exit For End If Next i If intFlag = 1 Then ' Try the next index position in this element intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1 Else If intAdjustmentPosition = intCharacters Then ' Output strOutput = "" For i = 1 To intCharacters strOutput = strOutput & strCharacter(intCharacterIndex(i)) Next i intFlag = 0 For i = intWordCount To 1 Step 1 If strOutput = strWords(i) Then intFlag = 1 Exit For End If Next i If intFlag = 0 Then If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then intWordCount = intWordCount + 1 strWords(intWordCount) = strOutput Print #intFilenum, strOutput Debug.Print strOutput End If End If If intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters Then ' No more available values in the last position intCharacterIndex(intAdjustmentPosition) = 1 intAdjustmentPosition = intAdjustmentPosition  1 If intAdjustmentPosition > 0 Then intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1 End If Else intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1 End If Else ' No duplicate so prepare to check the next position intAdjustmentPosition = intAdjustmentPosition + 1 End If End If Do While (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters) ' Roll back one index position as many times as necessary intCharacterIndex(intAdjustmentPosition) = 1 intAdjustmentPosition = intAdjustmentPosition  1 If intAdjustmentPosition > 0 Then intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1 End If Loop ' (intAdjustmentPosition > 0) And Loop 'intAdjustmentPosition > 0 Loop 'intCharacters < intCharactersMax Close #intFilenum End Sub
The Excel macro builds letter combinations that are between the minimum and maximum length, and then tests those letter combinations using the builtin dictionary that is in Excel. I had a little bit of difficulty coming up with a way to generate the letter combinations of variable length, so I settled on a custom developed technique – I would simply keep track of the original character positions, manipulate those original character positions, and then output the corresponding characters. The challenge is then how does one verify that the same character position is not used more than once in a single word?
The method that I came up with is as follows, which assumes that we are trying to build four letter words from the supplied 16 letters. We can start with the seed combination 1,2,3,4. The idea is to work from left to right, and then back to the left. Every time to make it to the right, we output a word, when we make it all the way back to the left (just before the number 1 in the above), we are done. The rules are simple:
 Increment the number in a position, and if that number does not appear in a position to the left, move one position to the right.
 When the maximum character number (16 in this example) is exceeded in a position, reset the number to 1, move one position to the left, and increment the value in the new position by 1.
 In the last position the character number should be incremented as many times as necessary to reach the maximum character number – each time a potential new combination will be generated.
But there is a problem with this approach – it does not use Oracle Database!
–
Let’s go back to the PL/SQL function from which I created the Excel function (I have not worked much with pipelined functions – so there may be one or two errors):
CREATE OR REPLACE FUNCTION BOGGLE_VAR_LENGTH(strCHARACTERS IN VARCHAR2, intMaxWordLength IN NUMBER, intMinWordLength IN NUMBER) RETURN SYS.AQ$_MIDARRAY PIPELINED AS TYPE NUMBER_ARRAY IS TABLE OF NUMBER INDEX BY PLS_INTEGER; TYPE CHARACTER_ARRAY IS TABLE OF VARCHAR(1) INDEX BY PLS_INTEGER; strCharacter CHARACTER_ARRAY; intCharacterIndex NUMBER_ARRAY; intCharacters NUMBER; intCharactersMax NUMBER; intCharactersMin NUMBER; intNumberOfSuppliedCharacters NUMBER; intAdjustmentPosition NUMBER; intFlag NUMBER; intI NUMBER; strOutput VARCHAR2(100); BEGIN IF intMaxWordLength IS NULL THEN intCharactersMax := LENGTH(strCHARACTERS); ELSE IF intMaxWordLength <= LENGTH(strCHARACTERS) THEN intCharactersMax := intMaxWordLength; ELSE intCharactersMax := LENGTH(strCHARACTERS); END IF; END IF; IF intMinWordLength IS NULL THEN intCharactersMin := 3; ELSE IF intMaxWordLength < intMinWordLength THEN intCharactersMin := intCharactersMax; ELSE intCharactersMin := intMinWordLength; END IF; END IF; intNumberOfSuppliedCharacters := LENGTH(strCHARACTERS); FOR I IN 1.. intNumberOfSuppliedCharacters LOOP strCharacter(I) := SUBSTR(strCHARACTERS, I, 1); END LOOP; intCharacters := intCharactersMin  1; WHILE intCharacters < intCharactersMax LOOP intCharacters := intCharacters + 1; intAdjustmentPosition := 1; FOR I IN 1 .. intCharacters LOOP intCharacterIndex(I) := I; END LOOP; WHILE intAdjustmentPosition > 0 LOOP intFlag := 0; FOR I IN 1 .. intAdjustmentPosition  1 LOOP IF intCharacterIndex(I) = intCharacterIndex(intAdjustmentPosition) Then  Found a duplicate index position in the other values to the left intFlag := 1; END IF; END LOOP; IF intFlag = 1 Then  Try the next index position in this element intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1; ELSE IF intAdjustmentPosition = intCharacters Then  Output strOutput := ''; FOR i IN 1 .. intCharacters LOOP strOutput := strOutput  strCharacter(intCharacterIndex(i)); END LOOP; PIPE ROW (strOutput); IF intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters THEN  No more available values in the last position intCharacterIndex(intAdjustmentPosition) := 1; intAdjustmentPosition := intAdjustmentPosition  1; IF intAdjustmentPosition > 0 THEN intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1; END IF; ELSE intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1; END IF; ELSE  No duplicate so prepare to check the next position intAdjustmentPosition := intAdjustmentPosition + 1; END IF; END IF; WHILE (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters) LOOP  Roll back one index position as many times as necessary intCharacterIndex(intAdjustmentPosition) := 1; intAdjustmentPosition := intAdjustmentPosition  1; IF intAdjustmentPosition > 0 THEN intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1; END IF; END LOOP; END LOOP; END LOOP; END; /
We are able to call the function from a SQL statement like this:
SELECT * FROM TABLE(BOGGLE_VAR_LENGTH('ESOIMEFOALEUSAYE', 6, 3));
Remember that there are more than a half million character combinations for just the 3, 4, and 5 letter combinations – the above will as for 6,336,960 letter combinations to be generated. But there is a problem with this approach – it does not verify that the letter combinations are actual words!
For fun, let’s see how many possible combinations will result if we allow 3, 4, 5, 6, 7, and 8 letter combinations:
Len  Combinations  
8  16  15  14  13  12  11  10  9  518,918,400  = 16! / 8! 
7  16  15  14  13  12  11  10  57,657,600  = 16! / 9!  
6  16  15  14  13  12  11  5,765,760  = 16! / 10!  
5  16  15  14  13  12  524,160  = 16! / 11!  
4  16  15  14  13  43,680  = 16! / 12!  
3  16  15  14  3,360  = 16! / 13!  
582,912,960  582,912,960 
That is more than a half billion combinations! Warning, significant database server CPU consumption will result when generating all combinations.
Let’s take a look at the final solution that I created for Part 4 Extra, Extra Credit. The solution is an Excel macro that calls the PL/SQL function through a SQL statement:
Sub StartBoggleOracle() Call BoggleOracle("ESOIMEFOALEUSAYE", 8, 3) End Sub Sub BoggleOracle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer) Dim strSQL As String Dim strUsername As String Dim strPassword As String Dim strDatabase As String Dim intFilenum As Integer Dim intCharacters As Integer Dim intCharactersMax As Integer Dim intCharactersMin As Integer Dim strOutput As String Dim dbDatabase As ADODB.Connection Dim snpData As ADODB.Recordset Set dbDatabase = New ADODB.Connection Set snpData = New ADODB.Recordset strUsername = "MyUsername" strPassword = "MyPassword" strDatabase = "MyDatabase" dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";FetchSize=5000;" dbDatabase.Open intFilenum = FreeFile Open "C:\WordsOracle " & strCharacters & ".txt" For Output As #intFilenum If intMaxWordLength = 0 Then intCharactersMax = Len(strCharacters) Else If intMaxWordLength <= Len(strCharacters) Then intCharactersMax = intMaxWordLength Else intCharactersMax = Len(strCharacters) End If End If If intMinWordLength = 0 Then intCharactersMin = 3 Else If intMaxWordLength < intMinWordLength Then intCharactersMin = intCharactersMax Else intCharactersMin = intMinWordLength End If End If strSQL = "SELECT DISTINCT" & vbCrLf strSQL = strSQL & " *" & vbCrLf strSQL = strSQL & "FROM" & vbCrLf strSQL = strSQL & " (SELECT" & vbCrLf strSQL = strSQL & " *" & vbCrLf strSQL = strSQL & " FROM" & vbCrLf strSQL = strSQL & " TABLE(BOGGLE_VAR_LENGTH('" & strCharacters & "', " & Format(intCharactersMax) & ", " & Format(intCharactersMin) & ")))" & vbCrLf strSQL = strSQL & "ORDER BY" & vbCrLf strSQL = strSQL & " 1" snpData.Open strSQL, dbDatabase If snpData.State = 1 Then Do While Not snpData.EOF strOutput = snpData(0) If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then Print #intFilenum, strOutput Debug.Print strOutput End If snpData.MoveNext Loop snpData.Close End If Close #intFilenum dbDatabase.Close Set snpData = Nothing Set dbDatabase = Nothing End Sub
The words found appear to depend on the version of Excel – Excel 2010 seems to find more words than Excel 2007.

The 799 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished. Words Oracle_ESOIMEFOALEUSAYE.txt

The 2,179 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished. Words Oracle_OSERIEFAARLNCAYL.txt
Excel found Ellison in the second word list. For Part 5 Extra, Extra Credit, what other words connected to Oracle Corporation were found?
Recent Comments