The New Order Oracle Coding Challenge 3 – Mind Boggle

5 08 2011

August 5, 2011 (Modified August 7, 2011)

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

In part 1 of this series the challenge was to simply reverse the order of digits in the numbers from 1 to 1,000,000 to find that cases where the numbers formed by the reverse ordered digits evenly divided into the original number.  In part 2 of this series the challenge required examining all of the numbers between 1000 and 9999, where arranging the digits of the original number into any of 23 valid combinations resulted in a new number that cleanly divided into the original four digit number.  There were several different solutions provided to the two challenges, so now it is time to move on to part three of the series.

In part 1 of this blog article series I mentioned playing a game years ago that used letters on the face of dice – the dice were rolled, and then the challenge was to find all words that could be completely spelled using the letters on the top of the dice.  I was not very good at the game, so I enlisted the help of a computer.  One such dice game is called Boggle, and that game’s name is probably fitting for today’s challenge.  Imagine that you played this game and the following letters appeared on the top of the dice:

One of the rules of the game requires that words must be at least 3 letters in length, for example: you say melee eye (I) see elfs file some mail (OK, the word I is too short, but we can have some fun with the words that are found).  As you might be able to guess, there are a lot of possible combinations of the 16 letters found on the dice, some of which are valid words.  If we just consider the 5 letter, 4 letter, and 3 letter combinations of the dice, there are more than a half million possible combinations (in the following table, multiply the numbers across and add the results for each row) – no wonder I needed the computer’s help with these puzzles.

16 15 14 13 12   = 16! / 11!
16 15 14 13     = 16! / 12!
16 15 14       = 16! / 13!
            = 571,200

To make the full challenge of finding words a little easier, let’s break the challenge into a couple of parts:

 Part 1: Consider the 2 x 2 letter arrangement at the left.  With the help of Oracle Database, list all of the three letter combinations of those four letters.  There will be 4 * 3 * 2 = 24 possible combinations of the letters.

 

 

 Part 2: Consider the 4 x 4 letter arrangement at the left.  With the help of Oracle Database, list all of the four letter combinations of those 16 letters.  There will be 16 * 15 * 14 * 13 = 43,680 possible combinations of the letters.

Part 3: Consider the 4 x 4 letter arrangement above.  With the help of Oracle Database, list all of the three, four, five, and six letter combinations of those 16 letters.  If you see any seven letter words in the above set of letters, you might as well retrieve those letter combinations also.  How many letter combinations do you have in total for part 3?

Part 4: Extra Credit: How many of the letter combinations generated in part 3 above are valid U.S. or U.K. English words?  List the words.

Part 5: Extra, Extra Credit: List any words found in the letters at the left that have any connection to Oracle Corporation.  Remember that a letter can only be used as many times in a single word as it appears at the left (if you can form a word with three letter A’s that have a connection to Oracle Corp., go for it.).

Added August 7, 2011:

When I put together this challenge I did not think that it was possible to complete Part 4 Extra Credit using just SQL.  I was fairly certain that there were some interesting techniques to retrieve HTML content with the help of PL/SQL, but I had not worked out a solution that utilized that technique.  As I write this, Radoslav Golian in the comments section appears to have both a PL/SQL and a SQL solution that uses the dictionary.reference.com website to validate the words (only 6 words to avoid a denial of service type attack on the dictionary.reference.com website).  One of the approaches that I considered, but did not develop, is something similar to how Radoslav verified the words, but I would use a VBS script to submit the request and check the result as is demonstrated in these two articles: Submit Input to an ASP Web Page and Retrieve the Result using VBS and Use VBS to Search for Oracle Books using Google’s Book Library.

The solution that I put together for Part 4 Extra Credit started with an Excel macro that I posted in another blog article, which was then converted to PL/SQL.  I then transformed the PL/SQL for use in this article, and generated a new Excel macro from the PL/SQL code.  The Excel macro (along with the calling code looks like this:

Sub StartBoggle()
    Call Boggle("ESOIMEFOALEUSAYE", 6, 3)
End Sub

Sub Boggle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer)
    Dim i As Integer
    Dim strCharacter(20) As String
    Dim intCharacterIndex(20) As Integer
    Dim intCharacters As Integer
    Dim intCharactersMax As Integer
    Dim intCharactersMin As Integer
    Dim intNumberOfSuppliedCharacters As Integer
    Dim intAdjustmentPosition As Integer
    Dim intFlag As Integer
    Dim strOutput As String
    Dim strWords(10000) As String
    Dim intWordCount As Integer
    Dim intFilenum As Integer

    intFilenum = FreeFile
    Open "C:\Words " & strCharacters & ".txt" For Output As #intFilenum

    If intMaxWordLength = 0 Then
        intCharactersMax = Len(strCharacters)
    Else
        If intMaxWordLength <= Len(strCharacters) Then
            intCharactersMax = intMaxWordLength
        Else
            intCharactersMax = Len(strCharacters)
        End If
    End If

    If intMinWordLength = 0 Then
        intCharactersMin = 3
    Else
        If intMaxWordLength < intMinWordLength Then
            intCharactersMin = intCharactersMax
        Else
            intCharactersMin = intMinWordLength
        End If
    End If

    intNumberOfSuppliedCharacters = Len(strCharacters)

    For i = 1 To intNumberOfSuppliedCharacters
        strCharacter(i) = Mid(strCharacters, i, 1)
    Next i

    intCharacters = intCharactersMin - 1
    intWordCount = 0

    Do While intCharacters < intCharactersMax
        intCharacters = intCharacters + 1
        intAdjustmentPosition = 1
        For i = 1 To intCharacters
            intCharacterIndex(i) = i
        Next i

        Do While intAdjustmentPosition > 0
            intFlag = 0
            For i = 1 To intAdjustmentPosition - 1
                If intCharacterIndex(i) = intCharacterIndex(intAdjustmentPosition) Then
                    ' Found a duplicate index position in the other values to the left
                    intFlag = 1
                    Exit For
                End If
            Next i

            If intFlag = 1 Then
                ' Try the next index position in this element
                intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
            Else
                If intAdjustmentPosition = intCharacters Then
                    ' Output
                    strOutput = ""
                    For i = 1 To intCharacters
                        strOutput = strOutput & strCharacter(intCharacterIndex(i))
                    Next i

                    intFlag = 0
                    For i = intWordCount To 1 Step -1
                        If strOutput = strWords(i) Then
                            intFlag = 1
                            Exit For
                        End If
                    Next i
                    If intFlag = 0 Then
                        If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then
                            intWordCount = intWordCount + 1
                            strWords(intWordCount) = strOutput

                            Print #intFilenum, strOutput
                            Debug.Print strOutput
                        End If
                    End If

                    If intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters Then
                        ' No more available values in the last position
                        intCharacterIndex(intAdjustmentPosition) = 1
                        intAdjustmentPosition = intAdjustmentPosition - 1
                        If intAdjustmentPosition > 0 Then
                            intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
                        End If
                    Else
                        intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
                    End If
                Else
                    ' No duplicate so prepare to check the next position
                    intAdjustmentPosition = intAdjustmentPosition + 1
                End If
            End If

            Do While (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters)
                ' Roll back one index position as many times as necessary
                intCharacterIndex(intAdjustmentPosition) = 1
                intAdjustmentPosition = intAdjustmentPosition - 1
                If intAdjustmentPosition > 0 Then
                    intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
                End If
            Loop ' (intAdjustmentPosition > 0) And
        Loop 'intAdjustmentPosition > 0
    Loop 'intCharacters < intCharactersMax

    Close #intFilenum
End Sub 

The Excel macro builds letter combinations that are between the minimum and maximum length, and then tests those letter combinations using the built-in dictionary that is in Excel.  I had a little bit of difficulty coming up with a way to generate the letter combinations of variable length, so I settled on a custom developed technique – I would simply keep track of the original character positions, manipulate those original character positions, and then output the corresponding characters.  The challenge is then how does one verify that the same character position is not used more than once in a single word? 

 The method that I came up with is as follows, which assumes that we are trying to build four letter words from the supplied 16 letters.  We can start with the seed combination 1,2,3,4.  The idea is to work from left to right, and then back to the left.  Every time to make it to the right, we output a word, when we make it all the way back to the left (just before the number 1 in the above), we are done.  The rules are simple:

  • Increment the number in a position, and if that number does not appear in a position to the left, move one position to the right.
  • When the maximum character number (16 in this example) is exceeded in a position, reset the number to 1, move one position to the left, and increment the value in the new position by 1.
  • In the last position the character number should be incremented as many times as necessary to reach the maximum character number – each time a potential new combination will be generated.

 But there is a problem with this approach – it does not use Oracle Database!

Let’s go back to the PL/SQL function from which I created the Excel function (I have not worked much with pipelined functions – so there may be one or two errors):

CREATE OR REPLACE FUNCTION BOGGLE_VAR_LENGTH(strCHARACTERS IN VARCHAR2, intMaxWordLength IN NUMBER, intMinWordLength IN NUMBER) RETURN SYS.AQ$_MIDARRAY PIPELINED
AS
  TYPE NUMBER_ARRAY IS TABLE OF NUMBER INDEX BY PLS_INTEGER;
  TYPE CHARACTER_ARRAY IS TABLE OF VARCHAR(1) INDEX BY PLS_INTEGER;
  strCharacter CHARACTER_ARRAY;
  intCharacterIndex NUMBER_ARRAY;
  intCharacters NUMBER;
  intCharactersMax NUMBER;
  intCharactersMin NUMBER;
  intNumberOfSuppliedCharacters NUMBER;
  intAdjustmentPosition NUMBER;
  intFlag NUMBER;
  intI NUMBER;
  strOutput VARCHAR2(100);
BEGIN
  IF intMaxWordLength IS NULL THEN
    intCharactersMax := LENGTH(strCHARACTERS);
  ELSE
    IF intMaxWordLength <= LENGTH(strCHARACTERS) THEN
      intCharactersMax := intMaxWordLength;
    ELSE
      intCharactersMax := LENGTH(strCHARACTERS);
    END IF;
  END IF;

  IF intMinWordLength IS NULL THEN
    intCharactersMin := 3;
  ELSE
    IF intMaxWordLength < intMinWordLength THEN
      intCharactersMin := intCharactersMax;
    ELSE
      intCharactersMin := intMinWordLength;
    END IF;
  END IF;

  intNumberOfSuppliedCharacters := LENGTH(strCHARACTERS);

  FOR I IN 1.. intNumberOfSuppliedCharacters LOOP
    strCharacter(I) := SUBSTR(strCHARACTERS, I, 1);
  END LOOP;

  intCharacters := intCharactersMin - 1;
  WHILE intCharacters < intCharactersMax LOOP
    intCharacters := intCharacters + 1;
    intAdjustmentPosition := 1;
    FOR I IN 1 .. intCharacters LOOP
      intCharacterIndex(I) := I;
    END LOOP;

    WHILE intAdjustmentPosition > 0 LOOP
      intFlag := 0;
      FOR I IN 1 .. intAdjustmentPosition - 1 LOOP
        IF intCharacterIndex(I) = intCharacterIndex(intAdjustmentPosition) Then
          -- Found a duplicate index position in the other values to the left
          intFlag := 1;
        END IF;
      END LOOP;
      IF intFlag = 1 Then
        -- Try the next index position in this element
        intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
      ELSE
        IF intAdjustmentPosition = intCharacters Then
          -- Output
          strOutput := '';
          FOR i IN 1 .. intCharacters LOOP
            strOutput := strOutput || strCharacter(intCharacterIndex(i));
          END LOOP;

          PIPE ROW (strOutput);

          IF intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters THEN
            -- No more available values in the last position
            intCharacterIndex(intAdjustmentPosition) := 1;
            intAdjustmentPosition := intAdjustmentPosition - 1;
            IF intAdjustmentPosition > 0 THEN
              intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
            END IF;
          ELSE
            intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
          END IF;
        ELSE
          -- No duplicate so prepare to check the next position
          intAdjustmentPosition := intAdjustmentPosition + 1;
        END IF;
      END IF;

      WHILE (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters) LOOP
        -- Roll back one index position as many times as necessary
        intCharacterIndex(intAdjustmentPosition) := 1;
        intAdjustmentPosition := intAdjustmentPosition - 1;
        IF intAdjustmentPosition > 0 THEN
          intCharacterIndex(intAdjustmentPosition) := intCharacterIndex(intAdjustmentPosition) + 1;
        END IF;
      END LOOP;
    END LOOP;
  END LOOP;
END;
/ 

 We are able to call the function from a SQL statement like this:

SELECT
  *
FROM
  TABLE(BOGGLE_VAR_LENGTH('ESOIMEFOALEUSAYE', 6, 3)); 

Remember that there are more than a half million character combinations for just the 3, 4, and 5 letter combinations – the above will as for 6,336,960 letter combinations to be generated.   But there is a problem with this approach – it does not verify that the letter combinations are actual words!

For fun, let’s see how many possible combinations will result if we allow 3, 4, 5, 6, 7, and 8 letter combinations:

Len                 Combinations  
8 16 15 14 13 12 11 10 9 518,918,400 = 16! / 8!
7 16 15 14 13 12 11 10   57,657,600 = 16! / 9!
6 16 15 14 13 12 11     5,765,760 = 16! / 10!
5 16 15 14 13 12       524,160 = 16! / 11!
4 16 15 14 13         43,680 = 16! / 12!
3 16 15 14           3,360 = 16! / 13!
                  582,912,960 582,912,960

That is more than a half billion combinations!  Warning, significant database server CPU consumption will result when generating all combinations.

Let’s take a look at the final solution that I created for Part 4 Extra, Extra Credit.  The solution is an Excel macro that calls the PL/SQL function through a SQL statement:

Sub StartBoggleOracle()
    Call BoggleOracle("ESOIMEFOALEUSAYE", 8, 3)
End Sub

Sub BoggleOracle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer)
    Dim strSQL As String
    Dim strUsername As String
    Dim strPassword As String
    Dim strDatabase As String
    Dim intFilenum As Integer

    Dim intCharacters As Integer
    Dim intCharactersMax As Integer
    Dim intCharactersMin As Integer
    Dim strOutput As String

    Dim dbDatabase As ADODB.Connection
    Dim snpData As ADODB.Recordset

    Set dbDatabase = New ADODB.Connection
    Set snpData = New ADODB.Recordset

    strUsername = "MyUsername"
    strPassword = "MyPassword"
    strDatabase = "MyDatabase"

    dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";FetchSize=5000;"
    dbDatabase.Open

    intFilenum = FreeFile
    Open "C:\WordsOracle " & strCharacters & ".txt" For Output As #intFilenum

    If intMaxWordLength = 0 Then
        intCharactersMax = Len(strCharacters)
    Else
        If intMaxWordLength <= Len(strCharacters) Then
            intCharactersMax = intMaxWordLength
        Else
            intCharactersMax = Len(strCharacters)
        End If
    End If

    If intMinWordLength = 0 Then
        intCharactersMin = 3
    Else
        If intMaxWordLength < intMinWordLength Then
            intCharactersMin = intCharactersMax
        Else
            intCharactersMin = intMinWordLength
        End If
    End If

    strSQL = "SELECT DISTINCT" & vbCrLf
    strSQL = strSQL & "  *" & vbCrLf
    strSQL = strSQL & "FROM" & vbCrLf
    strSQL = strSQL & "  (SELECT" & vbCrLf
    strSQL = strSQL & "    *" & vbCrLf
    strSQL = strSQL & "  FROM" & vbCrLf
    strSQL = strSQL & "    TABLE(BOGGLE_VAR_LENGTH('" & strCharacters & "', " & Format(intCharactersMax) & ", " & Format(intCharactersMin) & ")))" & vbCrLf
    strSQL = strSQL & "ORDER BY" & vbCrLf
    strSQL = strSQL & "  1"
    snpData.Open strSQL, dbDatabase

    If snpData.State = 1 Then
        Do While Not snpData.EOF
            strOutput = snpData(0)
            If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then
                Print #intFilenum, strOutput
                Debug.Print strOutput
            End If

            snpData.MoveNext
        Loop

        snpData.Close
    End If

    Close #intFilenum
    dbDatabase.Close
    Set snpData = Nothing
    Set dbDatabase = Nothing
End Sub

The words found appear to depend on the version of Excel – Excel 2010 seems to find more words than Excel 2007.

  • The 799 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished.  Words Oracle_ESOIMEFOALEUSAYE.txt
  • The 2,179 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished. Words Oracle_OSERIEFAARLNCAYL.txt

Excel found Ellison in the second word list.  For Part 5 Extra, Extra Credit, what other words connected to Oracle Corporation were found?


Actions

Information

19 responses

5 08 2011
RJ

No need for a computer, I think the word you are looking for is ‘Ellison’

5 08 2011
Charles Hooper

RJ,

You found a seven letter word? You already found Ellison among the 57,657,600 possible 7 letter combinations? At that rate, I think that I have found someone that would easily beat me (and my 1 MHz CPU) at this game. 🙂

5 08 2011
RJ

ah perhaps not the word with 3 A’s … oh well what could be more important than Ellison ?

5 08 2011
Charles Hooper

RJ,

I see what happened – a minor typo in the a sentence of Part 5 Extra, Extra Credit. The sentence originally stated:

List any word found in the letters at the left that have any connection to Oracle Corporation.

I have corrected the sentence to state:

List any words found in the letters at the left that have any connection to Oracle Corporation.

I can understand why you only picked out the 7 letter word – that is worth 5 points by the way. More fun, see if you are able to write a complete sentence with the words that are found. 😉

5 08 2011
Radoslav Golian

backtracking works for me :), I think it should be doable with singe SQL statement..

declare 
 type char_array is table of varchar2(1);
 type bool_array is table of boolean;
 type res_array is table of varchar2(1) index by varchar2(16);
 
 chars constant char_array := char_array('e','s','o','i','m','e','f','o','a','l','e','u','s','a','y','e');
 used constant bool_array := bool_array(false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false);
 g_result res_array;
 l_index varchar2(16);
 
 procedure generate(i_current_word in varchar2,
                    i_length_min in number,
                    i_length_max in number,
                    i_used_chars in bool_array default used) is 
   l_current_word varchar2(16):= i_current_word;
   l_used_chars    bool_array := i_used_chars;
   l_response      varchar2(32767);
 begin 
   if nvl(length(l_current_word), 0) >= i_length_min then     
     -- I don't want to spam dictonary.com site
     -- testing for few words
     if l_current_word in ('eye', 'say', 'elfs', 'melee', 'female') then 
       -- this awful, API should be used (they provides some)
       if instr(utl_http.request(url => 'http://dictionary.reference.com/browse/' || l_current_word),
          '<div id="sph">No results found for <i>'||l_current_word||'</i>:</div>') = 0
       then 
         --dbms_output.put_line(' correct:' || l_current_word);
         g_result(l_current_word) := null; -- to avoid duplicates, mark result
       end if;
     end if;  
     -- current word   
     --dbms_output.put_line(' word:' || l_current_word);         
   end if;
 
   if nvl(length(i_current_word), 0) < i_length_max then   
      for i in 1..chars.count() loop
        if not l_used_chars (i) then         
          l_used_chars (i) := true; -- mark move
          generate(i_current_word || chars(i), i_length_min, i_length_max, l_used_chars); -- try solution
          l_used_chars (i) := false;  -- unmark move
        end if;
      end loop;    

   end if;   
 end;

begin
  generate(null, 3, 6); -- 
  
  -- show result
  l_index := g_result.first();
  while (l_index is not null) loop
    dbms_output.put_line(l_index);
    l_index := g_result.next(l_index);
  end loop;
end;

5 08 2011
Radoslav Golian

l_response is not needed, I forgot to remove it..
only URL is passed to url parameter, tag was added be wordpress formatter.

5 08 2011
Radoslav Golian

I cleaned the code..
I removed some variables and I made boolean array global..

declare 
 type char_array is table of varchar2(1);
 type bool_array is table of boolean;
 type res_array is table of varchar2(1) index by varchar2(16);
 
 chars constant char_array := char_array('e','s','o','i','m','e','f','o','a','l','e','u','s','a','y','e');
 g_used_chars  bool_array := bool_array(false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false);
 g_result res_array;
 l_index varchar2(16);
 
 procedure generate(i_current_word in varchar2,
                    i_length_min in number,
                    i_length_max in number) is 
 begin 
   if nvl(length(i_current_word), 0) >= i_length_min then     
     -- I don't want to spam dictonary.com site
     -- testing for few words
     if i_current_word in ('eye', 'say', 'elfs', 'melee', 'female', 'xxkhs') then 
       -- this awful, API should be used (they provides some)
       if instr(utl_http.request(url => 'http://dictionary.reference.com/browse/' || i_current_word),
          '<div id="sph">No results found for <i>'||i_current_word||'</i>:</div>') = 0
       then 
         --dbms_output.put_line(' correct:' || l_current_word);
         g_result(i_current_word) := null; -- to avoid duplicates, mark result
       end if;
     end if;  
     -- current word   
     --dbms_output.put_line(' word:' || l_current_word);         
   end if;
 
   if nvl(length(i_current_word), 0) < i_length_max then   
      for i in 1..chars.count() loop
        if not g_used_chars (i) then         
          g_used_chars (i) := true; -- mark move
          generate(i_current_word || chars(i), i_length_min, i_length_max); -- try solution
          g_used_chars (i) := false;  -- unmark move
        end if;
      end loop;    

   end if;   
 end;

begin
  generate(null, 3, 6); 
  
  -- show result
  l_index := g_result.first();
  while (l_index is not null) loop
    dbms_output.put_line(l_index);
    l_index := g_result.next(l_index);
  end loop;
end;

5 08 2011
Charles Hooper

Radoslav,

Your script runs with the following output:

elfs
eye
female
melee
say

I have not completely determined how your script works, but it is impressive. Your solution is considerably shorter than the solution that I built – I will share my solution in a couple of days. It is great that you looked up the spelling of the words directly in the PL/SQL block – I did it a bit differently. Very impressive.

I am still terrible at playing this game – I think that I found about 15 words in the first 4 x 4 puzzle, while the computer claims that it found 837 words. A 3.4GHz Core i7 CPU worked for what seemed like 2 hours finding those 837 words. I wonder how many DBAs are currently wondering why their servers are so slow right now. 🙂

7 08 2011
Radoslav Golian

And here is solution with single sql statement (11.2+):

It’s much slower than pl/sql..

BEGIN
  DBMS_NETWORK_ACL_ADMIN.create_acl (
    acl          => 'mind_boogle_acl.xml', 
    description  => 'Mind Boogle ACL',
    principal    => 'RGOLIAN', -- schema name where the statement will run
    is_grant     => TRUE, 
    privilege    => 'connect',
    start_date   => NULL,
    end_date     => NULL);

  COMMIT;
END;

BEGIN
  DBMS_NETWORK_ACL_ADMIN.assign_acl (
    acl         => 'mind_boogle_acl.xml',
    host        => 'dictionary.reference.com', 
    lower_port  => NULL,
    upper_port  => NULL); 

  COMMIT;
END;

with backtrack(not_used_chars, current_word, max_length) as(
  select  -- setup
  'esoimefoaleusaye' not_used_chars, -- set of letters to be used
   cast(null as varchar2(16)) current_word, 
   6 max_length  -- upper word length limit
  from dual
  union all
  select 
    substr(b.not_used_chars, 1, a.lev - 1) || 
    substr(b.not_used_chars, a.lev + 1) not_used_chars, -- remove used letter from not_used_chars
    b.current_word || substr(b.not_used_chars, a.lev, 1) current_word, -- create a word by adding a letter
    max_length -- just forward max_length
  from backtrack b, 
       (select level lev from dual connect by level <= 16) a  -- connect by cannot be used directly in recursive with clause
  where nvl(length(b.not_used_chars), 0) > 0 -- not used chars cannot be exhausted
  and nvl(length(b.current_word), 0) < b.max_length -- check upper word length limit
  and a.lev <= nvl(length(b.not_used_chars), 0) -- we can add only those letters that are in not_used_chars
) 
select distinct current_word from backtrack
where current_word is not null -- filter the first null word from non-recursive branch
and length(current_word) >= 3 -- lower word length limit
-- above is the solution, next lines below is only for testing
-- I'm testing only few word (I don't want spam dictionary.com)
and current_word in ('eye', 'say', 'elfs', 'melee', 'female', 'emiose') -- the last one is not an english word
-- check the word 
and instr(utl_http.request(url => 'http://dictionary.reference.com/browse/' || current_word),
          '<div id="sph">No results found for <i>'||current_word||'</i>:</div>') = 0
7 08 2011
Radoslav Golian

condition:

 "nvl(length(b.not_used_chars), 0) > 0 -- not used chars cannot be exhausted"  

is useless and could be removed…

7 08 2011
Radoslav Golian

“and instr(…)” is part of a solution, of course 🙂

I should double check a post before posting it 🙂

7 08 2011
Charles Hooper

Radoslav,

That is an impressive solution – I did not think that it was possible with just a single SQL statement. I am not sure if it is a sign of a bug on 11.2.0.2 patch 4 on Windows x64, a time-out issue, or something else – my output is:

CURRENT_WORD
----------------
eye
elfs
melee
emiose
say
female

6 rows selected.

I made certain that the INSTR was included in the SQL statement. If I view the source code for the web page I am able to see the text that you are including in the INSTR function.

7 08 2011
Radoslav Golian

It’s not a oracle bug, it’s my bug, but I believe the minor one :). The searched string is not in first 2000 bytes, so it always returns 0..
The pl/sql is also not correct but “xxkhs” can’t be generated from those letters..
The pl/sql could be rewritten using utl_http.request_pieces..

I have to found an API, where response is smaller, I think there are few.. dictionary.com also provides some API, but registration is needed and I’m to lazy to register 🙂

7 08 2011
Radoslav Golian

use this condition for sql

instr(utl_http.request(url => 'http://oxforddictionaries.com/definition/' || current_word), 'HTTP Status 404') = 0

and this one for pl/sql

instr(utl_http.request(url => 'http://oxforddictionaries.com/definition/' || i_current_word), 'HTTP Status 404') = 0

ACL:

BEGIN
  DBMS_NETWORK_ACL_ADMIN.assign_acl (
    acl         => 'mind_boogle_acl.xml',
    host        => 'oxforddictionaries.com',
    lower_port  => NULL,
    upper_port  => NULL); 

  COMMIT;
END;

but this directory didn’t find elfs.. so again it’s not completely correct..

7 08 2011
Charles Hooper

Radoslav,

Well done – the SQL statement now returns:

CURRENT_WORD
------------
eye
melee
say
female

I do not think that “elfs” appears in many English dictionaries – the word is usually spelled “elves”, and indicates more than one elf. In the blog article I was attempting to create a funny sentence from the words found in the puzzle, and I momentarily forgot the correct spelling of elves. The word “elfs” was found in one of the dictionaries that I checked, so I left it in the article for a little humor.

I have updated the article to show my solution.

7 08 2011
Radoslav Golian

Charles,
your solution is definitely more practical than my – I mean the checking dictionary part..
I used utl_http because it was the first solution which came to my mind and it was the fastest to write, but in fact, it’s very inpractical for milions of rows..

In a “real life” I would probably download some free dictionary (aspell, open/libre office), load it to the database (using external table or sqlldr, probably hash cluster table/IOT would be the best structure to load into)
In the pl/sql solution I would query that table – lot of context switching (sql vs pl/sql), but still better and faster than HTTP.
In the sql solution I would join generated words to the loaded table.

Link to a free english (US) directory: http://extensions.services.openoffice.org/en/project/en_US-dict
Click on “Get It!” , save it, add zip extension (It’s a zip archive), en_US.oxt -> en_US.oxt.zip, get file en_US.oxt.zip/en_US.dic (some transformation is needed)

All directories can be found here: http://extensions.services.openoffice.org/en/dictionaries,

8 08 2011
Charles Hooper

Radoslav,

Excellent suggestion regarding downloading the en_US.oxt.zip dictionary file. The following is a Windows VBS script (save with a .vbs extension, and then execute with cscript at a command line) that reads the en_us.dic file, creates a table named US_DICTIONARY, and then inserts the words from the en_us.dic into the US_DICTIONARY table:

Const adCmdText = 1
Const adCmdStoredProc = 4
Const adParamInput = 1
Const adVarNumeric = 139
Const adBigInt = 20
Const adDecimal = 14
Const adDouble = 5
Const adInteger = 3
Const adLongVarBinary = 205
Const adNumeric = 131
Const adSingle = 4
Const adSmallInt = 2
Const adTinyInt = 16
Const adUnsignedBigInt = 21
Const adUnsignedInt = 19
Const adUnsignedSmallInt = 18
Const adUnsignedTinyInt = 17
Const adDate = 7
Const adDBDate = 133
Const adDBTimeStamp = 135
Const adDBTime = 134
Const adVarChar = 200
Const adUseClient = 3
Const adOpenKeyset = 1
Const adLockOptimistic = 3
 
Dim strUsername
Dim strPassword
Dim strDatabase
Dim strLine
Dim objFSO
Dim objFile
Dim strSQL
Dim comData
Dim dbDatabase
 
On Error Resume Next
Set objFSO = CreateObject("Scripting.FileSystemObject") 
Set objFile = objFSO.OpenTextFile("C:\en_US.dic", 1)
Set dbDatabase = CreateObject("ADODB.Connection")
Set comData = CreateObject("ADODB.Command")
 
strUsername = "MyUsername"
strPassword = "MyPassword"
strDatabase = "MyDatabase"
 
dbDatabase.ConnectionString = "Provider=OraOLEDB.Oracle;Data Source=" & strDatabase & ";User ID=" & strUsername & ";Password=" & strPassword & ";FetchSize=5000;"
dbDatabase.Open
 
strSQL = "CREATE TABLE US_DICTIONARY (" & VBCrLf
strSQL = strSQL & "  WORD VARCHAR2(30) PRIMARY KEY)"
dbDatabase.Execute strSQL
 
strSQL = "INSERT INTO" & VBCrLf
strSQL = strSQL & "  US_DICTIONARY" & VBCrLf
strSQL = strSQL & "VALUES" & VBCrLf
strSQL = strSQL & "  ( ? )"
With comData
  'Set up the command properties
  .CommandText = strSQL
  .CommandType = adCmdText
  .CommandTimeout = 30
  .ActiveConnection = dbDatabase
 
  .Parameters.Append .CreateParameter("word", adVarChar, adParamInput, 30, "")
End With
 
dbDatabase.BeginTrans  
Do While Not (objFile.AtEndOfStream)
  strLine = Ucase(cStr(objFile.ReadLine))
  If Instr(strLine, "/") > 0 Then
    strLine = Left(strLine, Instr(strLine, "/") - 1)
  End If
  
  comData("word") = strLine
  comData.Execute
Loop
dbDatabase.CommitTrans
 
objFile.Close
dbDatabase.Close
 
Set objFSO = Nothing
Set objFile = Nothing
Set comData = Nothing
Set dbDatabase = Nothing

With the US_DICTIONARY table created, we can use the table to check words up to 7 characters long with the following SQL statement:

SELECT /*+ LEADING(TW) */
  TW.WORD
FROM
(WITH L AS
  (SELECT
    SUBSTR(L.LETTERS,P.P,1) LETTER,
    P.P
  FROM
    (SELECT
      'ESOIMEFOALEUSAYE' LETTERS
    FROM
      DUAL) L,
    (SELECT
      ROWNUM P
    FROM
      DUAL
    CONNECT BY
      LEVEL<=20) P
  WHERE
    P<=LENGTH(L.LETTERS)
  UNION ALL
  SELECT
    CAST(NULL AS VARCHAR2(1)) LETTER,
    0 P
  FROM
    DUAL)
SELECT DISTINCT
  L1.LETTER || L2.LETTER || L3.LETTER || L4.LETTER || L5.LETTER || L6.LETTER || L7.LETTER WORD
FROM
  L L1,
  L L2,
  L L3,
  L L4,
  L L5,
  L L6,
  L L7
WHERE
  L1.P != L2.P
  AND L1.P != L3.P
  AND L2.P != L3.P
  AND L1.P != 0
  AND L2.P != 0
  AND L3.P != 0
  AND L1.P != L4.P
  AND L2.P != L4.P
  AND L3.P != L4.P
  AND L1.P != L5.P
  AND L2.P != L5.P
  AND L3.P != L5.P
  AND ((L4.P=0 AND L5.P=0)
       OR (L4.P != 0 AND L4.P != L5.P))
  AND L1.P != L6.P
  AND L2.P != L6.P
  AND L3.P != L6.P
  AND ((L4.P=0 AND L6.P=0)
       OR (L4.P != 0 AND L4.P != L6.P))
  AND ((L5.P=0 AND L6.P=0)
       OR (L5.P != 0 AND L5.P != L6.P))
  AND L1.P != L7.P
  AND L2.P != L7.P
  AND L3.P != L7.P
  AND ((L4.P=0 AND L7.P=0)
       OR (L4.P != 0 AND L4.P != L7.P))
  AND ((L5.P=0 AND L7.P=0)
       OR (L5.P != 0 AND L5.P != L7.P))
  AND ((L6.P=0 AND L7.P=0)
       OR (L6.P != 0 AND L6.P != L7.P))) TW,
  US_DICTIONARY UD
WHERE
  TW.WORD=UD.WORD
ORDER BY
  TW.WORD;

The above SQL statement retrieved 560 words.

8 08 2011
Charles Hooper

Here is another way to generate variable length words using the characters ESOIMEFOALEUSAYE. First, we need each character to be on a row by itself with its position in the list of characters identified. We also need a NULL character that will be assigned a position of 0.

SELECT
  SUBSTR(L.LETTERS,P.P,1) LETTER,
  P.P
FROM
  (SELECT
    'ESOIMEFOALEUSAYE' LETTERS
  FROM
    DUAL) L,
  (SELECT
    ROWNUM P
  FROM
    DUAL
  CONNECT BY
    LEVEL<=20) P
WHERE
  P<=LENGTH(L.LETTERS)
UNION ALL
SELECT
  CAST(NULL AS VARCHAR2(1)) LETTER,
  0 P
FROM
  DUAL;
 
L   P
- ---
E   1
S   2
O   3
I   4
M   5
E   6
F   7
O   8
A   9
L  10
E  11
U  12
S  13
A  14
Y  15
E  16
    0

With the above SQL statement slide into a WITH block, we are abel to easily build 3 to 7 character letter groupings using the letters. We need to make certain that a NULL character (P=0) does not appear in the first 3 character positions, and that if the NULL character appears after the third position, all characters after the NULL character are also NULL. We will end up with something like the following:

WITH L AS
  (SELECT
    SUBSTR(L.LETTERS,P.P,1) LETTER,
    P.P
  FROM
    (SELECT
      'ESOIMEFOALEUSAYE' LETTERS
    FROM
      DUAL) L,
    (SELECT
      ROWNUM P
    FROM
      DUAL
    CONNECT BY
      LEVEL<=20) P
  WHERE
    P<=LENGTH(L.LETTERS)
  UNION ALL
  SELECT
    CAST(NULL AS VARCHAR2(1)) LETTER,
    0 P
  FROM
    DUAL)
SELECT DISTINCT
  L1.LETTER || L2.LETTER || L3.LETTER || L4.LETTER || L5.LETTER || L6.LETTER || L7.LETTER WORD
FROM
  L L1,
  L L2,
  L L3,
  L L4,
  L L5,
  L L6,
  L L7
WHERE
  L1.P != L2.P
  AND L1.P != L3.P
  AND L2.P != L3.P
  AND L1.P != 0
  AND L2.P != 0
  AND L3.P != 0
  AND L1.P != L4.P
  AND L2.P != L4.P
  AND L3.P != L4.P
  AND L1.P != L5.P
  AND L2.P != L5.P
  AND L3.P != L5.P
  AND ((L4.P=0 AND L5.P=0)
       OR (L4.P != 0 AND L4.P != L5.P))
  AND L1.P != L6.P
  AND L2.P != L6.P
  AND L3.P != L6.P
  AND ((L4.P=0 AND L6.P=0)
       OR (L4.P != 0 AND L4.P != L6.P))
  AND ((L5.P=0 AND L6.P=0)
       OR (L5.P != 0 AND L5.P != L6.P))
  AND L1.P != L7.P
  AND L2.P != L7.P
  AND L3.P != L7.P
  AND ((L4.P=0 AND L7.P=0)
       OR (L4.P != 0 AND L4.P != L7.P))
  AND ((L5.P=0 AND L7.P=0)
       OR (L5.P != 0 AND L5.P != L7.P))
  AND ((L6.P=0 AND L7.P=0)
       OR (L6.P != 0 AND L6.P != L7.P))
ORDER BY
  WORD;

The SQL statement can be easily plugged into the sample Excel macro that I provided. It appears that this method is a little faster than my PL/SQL function that is included in the article.

13 01 2014
Scrabble Word Finder in SQL » SQLfail

[…] seven tiles, plus linking to one on the board. (credit to Radislav Golian: this is based on his Boggle SQL solver (scroll down the […]

Leave a reply to Radoslav Golian Cancel reply