SQL – Methods of Reformatting into Equivalent Forms 6

5 12 2009

December 5, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A recent post in the comp.database.oracle.server Usenet group asked the following question:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/40d63ab1b4c0a3b4

Is there way to force the randomize within a set of number in Oracle?
Say I have a set of integers ( 2,8,6,5) and I want to force randomize function to randomly pick within that set only. I know about the DBMS_RANDOM package, sample and seed clause but none help what I need to do here.

any thoughts?

It was not clear if the integers will be present in table rows, supplied in a comma separated list, or through another method. Additionally, it was not clear if the solution should be provided in SQL, PL/SQL, or another programming language.  A couple people offered potential solutions.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Vladimir Zakharychev suggested the following:

Straightforward way: just generate random numbers between 0 and 1, divide the 0..1 range into 4 subranges and then pick a number from
your list depending on which subrange your generated random value belongs, like this:

with rnd as (select dbms_random.value val from dual)
select
  case
    when val < 0.25 then 2
    when val >= 0.25 and val < 0.5 then 8
    when val >= 0.5  and val < 0.75 then 6
    when val >= 0.75 then 5
  end x
from rnd

(note that this query does not work correctly in 9.2.0.8 for some reason - returns wrong results. In 10.2.0.4 it returns expected results.) Obviously, you can partition 0..1 range into as many subranges as there are numbers in your list and apply the same technique. A function that will do this automatically given an array of possible return values is not too hard to write.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tiago offered the following solution:

SET serveroutput ON
DECLARE
Type tabIntegers IS TABLE OF NUMBER ;
  Integers tabIntegers ;
  minInteger NUMBER ;
  maxInteger NUMBER ;
  rndInteger NUMBER ;
  rndIntOk   BOOLEAN := False ;
BEGIN
  integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ;
  FOR i IN integers.first..integers.last
  LOOP
    minInteger := Least(NVL(minInteger,integers(i)), integers(i));
    maxInteger := Greatest(NVL(maxInteger,integers(i)), integers(i));
  END LOOP ;
  LOOP
    rndInteger := TRUNC( dbms_random.value(minInteger,maxInteger) ) ;
    FOR i IN integers.first..integers.last
    LOOP
      IF rndInteger = integers(i) THEN
        rndIntOk   := true ;
        EXIT ;
      END IF ;
    END LOOP ;
    EXIT WHEN rndIntOk ;
  END LOOP ;
  dbms_output.put_line(rndInteger);
END ;
   

Tiago offered a follow-up solution: 

simplified, don't know what I was thinking when did version 1.0.

SET serveroutput ON
DECLARE
Type tabIntegers
IS
  TABLE OF NUMBER ;
  Integers tabIntegers ;
  rndInteger NUMBER ;
BEGIN
  integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ;
  rndInteger := integers(TRUNC( dbms_random.value(1,integers.last) ) ) ;
  dbms_output.put_line(rndInteger);
END ;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mark Powell offered the following advice:

CK, instead of using a fixed range as Vladimir had in his example I was thinking I might use a MOD division of the random number returned from dbms_random to generate the index key into my table of values. This would make the code flexible for variable length lists.  If you have a fixed number of entries then Vlad's solution is simple and easy to understand.  If your number of variables varies then I think this would fit the bill.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I offered the following:

Here is another way to do it, if you do not know how many elements will be in the list.

First, we return a row from DUAL with the rows of interest with a comma appended at the start and end:

SELECT
  ',2,8,6,5,54,100,67,7778,6,' ITEMS
FROM
  DUAL;

ITEMS
--------------------------
,2,8,6,5,54,100,67,7778,6,

Next, we need to determine the number of elements and pick an element position at random:

SELECT
  ITEMS,
  SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS,
  (TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM
FROM
  (SELECT
    ',2,8,6,5,54,100,67,7778,6,' ITEMS
  FROM
    DUAL)
CONNECT BY
  LEVEL<20;

ITEMS                       NUM_ITEMS   SEL_ITEM
-------------------------- ---------- ----------
,2,8,6,5,54,100,67,7778,6,          9          6

Finally, we push the above SQL statement into an inline view, search for the specified number of commas according to SEL_ITEM column to determine the starting position of the element, and then search for the next comma to determine the ending position of the element:

SELECT
  ITEMS,
  SEL_ITEM,
  SUBSTR(ITEMS,INSTR(ITEMS,',',1,SEL_ITEM)+1,(INSTR(ITEMS,',', 1,SEL_ITEM+1)) - (INSTR(ITEMS,',',1,SEL_ITEM)) -1) ITEM
FROM
(SELECT
  ITEMS,
  SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS,
  (TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM
FROM
  (SELECT
    ',2,8,6,5,54,100,67,7778,6,' ITEMS
  FROM
    DUAL)
CONNECT BY
  LEVEL<20);

ITEMS                        SEL_ITEM ITEM
-------------------------- ---------- ----
,2,8,6,5,54,100,67,7778,6,          6 100

--

You might also be able to do something like this:

SELECT
  '2,8,6,5,54,100,67,7778,6' ITEMS,
  DBMS_RANDOM.VALUE(0,1) PERCENT
FROM
  DUAL;

ITEMS                       PERCENT
------------------------ ----------
2,8,6,5,54,100,67,7778,6 .582165524

 

SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
  PERCENT
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS,
    DBMS_RANDOM.VALUE(0,1) PERCENT
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL;

ITEM_NUMBER ITEM     PERCENT
----------- ----- ----------
          1 2     .104480002
          2 8      .81670697
          3 6     .826051929
          4 5     .477132421
          5 54     .89077554
          6 100   .640842927
          7 67    .145088893
          8 7778  .252241096
          9 6     .490905924

As you can see from the above, we have a problem in that the random percent changes for each row, which will cause a problem for us if we try to use it in a WHERE clause.

SELECT
  MAX(ITEM_NUMBER) OVER () NUM_ITEMS,
  PERCENT_RANK() OVER (ORDER BY ITEM_NUMBER) PR,
  ITEM_NUMBER,
  ITEM,
  PERCENT
FROM
(SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
  PERCENT
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS,
    DBMS_RANDOM.VALUE(0,1) PERCENT
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL);

NUM_ITEMS         PR ITEM_NUMBER ITEM     PERCENT
--------- ---------- ----------- ----- ----------
        9          0           1 2     .110718377
        9       .125           2 8     .306241972
        9        .25           3 6     .953005936
        9       .375           4 5     .033518415
        9         .5           5 54    .803485415
        9       .625           6 100   .456278133
        9        .75           7 67     .04461405
        9       .875           8 7778  .249680394
        9          1           9 6     .484834331

If we now use a FIRST_VALUE analytic function, we could just retrieve
the first PERCENT value and use that in a WHERE clause (the PERCENT_RANK function was the start of another approach which was never developed):

SELECT
  NUM_ITEMS,
  ITEM_NUMBER,
  ITEM
FROM
(SELECT
  MAX(ITEM_NUMBER) OVER () NUM_ITEMS,
  ITEM_NUMBER,
  ITEM,
  FIRST_VALUE(PERCENT) OVER () PERCENT
FROM
(SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
  PERCENT
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS,
    DBMS_RANDOM.VALUE(0,1) PERCENT
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL))
WHERE
  ITEM_NUMBER=(TRUNC(NUM_ITEMS*PERCENT)+1);

NUM_ITEMS ITEM_NUMBER ITEM
--------- ----------- ----
        9           7 67

One more, which was originally based on my second solution, this time ordering the rows in random order:

SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL
ORDER BY
  DBMS_RANDOM.VALUE(0,1);

ITEM_NUMBER ITEM
----------- ----
          8 7778
          4 5
          9 6
          6 100
          5 54
          2 8
          7 67
          3 6
          1 2

Now, just slide the above into an inline view and retrieve just the first row:

SELECT
  ITEM_NUMBER,
  ITEM
FROM
(SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL
ORDER BY
  DBMS_RANDOM.VALUE(0,1))
WHERE
  ROWNUM=1;

ITEM_NUMBER ITEM
----------- ----
          6 100

There are probably a couple more ways to pick a random element.


Actions

Information

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 139 other followers

%d bloggers like this: