December 5, 2009
(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)
A recent post in the comp.database.oracle.server Usenet group asked the following question:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/40d63ab1b4c0a3b4
Is there way to force the randomize within a set of number in Oracle?
Say I have a set of integers ( 2,8,6,5) and I want to force randomize function to randomly pick within that set only. I know about the DBMS_RANDOM package, sample and seed clause but none help what I need to do here.any thoughts?
It was not clear if the integers will be present in table rows, supplied in a comma separated list, or through another method. Additionally, it was not clear if the solution should be provided in SQL, PL/SQL, or another programming language. A couple people offered potential solutions.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Vladimir Zakharychev suggested the following:
Straightforward way: just generate random numbers between 0 and 1, divide the 0..1 range into 4 subranges and then pick a number from
your list depending on which subrange your generated random value belongs, like this:
with rnd as (select dbms_random.value val from dual) select case when val < 0.25 then 2 when val >= 0.25 and val < 0.5 then 8 when val >= 0.5 and val < 0.75 then 6 when val >= 0.75 then 5 end x from rnd
(note that this query does not work correctly in 9.2.0.8 for some reason – returns wrong results. In 10.2.0.4 it returns expected results.) Obviously, you can partition 0..1 range into as many subranges as there are numbers in your list and apply the same technique. A function that will do this automatically given an array of possible return values is not too hard to write.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tiago offered the following solution:
SET serveroutput ON DECLARE Type tabIntegers IS TABLE OF NUMBER ; Integers tabIntegers ; minInteger NUMBER ; maxInteger NUMBER ; rndInteger NUMBER ; rndIntOk BOOLEAN := False ; BEGIN integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ; FOR i IN integers.first..integers.last LOOP minInteger := Least(NVL(minInteger,integers(i)), integers(i)); maxInteger := Greatest(NVL(maxInteger,integers(i)), integers(i)); END LOOP ; LOOP rndInteger := TRUNC( dbms_random.value(minInteger,maxInteger) ) ; FOR i IN integers.first..integers.last LOOP IF rndInteger = integers(i) THEN rndIntOk := true ; EXIT ; END IF ; END LOOP ; EXIT WHEN rndIntOk ; END LOOP ; dbms_output.put_line(rndInteger); END ;
Tiago offered a follow-up solution:
simplified, don’t know what I was thinking when did version 1.0.
SET serveroutput ON DECLARE Type tabIntegers IS TABLE OF NUMBER ; Integers tabIntegers ; rndInteger NUMBER ; BEGIN integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ; rndInteger := integers(TRUNC( dbms_random.value(1,integers.last) ) ) ; dbms_output.put_line(rndInteger); END ;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mark Powell offered the following advice:
CK, instead of using a fixed range as Vladimir had in his example I was thinking I might use a MOD division of the random number returned from dbms_random to generate the index key into my table of values. This would make the code flexible for variable length lists. If you have a fixed number of entries then Vlad’s solution is simple and easy to understand. If your number of variables varies then I think this would fit the bill.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I offered the following:
Here is another way to do it, if you do not know how many elements will be in the list.
First, we return a row from DUAL with the rows of interest with a comma appended at the start and end:
SELECT ',2,8,6,5,54,100,67,7778,6,' ITEMS FROM DUAL; ITEMS -------------------------- ,2,8,6,5,54,100,67,7778,6,
Next, we need to determine the number of elements and pick an element position at random:
SELECT ITEMS, SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS, (TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM FROM (SELECT ',2,8,6,5,54,100,67,7778,6,' ITEMS FROM DUAL) CONNECT BY LEVEL<20; ITEMS NUM_ITEMS SEL_ITEM -------------------------- ---------- ---------- ,2,8,6,5,54,100,67,7778,6, 9 6
Finally, we push the above SQL statement into an inline view, search for the specified number of commas according to SEL_ITEM column to determine the starting position of the element, and then search for the next comma to determine the ending position of the element:
SELECT ITEMS, SEL_ITEM, SUBSTR(ITEMS,INSTR(ITEMS,',',1,SEL_ITEM)+1,(INSTR(ITEMS,',', 1,SEL_ITEM+1)) - (INSTR(ITEMS,',',1,SEL_ITEM)) -1) ITEM FROM (SELECT ITEMS, SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS, (TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM FROM (SELECT ',2,8,6,5,54,100,67,7778,6,' ITEMS FROM DUAL) CONNECT BY LEVEL<20); ITEMS SEL_ITEM ITEM -------------------------- ---------- ---- ,2,8,6,5,54,100,67,7778,6, 6 100
—
You might also be able to do something like this:
SELECT '2,8,6,5,54,100,67,7778,6' ITEMS, DBMS_RANDOM.VALUE(0,1) PERCENT FROM DUAL; ITEMS PERCENT ------------------------ ---------- 2,8,6,5,54,100,67,7778,6 .582165524
SELECT ROWNUM ITEM_NUMBER, REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM, PERCENT FROM (SELECT '2,8,6,5,54,100,67,7778,6' ITEMS, DBMS_RANDOM.VALUE(0,1) PERCENT FROM DUAL) CONNECT BY REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL; ITEM_NUMBER ITEM PERCENT ----------- ----- ---------- 1 2 .104480002 2 8 .81670697 3 6 .826051929 4 5 .477132421 5 54 .89077554 6 100 .640842927 7 67 .145088893 8 7778 .252241096 9 6 .490905924
As you can see from the above, we have a problem in that the random percent changes for each row, which will cause a problem for us if we try to use it in a WHERE clause.
SELECT MAX(ITEM_NUMBER) OVER () NUM_ITEMS, PERCENT_RANK() OVER (ORDER BY ITEM_NUMBER) PR, ITEM_NUMBER, ITEM, PERCENT FROM (SELECT ROWNUM ITEM_NUMBER, REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM, PERCENT FROM (SELECT '2,8,6,5,54,100,67,7778,6' ITEMS, DBMS_RANDOM.VALUE(0,1) PERCENT FROM DUAL) CONNECT BY REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL); NUM_ITEMS PR ITEM_NUMBER ITEM PERCENT --------- ---------- ----------- ----- ---------- 9 0 1 2 .110718377 9 .125 2 8 .306241972 9 .25 3 6 .953005936 9 .375 4 5 .033518415 9 .5 5 54 .803485415 9 .625 6 100 .456278133 9 .75 7 67 .04461405 9 .875 8 7778 .249680394 9 1 9 6 .484834331
If we now use a FIRST_VALUE analytic function, we could just retrieve
the first PERCENT value and use that in a WHERE clause (the PERCENT_RANK function was the start of another approach which was never developed):
SELECT NUM_ITEMS, ITEM_NUMBER, ITEM FROM (SELECT MAX(ITEM_NUMBER) OVER () NUM_ITEMS, ITEM_NUMBER, ITEM, FIRST_VALUE(PERCENT) OVER () PERCENT FROM (SELECT ROWNUM ITEM_NUMBER, REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM, PERCENT FROM (SELECT '2,8,6,5,54,100,67,7778,6' ITEMS, DBMS_RANDOM.VALUE(0,1) PERCENT FROM DUAL) CONNECT BY REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL)) WHERE ITEM_NUMBER=(TRUNC(NUM_ITEMS*PERCENT)+1); NUM_ITEMS ITEM_NUMBER ITEM --------- ----------- ---- 9 7 67
One more, which was originally based on my second solution, this time ordering the rows in random order:
SELECT ROWNUM ITEM_NUMBER, REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM FROM (SELECT '2,8,6,5,54,100,67,7778,6' ITEMS FROM DUAL) CONNECT BY REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL ORDER BY DBMS_RANDOM.VALUE(0,1); ITEM_NUMBER ITEM ----------- ---- 8 7778 4 5 9 6 6 100 5 54 2 8 7 67 3 6 1 2
Now, just slide the above into an inline view and retrieve just the first row:
SELECT ITEM_NUMBER, ITEM FROM (SELECT ROWNUM ITEM_NUMBER, REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM FROM (SELECT '2,8,6,5,54,100,67,7778,6' ITEMS FROM DUAL) CONNECT BY REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL ORDER BY DBMS_RANDOM.VALUE(0,1)) WHERE ROWNUM=1; ITEM_NUMBER ITEM ----------- ---- 6 100
There are probably a couple more ways to pick a random element.
Leave a Reply