December 5, 2009
(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)
A recent post in the comp.database.oracle.server Usenet group asked the following question:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/40d63ab1b4c0a3b4
Is there way to force the randomize within a set of number in Oracle?
Say I have a set of integers ( 2,8,6,5) and I want to force randomize function to randomly pick within that set only. I know about the DBMS_RANDOM package, sample and seed clause but none help what I need to do here.
any thoughts?
It was not clear if the integers will be present in table rows, supplied in a comma separated list, or through another method. Additionally, it was not clear if the solution should be provided in SQL, PL/SQL, or another programming language. A couple people offered potential solutions.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Vladimir Zakharychev suggested the following:
Straightforward way: just generate random numbers between 0 and 1, divide the 0..1 range into 4 subranges and then pick a number from
your list depending on which subrange your generated random value belongs, like this:
with rnd as (select dbms_random.value val from dual)
select
case
when val < 0.25 then 2
when val >= 0.25 and val < 0.5 then 8
when val >= 0.5 and val < 0.75 then 6
when val >= 0.75 then 5
end x
from rnd
(note that this query does not work correctly in 9.2.0.8 for some reason – returns wrong results. In 10.2.0.4 it returns expected results.) Obviously, you can partition 0..1 range into as many subranges as there are numbers in your list and apply the same technique. A function that will do this automatically given an array of possible return values is not too hard to write.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tiago offered the following solution:
SET serveroutput ON
DECLARE
Type tabIntegers IS TABLE OF NUMBER ;
Integers tabIntegers ;
minInteger NUMBER ;
maxInteger NUMBER ;
rndInteger NUMBER ;
rndIntOk BOOLEAN := False ;
BEGIN
integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ;
FOR i IN integers.first..integers.last
LOOP
minInteger := Least(NVL(minInteger,integers(i)), integers(i));
maxInteger := Greatest(NVL(maxInteger,integers(i)), integers(i));
END LOOP ;
LOOP
rndInteger := TRUNC( dbms_random.value(minInteger,maxInteger) ) ;
FOR i IN integers.first..integers.last
LOOP
IF rndInteger = integers(i) THEN
rndIntOk := true ;
EXIT ;
END IF ;
END LOOP ;
EXIT WHEN rndIntOk ;
END LOOP ;
dbms_output.put_line(rndInteger);
END ;
Tiago offered a follow-up solution:
simplified, don’t know what I was thinking when did version 1.0.
SET serveroutput ON
DECLARE
Type tabIntegers
IS
TABLE OF NUMBER ;
Integers tabIntegers ;
rndInteger NUMBER ;
BEGIN
integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ;
rndInteger := integers(TRUNC( dbms_random.value(1,integers.last) ) ) ;
dbms_output.put_line(rndInteger);
END ;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mark Powell offered the following advice:
CK, instead of using a fixed range as Vladimir had in his example I was thinking I might use a MOD division of the random number returned from dbms_random to generate the index key into my table of values. This would make the code flexible for variable length lists. If you have a fixed number of entries then Vlad’s solution is simple and easy to understand. If your number of variables varies then I think this would fit the bill.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I offered the following:
Here is another way to do it, if you do not know how many elements will be in the list.
First, we return a row from DUAL with the rows of interest with a comma appended at the start and end:
SELECT
',2,8,6,5,54,100,67,7778,6,' ITEMS
FROM
DUAL;
ITEMS
--------------------------
,2,8,6,5,54,100,67,7778,6,
Next, we need to determine the number of elements and pick an element position at random:
SELECT
ITEMS,
SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS,
(TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM
FROM
(SELECT
',2,8,6,5,54,100,67,7778,6,' ITEMS
FROM
DUAL)
CONNECT BY
LEVEL<20;
ITEMS NUM_ITEMS SEL_ITEM
-------------------------- ---------- ----------
,2,8,6,5,54,100,67,7778,6, 9 6
Finally, we push the above SQL statement into an inline view, search for the specified number of commas according to SEL_ITEM column to determine the starting position of the element, and then search for the next comma to determine the ending position of the element:
SELECT
ITEMS,
SEL_ITEM,
SUBSTR(ITEMS,INSTR(ITEMS,',',1,SEL_ITEM)+1,(INSTR(ITEMS,',', 1,SEL_ITEM+1)) - (INSTR(ITEMS,',',1,SEL_ITEM)) -1) ITEM
FROM
(SELECT
ITEMS,
SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS,
(TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM
FROM
(SELECT
',2,8,6,5,54,100,67,7778,6,' ITEMS
FROM
DUAL)
CONNECT BY
LEVEL<20);
ITEMS SEL_ITEM ITEM
-------------------------- ---------- ----
,2,8,6,5,54,100,67,7778,6, 6 100
—
You might also be able to do something like this:
SELECT
'2,8,6,5,54,100,67,7778,6' ITEMS,
DBMS_RANDOM.VALUE(0,1) PERCENT
FROM
DUAL;
ITEMS PERCENT
------------------------ ----------
2,8,6,5,54,100,67,7778,6 .582165524
SELECT
ROWNUM ITEM_NUMBER,
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
PERCENT
FROM
(SELECT
'2,8,6,5,54,100,67,7778,6' ITEMS,
DBMS_RANDOM.VALUE(0,1) PERCENT
FROM
DUAL)
CONNECT BY
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL;
ITEM_NUMBER ITEM PERCENT
----------- ----- ----------
1 2 .104480002
2 8 .81670697
3 6 .826051929
4 5 .477132421
5 54 .89077554
6 100 .640842927
7 67 .145088893
8 7778 .252241096
9 6 .490905924
As you can see from the above, we have a problem in that the random percent changes for each row, which will cause a problem for us if we try to use it in a WHERE clause.
SELECT
MAX(ITEM_NUMBER) OVER () NUM_ITEMS,
PERCENT_RANK() OVER (ORDER BY ITEM_NUMBER) PR,
ITEM_NUMBER,
ITEM,
PERCENT
FROM
(SELECT
ROWNUM ITEM_NUMBER,
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
PERCENT
FROM
(SELECT
'2,8,6,5,54,100,67,7778,6' ITEMS,
DBMS_RANDOM.VALUE(0,1) PERCENT
FROM
DUAL)
CONNECT BY
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL);
NUM_ITEMS PR ITEM_NUMBER ITEM PERCENT
--------- ---------- ----------- ----- ----------
9 0 1 2 .110718377
9 .125 2 8 .306241972
9 .25 3 6 .953005936
9 .375 4 5 .033518415
9 .5 5 54 .803485415
9 .625 6 100 .456278133
9 .75 7 67 .04461405
9 .875 8 7778 .249680394
9 1 9 6 .484834331
If we now use a FIRST_VALUE analytic function, we could just retrieve
the first PERCENT value and use that in a WHERE clause (the PERCENT_RANK function was the start of another approach which was never developed):
SELECT
NUM_ITEMS,
ITEM_NUMBER,
ITEM
FROM
(SELECT
MAX(ITEM_NUMBER) OVER () NUM_ITEMS,
ITEM_NUMBER,
ITEM,
FIRST_VALUE(PERCENT) OVER () PERCENT
FROM
(SELECT
ROWNUM ITEM_NUMBER,
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
PERCENT
FROM
(SELECT
'2,8,6,5,54,100,67,7778,6' ITEMS,
DBMS_RANDOM.VALUE(0,1) PERCENT
FROM
DUAL)
CONNECT BY
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL))
WHERE
ITEM_NUMBER=(TRUNC(NUM_ITEMS*PERCENT)+1);
NUM_ITEMS ITEM_NUMBER ITEM
--------- ----------- ----
9 7 67
One more, which was originally based on my second solution, this time ordering the rows in random order:
SELECT
ROWNUM ITEM_NUMBER,
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM
FROM
(SELECT
'2,8,6,5,54,100,67,7778,6' ITEMS
FROM
DUAL)
CONNECT BY
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL
ORDER BY
DBMS_RANDOM.VALUE(0,1);
ITEM_NUMBER ITEM
----------- ----
8 7778
4 5
9 6
6 100
5 54
2 8
7 67
3 6
1 2
Now, just slide the above into an inline view and retrieve just the first row:
SELECT
ITEM_NUMBER,
ITEM
FROM
(SELECT
ROWNUM ITEM_NUMBER,
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM
FROM
(SELECT
'2,8,6,5,54,100,67,7778,6' ITEMS
FROM
DUAL)
CONNECT BY
REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL
ORDER BY
DBMS_RANDOM.VALUE(0,1))
WHERE
ROWNUM=1;
ITEM_NUMBER ITEM
----------- ----
6 100
There are probably a couple more ways to pick a random element.
Recent Comments