SQL – Retain Specific Sort Order

5 12 2009

December 5, 2009

Some time ago the following question appeared in the comp.database.oracle.misc Usenet group: http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/4352a39cab5ab408

Is there a simple way to use a previous result in an ORDER BY clause?

I have a rather complicate query that filters, sorts and returns a series of IDs:

FOO_ID
======
      1
     98
     12
     33

Then, I use these IDs to fetch further information about the items they represent:

SELECT .......
FROM FOO
LEFT JOIN BAR .......
WHERE FOO_ID IN (1, 98, 12, 33)

I keep the two queries separate to avoid excessive complexity. I compose the SQL code using PHP. Right now, the second query comes unsorted from Oracle: I use PHP to sort it at a later stage in my application (my PHP skills are better than my SQL ones). Would it be possible to use the ID list to sort the second query inside Oracle?

Server runs Oracle9i Enterprise Edition Release 9.2.0.1.0

No need to make this too difficult.  If you were on 10g, you could so some fancy things with regexp_substr.  A simple example which should work on 9i and below:
Create a testing table for this demonstration named T1, think of this as your FOO table:

CREATE TABLE T1 AS
SELECT
  ROWNUM C1
FROM
  DUAL
CONNECT BY
  LEVEL<=100;

Now the first step, just retrieve the rows you want:

SELECT
  C1
FROM
  T1
WHERE
  C1 IN (1,98,12,33);

        C1
----------
         1
        12
        33
        98

Now, sort the rows:

SELECT
  C1
FROM
  T1
WHERE
  C1 IN (1,98,12,33)
ORDER BY
  INSTR('1,98,12,33,' , TO_CHAR(C1)||',');

        C1
----------
         1
        98
        12
        33

Note in the INSTR, the sequence of the numbers must end in a comma, and we tell INSTR to locate the number in the list with a comma appended to the end of the value of C1.
Your SQL statement would look like this:

SELECT .......
FROM FOO
LEFT JOIN BAR .......
WHERE FOO_ID IN (1, 98, 12, 33)
ORDER BY
  INSTR(',1,98,12,33,' , ','||TO_CHAR(FOO_ID)||',');

Or:

SELECT .......
FROM
  FOO,
  BAR
WHERE
  FOO.FOO_ID=BAR.FOO_ID(+)
  AND FOO_ID IN (1, 98, 12, 33)
ORDER BY
  INSTR(',1,98,12,33,' , ','||TO_CHAR(FOO_ID)||',');

Before deciding to use the above technique, determine if there is a better way to do *everything* in a single SQL statement.  You might be able to do this by wrapping your complicated SQL statement into an inline view, and joining to that just as if it were a regular table:

SELECT .......
FROM
  FOO,
  BAR,
  ( complicated SQL here ) V
WHERE
  V.FOO_ID=FOO.ID
  AND FOO.FOO_ID=BAR.FOO_ID(+)
ORDER BY
  V.RN;

The RN column would be generated inside the inline view V, possibly like this, if there is an ORDER BY clause in the inline view:

  ROWNUM RN




SQL – Recursive Summing of Related Entities

5 12 2009

December 5, 2009

A recent thread in the comp.databases.oracle.server Usenet group asked the following:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/fdce5b3d641c5f1a

Table dir_size stores the mbytes of storage used in a given directory. Table directories stores various directory names which may or may not exist in table dir_size.
For every directory in table directories, report the cumulative storage in that directory and all its subdirectories. This solution uses a cartesian join.  I imagine it will not scale well.

Thanks in advance!

create table dir_size (
  dir_name     varchar2(40),
  mbytes       number
  );

create table directories (
  dir_name    varchar2(40)
  );

insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100);
insert into dir_size values ('c:\aaa\bbb\ccc', 100);
insert into dir_size values ('c:\aaa\bbb', 100);
insert into dir_size values ('c:\aaa', 100);
insert into dir_size values ('c:\', 100);
insert into directories values ('c:\aaa\bbb\ccc\ddd');
insert into directories values ('c:\aaa\bbb\ccc');
insert into directories values ('c:\aaa\bbb');
insert into directories values ('c:\aaa');
insert into directories values ('c:\');
insert into directories values ('c:\xxx\yyy\zzz');
commit;

select dir_name, sum(mbytes) from (
  select directories.dir_name,
    instr(dir_size.dir_name, directories.dir_name) INSTR,
    mbytes
  from directories, dir_size
)
where INSTR = 1
group by dir_name
order by 1;

DIR_NAME                                 SUM(MBYTES)
---------------------------------------- -----------
c:\                                              500
c:\aaa                                           400
c:\aaa\bbb                                       300
c:\aaa\bbb\ccc                                   200
c:\aaa\bbb\ccc\ddd                               100

This appears to be a hard problem.  To avoid headaches, make certain that each of the DIR_NAMES ends with “\”

Let’s start here:

SELECT
  'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
  100 MBYTES
FROM
  DUAL;

DIR_NAME                 MBYTES
-------------------- ----------
c:\aaa\bbb\ccc\ddd\         100

In your example, you would like to put 100MB into the following directories based on the above:

c:\
c:\aaa\
c:\aaa\bbb\
c:\aaa\bbb\ccc\
c:\aaa\bbb\ccc\ddd\

You somehow need to be able to break that one row into 5 rows.  The following might help

SELECT
  LEVEL L
FROM
  DUAL
CONNECT BY
  LEVEL<=20;

  L
---
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20

If we join those two row sources together we might be able to create 5 rows from the one row:

SELECT
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2,
  MBYTES
FROM
  (SELECT
    'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
    100 MBYTES
  FROM
    DUAL) DIR_SIZE,
  (SELECT
    LEVEL L
  FROM
    DUAL
  CONNECT BY
    LEVEL<=20) C
WHERE
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL;

DIR_NAME2                MBYTES
-------------------- ----------
c:\                         100
c:\aaa\                     100
c:\aaa\bbb\                 100
c:\aaa\bbb\ccc\             100
c:\aaa\bbb\ccc\ddd\         100

Now, if we performed the same process for all of the rows in the DIR_SIZE table, grouping on DIR_NAME2, we might be able to find the SUM of the MBYTES column.

(Note that I did not provide an exact/final answer to the original poster – my post was intended to push the OP in the right direction of a solution.)

The OP followed up with this comment:

Thanks for the suggestion.  I suspect the best way will involve some kind of recursive processing.  The tricky bit is the matching of the rows in the directories table to the rows in the dir_size table.  We need to do a “like” (which we can’t, of course) which is why I thought of the instr.

 The LIKE keyword is not necessary.

Notice how closely the output of the following SQL statement:

SELECT
  'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
  100 MBYTES
FROM
  DUAL;

Matches the row created by one of your insert statements:

insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100);

You might try replacing in the above examples:

SELECT
  'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
  100 MBYTES
FROM
  DUAL;

With a SQL statement that selects all of the rows from your DIR_SIZE table – the results might surprise you IF each of the DIR_NAME values end with a “\”.
You really need more variety in the insert statements to see what is happening, for example:

insert into dir_size values ('c:\ddd\', 800);
insert into dir_size values ('c:\ddd\kkk\', 300);

The first of the above SQL statements will increase the calculated SUM in the c:\ directory by 800, and the second insert statement will increase the SUM in both of the c:\ and c:\ddd\ directories by 300 if you modify my original example to use the DIR_SIZE table rather than the DUAL table.
The final part that I did not provide to the OP is below:

TRUNCATE TABLE DIR_SIZE;

insert into dir_size values ('c:\aaa\bbb\ccc\ddd\', 100);
insert into dir_size values ('c:\aaa\bbb\ccc\', 100);
insert into dir_size values ('c:\aaa\bbb\', 100);
insert into dir_size values ('c:\aaa\', 100);
insert into dir_size values ('c:\', 100);
insert into dir_size values ('c:\ddd\', 800);
insert into dir_size values ('c:\ddd\kkk\', 300);

Working with the hints provided and the final SQL statement in my post, we start with the following:

SELECT
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2,
  MBYTES
FROM
  (SELECT
    *
  FROM
    DIR_SIZE) DIR_SIZE,
  (SELECT
    LEVEL L
  FROM
    DUAL
  CONNECT BY
    LEVEL<=20) C
WHERE
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL;

DIR_NAME2                MBYTES
-------------------- ----------
c:\                         100
c:\                         100
c:\                         100
c:\                         100
c:\                         100
c:\                         800
c:\                         300
c:\aaa\                     100
c:\aaa\                     100
c:\aaa\                     100
c:\aaa\                     100
c:\ddd\                     800
c:\ddd\                     300
c:\aaa\bbb\                 100
c:\aaa\bbb\                 100
c:\aaa\bbb\                 100
c:\ddd\kkk\                 300
c:\aaa\bbb\ccc\             100
c:\aaa\bbb\ccc\             100
c:\aaa\bbb\ccc\ddd\         100

 

SELECT
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2,
  SUM(MBYTES) MBYTES
FROM
  DIR_SIZE,
  (SELECT
    LEVEL L
  FROM
    DUAL
  CONNECT BY
    LEVEL<=20) C
WHERE
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL
GROUP BY
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L))
ORDER BY
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L));

DIR_NAME2                MBYTES
-------------------- ----------
c:\                        1600
c:\aaa\                     400
c:\aaa\bbb\                 300
c:\aaa\bbb\ccc\             200
c:\aaa\bbb\ccc\ddd\         100
c:\ddd\                    1100
c:\ddd\kkk\                 300




SQL – Methods of Reformatting into Equivalent Forms 6

5 12 2009

December 5, 2009

(Back to the Previous Post in the Series) (Forward to the Next Post in the Series)

A recent post in the comp.database.oracle.server Usenet group asked the following question:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/40d63ab1b4c0a3b4

Is there way to force the randomize within a set of number in Oracle?
Say I have a set of integers ( 2,8,6,5) and I want to force randomize function to randomly pick within that set only. I know about the DBMS_RANDOM package, sample and seed clause but none help what I need to do here.

any thoughts?

It was not clear if the integers will be present in table rows, supplied in a comma separated list, or through another method. Additionally, it was not clear if the solution should be provided in SQL, PL/SQL, or another programming language.  A couple people offered potential solutions.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Vladimir Zakharychev suggested the following:

Straightforward way: just generate random numbers between 0 and 1, divide the 0..1 range into 4 subranges and then pick a number from
your list depending on which subrange your generated random value belongs, like this:

with rnd as (select dbms_random.value val from dual)
select
  case
    when val < 0.25 then 2
    when val >= 0.25 and val < 0.5 then 8
    when val >= 0.5  and val < 0.75 then 6
    when val >= 0.75 then 5
  end x
from rnd

(note that this query does not work correctly in 9.2.0.8 for some reason – returns wrong results. In 10.2.0.4 it returns expected results.) Obviously, you can partition 0..1 range into as many subranges as there are numbers in your list and apply the same technique. A function that will do this automatically given an array of possible return values is not too hard to write.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tiago offered the following solution:

SET serveroutput ON
DECLARE
Type tabIntegers IS TABLE OF NUMBER ;
  Integers tabIntegers ;
  minInteger NUMBER ;
  maxInteger NUMBER ;
  rndInteger NUMBER ;
  rndIntOk   BOOLEAN := False ;
BEGIN
  integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ;
  FOR i IN integers.first..integers.last
  LOOP
    minInteger := Least(NVL(minInteger,integers(i)), integers(i));
    maxInteger := Greatest(NVL(maxInteger,integers(i)), integers(i));
  END LOOP ;
  LOOP
    rndInteger := TRUNC( dbms_random.value(minInteger,maxInteger) ) ;
    FOR i IN integers.first..integers.last
    LOOP
      IF rndInteger = integers(i) THEN
        rndIntOk   := true ;
        EXIT ;
      END IF ;
    END LOOP ;
    EXIT WHEN rndIntOk ;
  END LOOP ;
  dbms_output.put_line(rndInteger);
END ;
   

Tiago offered a follow-up solution: 

simplified, don’t know what I was thinking when did version 1.0.

SET serveroutput ON
DECLARE
Type tabIntegers
IS
  TABLE OF NUMBER ;
  Integers tabIntegers ;
  rndInteger NUMBER ;
BEGIN
  integers := tabIntegers( 2, 5, 10, 7, 3, 8, 11, 25, 0 ) ;
  rndInteger := integers(TRUNC( dbms_random.value(1,integers.last) ) ) ;
  dbms_output.put_line(rndInteger);
END ;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mark Powell offered the following advice:

CK, instead of using a fixed range as Vladimir had in his example I was thinking I might use a MOD division of the random number returned from dbms_random to generate the index key into my table of values. This would make the code flexible for variable length lists.  If you have a fixed number of entries then Vlad’s solution is simple and easy to understand.  If your number of variables varies then I think this would fit the bill.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I offered the following:

Here is another way to do it, if you do not know how many elements will be in the list.

First, we return a row from DUAL with the rows of interest with a comma appended at the start and end:

SELECT
  ',2,8,6,5,54,100,67,7778,6,' ITEMS
FROM
  DUAL;

ITEMS
--------------------------
,2,8,6,5,54,100,67,7778,6,

Next, we need to determine the number of elements and pick an element position at random:

SELECT
  ITEMS,
  SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS,
  (TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM
FROM
  (SELECT
    ',2,8,6,5,54,100,67,7778,6,' ITEMS
  FROM
    DUAL)
CONNECT BY
  LEVEL<20;

ITEMS                       NUM_ITEMS   SEL_ITEM
-------------------------- ---------- ----------
,2,8,6,5,54,100,67,7778,6,          9          6

Finally, we push the above SQL statement into an inline view, search for the specified number of commas according to SEL_ITEM column to determine the starting position of the element, and then search for the next comma to determine the ending position of the element:

SELECT
  ITEMS,
  SEL_ITEM,
  SUBSTR(ITEMS,INSTR(ITEMS,',',1,SEL_ITEM)+1,(INSTR(ITEMS,',', 1,SEL_ITEM+1)) - (INSTR(ITEMS,',',1,SEL_ITEM)) -1) ITEM
FROM
(SELECT
  ITEMS,
  SUM(SIGN(INSTR(ITEMS, ',',1,ROWNUM)))-1 NUM_ITEMS,
  (TRUNC(DBMS_RANDOM.VALUE(0,SUM(SIGN(INSTR(ITEMS ,',',1,ROWNUM)))-1)) +1) SEL_ITEM
FROM
  (SELECT
    ',2,8,6,5,54,100,67,7778,6,' ITEMS
  FROM
    DUAL)
CONNECT BY
  LEVEL<20);

ITEMS                        SEL_ITEM ITEM
-------------------------- ---------- ----
,2,8,6,5,54,100,67,7778,6,          6 100

You might also be able to do something like this:

SELECT
  '2,8,6,5,54,100,67,7778,6' ITEMS,
  DBMS_RANDOM.VALUE(0,1) PERCENT
FROM
  DUAL;

ITEMS                       PERCENT
------------------------ ----------
2,8,6,5,54,100,67,7778,6 .582165524

 

SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
  PERCENT
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS,
    DBMS_RANDOM.VALUE(0,1) PERCENT
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL;

ITEM_NUMBER ITEM     PERCENT
----------- ----- ----------
          1 2     .104480002
          2 8      .81670697
          3 6     .826051929
          4 5     .477132421
          5 54     .89077554
          6 100   .640842927
          7 67    .145088893
          8 7778  .252241096
          9 6     .490905924

As you can see from the above, we have a problem in that the random percent changes for each row, which will cause a problem for us if we try to use it in a WHERE clause.

SELECT
  MAX(ITEM_NUMBER) OVER () NUM_ITEMS,
  PERCENT_RANK() OVER (ORDER BY ITEM_NUMBER) PR,
  ITEM_NUMBER,
  ITEM,
  PERCENT
FROM
(SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
  PERCENT
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS,
    DBMS_RANDOM.VALUE(0,1) PERCENT
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL);

NUM_ITEMS         PR ITEM_NUMBER ITEM     PERCENT
--------- ---------- ----------- ----- ----------
        9          0           1 2     .110718377
        9       .125           2 8     .306241972
        9        .25           3 6     .953005936
        9       .375           4 5     .033518415
        9         .5           5 54    .803485415
        9       .625           6 100   .456278133
        9        .75           7 67     .04461405
        9       .875           8 7778  .249680394
        9          1           9 6     .484834331

If we now use a FIRST_VALUE analytic function, we could just retrieve
the first PERCENT value and use that in a WHERE clause (the PERCENT_RANK function was the start of another approach which was never developed):

SELECT
  NUM_ITEMS,
  ITEM_NUMBER,
  ITEM
FROM
(SELECT
  MAX(ITEM_NUMBER) OVER () NUM_ITEMS,
  ITEM_NUMBER,
  ITEM,
  FIRST_VALUE(PERCENT) OVER () PERCENT
FROM
(SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM,
  PERCENT
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS,
    DBMS_RANDOM.VALUE(0,1) PERCENT
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL))
WHERE
  ITEM_NUMBER=(TRUNC(NUM_ITEMS*PERCENT)+1);

NUM_ITEMS ITEM_NUMBER ITEM
--------- ----------- ----
        9           7 67

One more, which was originally based on my second solution, this time ordering the rows in random order:

SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL
ORDER BY
  DBMS_RANDOM.VALUE(0,1);

ITEM_NUMBER ITEM
----------- ----
          8 7778
          4 5
          9 6
          6 100
          5 54
          2 8
          7 67
          3 6
          1 2

Now, just slide the above into an inline view and retrieve just the first row:

SELECT
  ITEM_NUMBER,
  ITEM
FROM
(SELECT
  ROWNUM ITEM_NUMBER,
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) ITEM
FROM
  (SELECT
    '2,8,6,5,54,100,67,7778,6' ITEMS
  FROM
    DUAL)
CONNECT BY
  REGEXP_SUBSTR(ITEMS,'\w+',1,LEVEL) IS NOT NULL
ORDER BY
  DBMS_RANDOM.VALUE(0,1))
WHERE
  ROWNUM=1;

ITEM_NUMBER ITEM
----------- ----
          6 100

There are probably a couple more ways to pick a random element.








Follow

Get every new post delivered to your Inbox.

Join 139 other followers