December 5, 2009
A recent thread in the comp.databases.oracle.server Usenet group asked the following:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/fdce5b3d641c5f1a
Table dir_size stores the mbytes of storage used in a given directory. Table directories stores various directory names which may or may not exist in table dir_size.
For every directory in table directories, report the cumulative storage in that directory and all its subdirectories. This solution uses a cartesian join. I imagine it will not scale well.Thanks in advance!
create table dir_size ( dir_name varchar2(40), mbytes number ); create table directories ( dir_name varchar2(40) ); insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100); insert into dir_size values ('c:\aaa\bbb\ccc', 100); insert into dir_size values ('c:\aaa\bbb', 100); insert into dir_size values ('c:\aaa', 100); insert into dir_size values ('c:\', 100); insert into directories values ('c:\aaa\bbb\ccc\ddd'); insert into directories values ('c:\aaa\bbb\ccc'); insert into directories values ('c:\aaa\bbb'); insert into directories values ('c:\aaa'); insert into directories values ('c:\'); insert into directories values ('c:\xxx\yyy\zzz'); commit; select dir_name, sum(mbytes) from ( select directories.dir_name, instr(dir_size.dir_name, directories.dir_name) INSTR, mbytes from directories, dir_size ) where INSTR = 1 group by dir_name order by 1; DIR_NAME SUM(MBYTES) ---------------------------------------- ----------- c:\ 500 c:\aaa 400 c:\aaa\bbb 300 c:\aaa\bbb\ccc 200 c:\aaa\bbb\ccc\ddd 100
This appears to be a hard problem. To avoid headaches, make certain that each of the DIR_NAMES ends with “\”
Let’s start here:
SELECT 'c:\aaa\bbb\ccc\ddd\' DIR_NAME, 100 MBYTES FROM DUAL; DIR_NAME MBYTES -------------------- ---------- c:\aaa\bbb\ccc\ddd\ 100
In your example, you would like to put 100MB into the following directories based on the above:
c:\ c:\aaa\ c:\aaa\bbb\ c:\aaa\bbb\ccc\ c:\aaa\bbb\ccc\ddd\
You somehow need to be able to break that one row into 5 rows. The following might help
SELECT LEVEL L FROM DUAL CONNECT BY LEVEL<=20; L --- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
If we join those two row sources together we might be able to create 5 rows from the one row:
SELECT SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2, MBYTES FROM (SELECT 'c:\aaa\bbb\ccc\ddd\' DIR_NAME, 100 MBYTES FROM DUAL) DIR_SIZE, (SELECT LEVEL L FROM DUAL CONNECT BY LEVEL<=20) C WHERE SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL; DIR_NAME2 MBYTES -------------------- ---------- c:\ 100 c:\aaa\ 100 c:\aaa\bbb\ 100 c:\aaa\bbb\ccc\ 100 c:\aaa\bbb\ccc\ddd\ 100
Now, if we performed the same process for all of the rows in the DIR_SIZE table, grouping on DIR_NAME2, we might be able to find the SUM of the MBYTES column.
(Note that I did not provide an exact/final answer to the original poster – my post was intended to push the OP in the right direction of a solution.)
The OP followed up with this comment:
Thanks for the suggestion. I suspect the best way will involve some kind of recursive processing. The tricky bit is the matching of the rows in the directories table to the rows in the dir_size table. We need to do a “like” (which we can’t, of course) which is why I thought of the instr.
The LIKE keyword is not necessary.
Notice how closely the output of the following SQL statement:
SELECT 'c:\aaa\bbb\ccc\ddd\' DIR_NAME, 100 MBYTES FROM DUAL;
Matches the row created by one of your insert statements:
insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100);
You might try replacing in the above examples:
SELECT 'c:\aaa\bbb\ccc\ddd\' DIR_NAME, 100 MBYTES FROM DUAL;
With a SQL statement that selects all of the rows from your DIR_SIZE table – the results might surprise you IF each of the DIR_NAME values end with a “\”.
You really need more variety in the insert statements to see what is happening, for example:
insert into dir_size values ('c:\ddd\', 800);
insert into dir_size values ('c:\ddd\kkk\', 300);
The first of the above SQL statements will increase the calculated SUM in the c:\ directory by 800, and the second insert statement will increase the SUM in both of the c:\ and c:\ddd\ directories by 300 if you modify my original example to use the DIR_SIZE table rather than the DUAL table.
The final part that I did not provide to the OP is below:
TRUNCATE TABLE DIR_SIZE;
insert into dir_size values ('c:\aaa\bbb\ccc\ddd\', 100);
insert into dir_size values ('c:\aaa\bbb\ccc\', 100);
insert into dir_size values ('c:\aaa\bbb\', 100);
insert into dir_size values ('c:\aaa\', 100);
insert into dir_size values ('c:\', 100);
insert into dir_size values ('c:\ddd\', 800);
insert into dir_size values ('c:\ddd\kkk\', 300);
Working with the hints provided and the final SQL statement in my post, we start with the following:
SELECT SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2, MBYTES FROM (SELECT * FROM DIR_SIZE) DIR_SIZE, (SELECT LEVEL L FROM DUAL CONNECT BY LEVEL<=20) C WHERE SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL; DIR_NAME2 MBYTES -------------------- ---------- c:\ 100 c:\ 100 c:\ 100 c:\ 100 c:\ 100 c:\ 800 c:\ 300 c:\aaa\ 100 c:\aaa\ 100 c:\aaa\ 100 c:\aaa\ 100 c:\ddd\ 800 c:\ddd\ 300 c:\aaa\bbb\ 100 c:\aaa\bbb\ 100 c:\aaa\bbb\ 100 c:\ddd\kkk\ 300 c:\aaa\bbb\ccc\ 100 c:\aaa\bbb\ccc\ 100 c:\aaa\bbb\ccc\ddd\ 100
SELECT SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2, SUM(MBYTES) MBYTES FROM DIR_SIZE, (SELECT LEVEL L FROM DUAL CONNECT BY LEVEL<=20) C WHERE SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL GROUP BY SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) ORDER BY SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)); DIR_NAME2 MBYTES -------------------- ---------- c:\ 1600 c:\aaa\ 400 c:\aaa\bbb\ 300 c:\aaa\bbb\ccc\ 200 c:\aaa\bbb\ccc\ddd\ 100 c:\ddd\ 1100 c:\ddd\kkk\ 300

Recent Comments