December 26, 2009
A couple years ago the following question appeared on the comp.databases.oracle.misc Usenet group:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/9af4117466316d9a
I have to following problem:
I get X rows from a statement, these are sorted by a certain column, let’s say a numerical value.
Now I want to calculate the average of this numerical value, but the 10% with the lowest and the 10% with the highest value shall not be included in this calculation. So for example, if I get 20 rows, I need the average of the value in rows 3 to 18.Currently I solved this with a very complicated statement, but I don’t know the built-in Oracle mathematical functions so I hope that there could be a way to do this with a better performance.
Let’s set up a short experiment:
CREATE TABLE T1 (C1 NUMBER(4)); INSERT INTO T1 SELECT ROWNUM FROM DUAL CONNECT BY LEVEL<=20;
We now have a table with 20 rows with numbers between 1 and 20.
Assuming that you are running a version of Oracle that supports analytical functions, the following returns the twenty rows with the relative ranking of each row, if the rows are sorted by C1 in descending order:
SELECT C1, DENSE_RANK() OVER (ORDER BY C1 DESC) DR, COUNT(C1) OVER (PARTITION BY 1) R FROM T1; C1 DR R ---------- ---------- ---------- 20 1 20 19 2 20 18 3 20 17 4 20 16 5 20 15 6 20 14 7 20 13 8 20 12 9 20 11 10 20 10 11 20 9 12 20 8 13 20 7 14 20 6 15 20 5 16 20 4 17 20 3 18 20 2 19 20 1 20 20
A slight modification of the above, dividing the value of DENSE_RANK by the value of COUNT, and also including a PERCENT_RANK for comparison:
SELECT C1, DENSE_RANK() OVER (ORDER BY C1 DESC) DR, (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP, PERCENT_RANK() OVER (ORDER BY C1 DESC) PR FROM T1; C1 DR DRP PR ---------- ---------- ---------- ---------- 20 1 .05 0 19 2 .1 .052631579 18 3 .15 .105263158 17 4 .2 .157894737 16 5 .25 .210526316 15 6 .3 .263157895 14 7 .35 .315789474 13 8 .4 .368421053 12 9 .45 .421052632 11 10 .5 .473684211 10 11 .55 .526315789 9 12 .6 .578947368 8 13 .65 .631578947 7 14 .7 .684210526 6 15 .75 .736842105 5 16 .8 .789473684 4 17 .85 .842105263 3 18 .9 .894736842 2 19 .95 .947368421 1 20 1 1
The final cleanup is performed when the above is slid into an inline view, by using a WHERE clause:
SELECT SUM(T.C1) S FROM (SELECT C1, DENSE_RANK() OVER (ORDER BY C1 DESC) DR, (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP, PERCENT_RANK() OVER (ORDER BY C1 DESC) PR FROM T1) T WHERE T.DRP>0.1 AND T.DRP<=0.9;
S
———-
168
A version that uses the PERCENT_RANK value:
SELECT SUM(T.C1) S FROM (SELECT C1, DENSE_RANK() OVER (ORDER BY C1 DESC) DR, (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP, PERCENT_RANK() OVER (ORDER BY C1 DESC) PR FROM T1) T WHERE T.PR BETWEEN 0.1 AND 0.9;
S
———-
168
who to use DENSE_RANK() for the following table
Considering
COLA COLB RANK
1 91 1
2 92 1
3 93 6
3 94 5
3 95 4
3 96 3
3 97 2
3 98 1
4 99 2
4 100 1
5 101 1
Now i want to display just those columns which as RANK=1, there should not be rank 2 or 3
that is result should be like this
COLA COLB RANK
1 91 1
2 92 1
5 101 1
thanks in adv.
You will need to put your current SQL statement into an inline view, and then add a WHERE clause outside/after the inline view. There is an example of this approach on this web page, in the second and third SELECT statements. The second SELECT statement begins like this:
I then put that SELECT statement into an inline view and added a WHERE clause outside/after the inline view. The third SELECT statement begins like this: