SQL – DENSE_RANK, PERCENT_RANK, and COUNT Analytical Functions

26 12 2009

December 26, 2009

A couple years ago the following question appeared on the comp.databases.oracle.misc Usenet group:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/9af4117466316d9a

I have to following problem:

I get X rows from a statement, these are sorted by a certain column, let’s say a numerical value.
Now I want to calculate the average of this numerical value, but the 10% with the lowest and the 10% with the highest value shall not be included in this calculation. So for example, if I get 20 rows, I need the average of the value in rows 3 to 18.

Currently I solved this with a very complicated statement, but I don’t know the built-in Oracle mathematical functions so I hope that there could be a way to do this with a better performance.

Let’s set up a short experiment:

```CREATE TABLE T1 (C1 NUMBER(4));

INSERT INTO T1
SELECT
ROWNUM
FROM
DUAL
CONNECT BY
LEVEL<=20;```

We now have a table with 20 rows with numbers between 1 and 20.

Assuming that you are running a version of Oracle that supports analytical functions, the following returns the twenty rows with the relative ranking of each row, if the rows are sorted by C1 in descending order:

```SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
COUNT(C1) OVER (PARTITION BY 1) R
FROM
T1;

C1         DR          R
---------- ---------- ----------
20          1         20
19          2         20
18          3         20
17          4         20
16          5         20
15          6         20
14          7         20
13          8         20
12          9         20
11         10         20
10         11         20
9         12         20
8         13         20
7         14         20
6         15         20
5         16         20
4         17         20
3         18         20
2         19         20
1         20         20```

A slight modification of the above, dividing the value of DENSE_RANK by the value of COUNT, and also including a PERCENT_RANK for comparison:

```SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
(DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
T1;

C1         DR        DRP         PR
---------- ---------- ---------- ----------
20          1        .05          0
19          2         .1 .052631579
18          3        .15 .105263158
17          4         .2 .157894737
16          5        .25 .210526316
15          6         .3 .263157895
14          7        .35 .315789474
13          8         .4 .368421053
12          9        .45 .421052632
11         10         .5 .473684211
10         11        .55 .526315789
9         12         .6 .578947368
8         13        .65 .631578947
7         14         .7 .684210526
6         15        .75 .736842105
5         16         .8 .789473684
4         17        .85 .842105263
3         18         .9 .894736842
2         19        .95 .947368421
1         20          1          1```

The final cleanup is performed when the above is slid into an inline view, by using a WHERE clause:

```SELECT
SUM(T.C1) S
FROM
(SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
(DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
T1) T
WHERE
T.DRP>0.1
AND T.DRP<=0.9;```

S
———-
168

A version that uses the PERCENT_RANK value:

```SELECT
SUM(T.C1) S
FROM
(SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
(DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
T1) T
WHERE
T.PR BETWEEN 0.1 AND 0.9;```

S
———-
168

2 responses

3 01 2014

who to use DENSE_RANK() for the following table
Considering
COLA COLB RANK
1 91 1
2 92 1
3 93 6
3 94 5
3 95 4
3 96 3
3 97 2
3 98 1
4 99 2
4 100 1
5 101 1

Now i want to display just those columns which as RANK=1, there should not be rank 2 or 3
that is result should be like this
COLA COLB RANK
1 91 1
2 92 1
5 101 1

thanks in adv.

3 01 2014

You will need to put your current SQL statement into an inline view, and then add a WHERE clause outside/after the inline view. There is an example of this approach on this web page, in the second and third SELECT statements. The second SELECT statement begins like this:

```SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
```

I then put that SELECT statement into an inline view and added a WHERE clause outside/after the inline view. The third SELECT statement begins like this:

```SELECT
SUM(T.C1) S
FROM
(SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
```