SQL – DENSE_RANK, PERCENT_RANK, and COUNT Analytical Functions

26 12 2009

December 26, 2009

A couple years ago the following question appeared on the comp.databases.oracle.misc Usenet group:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/9af4117466316d9a

I have to following problem:

I get X rows from a statement, these are sorted by a certain column, let’s say a numerical value.
Now I want to calculate the average of this numerical value, but the 10% with the lowest and the 10% with the highest value shall not be included in this calculation. So for example, if I get 20 rows, I need the average of the value in rows 3 to 18.

Currently I solved this with a very complicated statement, but I don’t know the built-in Oracle mathematical functions so I hope that there could be a way to do this with a better performance.

Let’s set up a short experiment:

CREATE TABLE T1 (C1 NUMBER(4));

INSERT INTO T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=20;

We now have a table with 20 rows with numbers between 1 and 20.

Assuming that you are running a version of Oracle that supports analytical functions, the following returns the twenty rows with the relative ranking of each row, if the rows are sorted by C1 in descending order:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  COUNT(C1) OVER (PARTITION BY 1) R
FROM
  T1;

        C1         DR          R
---------- ---------- ----------
        20          1         20
        19          2         20
        18          3         20
        17          4         20
        16          5         20
        15          6         20
        14          7         20
        13          8         20
        12          9         20
        11         10         20
        10         11         20
         9         12         20
         8         13         20
         7         14         20
         6         15         20
         5         16         20
         4         17         20
         3         18         20
         2         19         20
         1         20         20

A slight modification of the above, dividing the value of DENSE_RANK by the value of COUNT, and also including a PERCENT_RANK for comparison:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
  PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
  T1;

        C1         DR        DRP         PR
---------- ---------- ---------- ----------
        20          1        .05          0
        19          2         .1 .052631579
        18          3        .15 .105263158
        17          4         .2 .157894737
        16          5        .25 .210526316
        15          6         .3 .263157895
        14          7        .35 .315789474
        13          8         .4 .368421053
        12          9        .45 .421052632
        11         10         .5 .473684211
        10         11        .55 .526315789
         9         12         .6 .578947368
         8         13        .65 .631578947
         7         14         .7 .684210526
         6         15        .75 .736842105
         5         16         .8 .789473684
         4         17        .85 .842105263
         3         18         .9 .894736842
         2         19        .95 .947368421
         1         20          1          1

The final cleanup is performed when the above is slid into an inline view, by using a WHERE clause:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
    PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
  FROM
    T1) T
WHERE
  T.DRP>0.1
  AND T.DRP<=0.9;

         S
———-
       168

A version that uses the PERCENT_RANK value:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
    PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
  FROM
    T1) T
WHERE
  T.PR BETWEEN 0.1 AND 0.9;

         S
———-
       168


Actions

Information

2 responses

3 01 2014
Irshad

who to use DENSE_RANK() for the following table
Considering
COLA COLB RANK
1 91 1
2 92 1
3 93 6
3 94 5
3 95 4
3 96 3
3 97 2
3 98 1
4 99 2
4 100 1
5 101 1

Now i want to display just those columns which as RANK=1, there should not be rank 2 or 3
that is result should be like this
COLA COLB RANK
1 91 1
2 92 1
5 101 1

thanks in adv.

3 01 2014
Charles Hooper

You will need to put your current SQL statement into an inline view, and then add a WHERE clause outside/after the inline view. There is an example of this approach on this web page, in the second and third SELECT statements. The second SELECT statement begins like this:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,

I then put that SELECT statement into an inline view and added a WHERE clause outside/after the inline view. The third SELECT statement begins like this:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

Join 137 other followers

%d bloggers like this: