SQL – DENSE_RANK, PERCENT_RANK, and COUNT Analytical Functions

December 26, 2009

A couple of years ago the following question appeared in the comp.databases.oracle.misc Usenet group:
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/9af4117466316d9a

I have the following problem:

I get X rows from a statement, these are sorted by a certain column, let’s say a numerical value.
Now I want to calculate the average of this numerical value, but the 10% with the lowest and the 10% with the highest value shall not be included in this calculation. So for example, if I get 20 rows, I need the average of the value in rows 3 to 18.

Currently I solved this with a very complicated statement, but I don’t know the built-in Oracle mathematical functions so I hope that there could be a way to do this with a better performance.

Let’s set up a short experiment:

CREATE TABLE T1 (C1 NUMBER(4));

INSERT INTO T1
SELECT
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=20;

We now have a table with 20 rows containing the integers 1 through 20.

Assuming that you are running a version of Oracle that supports analytic functions, the following returns the twenty rows with the rank of each row when the rows are sorted by C1 in descending order (column DR), along with the total number of rows (column R):

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  COUNT(C1) OVER (PARTITION BY 1) R
FROM
  T1;

        C1         DR          R
---------- ---------- ----------
        20          1         20
        19          2         20
        18          3         20
        17          4         20
        16          5         20
        15          6         20
        14          7         20
        13          8         20
        12          9         20
        11         10         20
        10         11         20
         9         12         20
         8         13         20
         7         14         20
         6         15         20
         5         16         20
         4         17         20
         3         18         20
         2         19         20
         1         20         20
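The PARTITION BY 1 clause places every row in the same partition, so R is simply the total row count. As a minimal sketch of an equivalent form, an empty OVER () clause also treats the entire result set as a single partition. The column aliases R_NON_NULL and R_ALL are made up for this illustration. COUNT(C1) counts only the rows where C1 is not NULL, while COUNT(*) counts every row; with the 20 non-NULL rows in T1 both columns should show 20:

```sql
SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  -- Counts only rows where C1 is not NULL (same result as PARTITION BY 1)
  COUNT(C1) OVER () R_NON_NULL,
  -- Counts every row, including any with a NULL in C1
  COUNT(*) OVER () R_ALL
FROM
  T1;
```

If C1 could contain NULL values, the choice between the two counts changes the denominator used in the percentage calculations that follow.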

A slight modification of the above divides the value of DENSE_RANK by the value of COUNT (column DRP), and also includes PERCENT_RANK (column PR) for comparison:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
  (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
  PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
  T1;

        C1         DR        DRP         PR
---------- ---------- ---------- ----------
        20          1        .05          0
        19          2         .1 .052631579
        18          3        .15 .105263158
        17          4         .2 .157894737
        16          5        .25 .210526316
        15          6         .3 .263157895
        14          7        .35 .315789474
        13          8         .4 .368421053
        12          9        .45 .421052632
        11         10         .5 .473684211
        10         11        .55 .526315789
         9         12         .6 .578947368
         8         13        .65 .631578947
         7         14         .7 .684210526
         6         15        .75 .736842105
         5         16         .8 .789473684
         4         17        .85 .842105263
         3         18         .9 .894736842
         2         19        .95 .947368421
         1         20          1          1
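The DRP and PR columns differ because PERCENT_RANK is calculated as (rank - 1) / (number of rows - 1), so it starts at 0, while DRP starts at 1/20 = .05. The following sketch reproduces PERCENT_RANK by hand to make the formula visible. The alias PR_MANUAL is made up for this illustration, and the query assumes that the table contains more than one row (with a single row the hand calculation would divide by zero):

```sql
SELECT
  C1,
  PERCENT_RANK() OVER (ORDER BY C1 DESC) PR,
  -- Hand calculation: (rank - 1) / (row count - 1)
  (RANK() OVER (ORDER BY C1 DESC) - 1) / (COUNT(*) OVER () - 1) PR_MANUAL
FROM
  T1;
```

For the row with C1 = 19, the rank is 2, so the hand calculation gives 1/19, which matches the .052631579 shown in the PR column above.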

The final step places the above in an inline view and uses a WHERE clause to discard the rows in the top 10% and the bottom 10%. The query below sums the remaining values:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
    PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
  FROM
    T1) T
WHERE
  T.DRP>0.1
  AND T.DRP<=0.9;

         S
----------
       168
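The original poster asked for the average rather than the sum. As a sketch, the same inline view can be used with AVG in place of SUM (the alias A is made up for this illustration, and the unused DR and PR columns are left out of the inline view):

```sql
SELECT
  AVG(T.C1) A
FROM
  (SELECT
    C1,
    -- Rank divided by the total row count, as in the DRP column above
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP
  FROM
    T1) T
WHERE
  T.DRP>0.1
  AND T.DRP<=0.9;
```

With the 20 rows in T1, the 16 remaining values (3 through 18) sum to 168, so the average should be 168/16 = 10.5.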

A version that uses the PERCENT_RANK value:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
    (DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY 1)) DRP,
    PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
  FROM
    T1) T
WHERE
  T.PR BETWEEN 0.1 AND 0.9;

         S
----------
       168
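Both versions depend on how tied values are ranked. If C1 contained duplicate values, DENSE_RANK would assign the same rank to the duplicates and its highest value would be less than the row count, so DRP would never reach 1 and the 0.1 and 0.9 cutoffs would no longer correspond to 10% of the rows. The following sketch swaps in ROW_NUMBER, which is not used in the queries above, so that every row receives a unique position and the filter matches the "rows 3 to 18" of 20 described in the question. The aliases RN, CNT, and A are made up for this illustration:

```sql
SELECT
  AVG(T.C1) A,
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    -- Unique position for every row, even when C1 values are tied
    ROW_NUMBER() OVER (ORDER BY C1 DESC) RN,
    -- Total number of rows in the result set
    COUNT(*) OVER () CNT
  FROM
    T1) T
WHERE
  T.RN>0.1*T.CNT
  AND T.RN<=0.9*T.CNT;
```

With the test table, the filter keeps RN values 3 through 18, which again gives a sum of 168 and an average of 10.5. The position assigned to tied rows is arbitrary, but because tied rows hold the same value, the sum and average are not affected by which of them is discarded.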

Categories : Analytic Functions, SQL

2 responses

3 01 2014

Irshad (06:24:25) :

How do I use DENSE_RANK() with the following table?
Consider:
COLA COLB RANK
1 91 1
2 92 1
3 93 6
3 94 5
3 95 4
3 96 3
3 97 2
3 98 1
4 99 2
4 100 1
5 101 1

Now I want to display just those rows which have RANK=1, where there is no rank 2 or 3 for the same COLA value;
that is, the result should be like this:
COLA COLB RANK
1 91 1
2 92 1
5 101 1

thanks in adv.

3 01 2014

Charles Hooper (08:17:35) :

You will need to put your current SQL statement into an inline view, and then add a WHERE clause outside/after the inline view. There is an example of this approach on this web page, in the second and third SELECT statements. The second SELECT statement begins like this:

SELECT
  C1,
  DENSE_RANK() OVER (ORDER BY C1 DESC) DR,

I then put that SELECT statement into an inline view and added a WHERE clause outside/after the inline view. The third SELECT statement begins like this:

SELECT
  SUM(T.C1) S
FROM
  (SELECT
    C1,
    DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
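For the specific table in the question, one possible reading is that only the COLA values whose highest rank is 1 should be returned. A minimal sketch of the inline view approach under that reading follows. The table name T2 and the aliases RNK and MAX_RNK are hypothetical, and the sketch assumes that the RANK column in the question was produced by DENSE_RANK partitioned by COLA and ordered by COLB in descending order, which is consistent with the sample data. Because one analytic function cannot be nested directly inside another, the rank is calculated in the innermost inline view and the per-COLA maximum in the next level:

```sql
SELECT
  COLA,
  COLB,
  RNK
FROM
  (SELECT
    COLA,
    COLB,
    RNK,
    -- Highest rank found within each COLA group
    MAX(RNK) OVER (PARTITION BY COLA) MAX_RNK
  FROM
    (SELECT
      COLA,
      COLB,
      -- Rank within each COLA group, highest COLB first
      DENSE_RANK() OVER (PARTITION BY COLA ORDER BY COLB DESC) RNK
    FROM
      T2))
WHERE
  MAX_RNK=1
ORDER BY
  COLA;
```

With the sample data from the question, the COLA values 3 and 4 have ranks above 1 and are filtered out, leaving the rows for COLA values 1, 2, and 5.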

Charles Hooper's Oracle Notes