Nth percentile calculations in PostgreSQL

野趣味 2020-12-06 09:22

I've been surprisingly unable to find an nth percentile function for PostgreSQL.

I am using this via the Mondrian OLAP tool, so I just need an aggregate function which r

2 answers
  • 2020-12-06 09:58

    The ntile function is very useful here. I have a table test_temp:

    select * from test_temp
    
    score
    integer
    3
    5
    2
    10
    4
    8
    7
    12
    
    select score, ntile(4) over (order by score) as quartile from test_temp;
    
    score    quartile
    integer  integer
    2        1
    3        1
    4        2
    5        2
    7        3
    8        3
    10       4
    12       4
    

    ntile(4) over (order by score) orders the rows by score, splits them into four groups of equal size (or as close to equal as possible when the row count does not divide evenly), and assigns each row its group number in that order.

    Since I have 8 numbers here, they represent the 0th, 12.5th, 25th, 37.5th, 50th, 62.5th, 75th and 87.5th percentiles. So if I only take the results where the quartile is 2, I'll have the 25th and 37.5th percentiles.

    with ranked_test as (
        select score, ntile(4) over (order by score) as quartile from test_temp
    )
    select min(score) from ranked_test
    where quartile = 2
    group by quartile;
    

    returns 4, the third lowest number in the list of 8.

    If you had a larger table and used ntile(100), the column you filter on would be the percentile, and you could use the same query as above.
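
    As a rough sketch of that idea, assume the same test_temp table but with at least 100 rows. With fewer rows than buckets, ntile leaves the higher buckets empty and the query below returns NULL. The column alias percentile and the result name p90 are illustrative only:

    ```sql
    -- Sketch only: assumes test_temp(score integer) holds at least 100 rows.
    WITH ranked_test AS (
        SELECT score,
               ntile(100) OVER (ORDER BY score) AS percentile
        FROM test_temp
    )
    -- Lowest score that falls into the 90th of the 100 buckets.
    SELECT min(score) AS p90
    FROM ranked_test
    WHERE percentile = 90;
    ```

    Because ntile only assigns bucket numbers, the result is always an actual value from the table rather than an interpolated one.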

  • 2020-12-06 10:06

    Since PostgreSQL 9.4 there is native support for percentiles, implemented as ordered-set aggregate functions:

    percentile_cont(fraction) WITHIN GROUP (ORDER BY sort_expression) 
    

    continuous percentile: returns a value corresponding to the specified fraction in the ordering, interpolating between adjacent input items if needed

    percentile_cont(fractions) WITHIN GROUP (ORDER BY sort_expression)
    

    multiple continuous percentile: returns an array of results matching the shape of the fractions parameter, with each non-null element replaced by the value corresponding to that percentile
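
    To make the two forms concrete, here is a minimal sketch. It assumes PostgreSQL 9.4 or later and borrows the eight-row test_temp table (scores 2, 3, 4, 5, 7, 8, 10, 12) from the other answer, which is an assumption made only for illustration:

    ```sql
    -- Single fraction: the median, interpolated between the 4th and 5th values.
    SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY score) AS median
    FROM test_temp;
    -- median = 6 (halfway between 5 and 7)

    -- Array of fractions: one result per requested percentile.
    SELECT percentile_cont(ARRAY[0.25, 0.5, 0.75])
               WITHIN GROUP (ORDER BY score) AS quartiles
    FROM test_temp;
    -- quartiles = {3.75,6,8.5}
    ```

    Since these are ordinary aggregates, they can also be combined with GROUP BY to get one percentile per group.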

    See the documentation for more details: http://www.postgresql.org/docs/current/static/functions-aggregate.html

    and see here for some examples: https://github.com/michaelpq/michaelpq.github.io/blob/master/_posts/2014-02-27-postgres-9-4-feature-highlight-within-group.markdown

    CREATE TABLE aa AS SELECT generate_series(1,20) AS a;
    --SELECT 20
    
    WITH subset AS (
        SELECT a AS val,
            ntile(4) OVER (ORDER BY a) AS tile
        FROM aa
    )
    SELECT tile, max(val)
    FROM subset GROUP BY tile ORDER BY tile;
    
     tile | max
    ------+-----
        1 |   5
        2 |  10
        3 |  15
        4 |  20
    (4 rows)
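
    For comparison, the same boundaries can be obtained in a single aggregate call. The sketch below uses percentile_disc, the discrete sibling of percentile_cont that was also introduced in 9.4; it is not mentioned above and is shown here only as one possible alternative, run against the same aa table:

    ```sql
    -- percentile_disc returns actual input values instead of interpolating.
    SELECT percentile_disc(ARRAY[0.25, 0.5, 0.75, 1.0])
               WITHIN GROUP (ORDER BY a) AS tile_max
    FROM aa;
    -- tile_max = {5,10,15,20}, matching the max of each ntile(4) bucket
    ```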
    