How to output the 25th, 50th, and 75th percentiles in a single Teradata query?

Posted by 。_饼干妹妹 on 2019-12-11 08:38:08

Question


A few hours ago I got stuck on something similar and worked out a less messy query that outputs the 25th, 50th, and 75th percentiles in a single Teradata query. It can be extended to produce a five-number summary. For the minimum and maximum, adjust the static values (0.01 and 0.99 in the query) according to your population estimate.

Someone had asked elsewhere for an elegant approach, so I am sharing mine.

Here's the code:

SELECT MAX(PER_MIN) AS PER_MIN,
       MAX(PER_25) AS PER_25,
       MAX(PER_50)  AS PER_50,
       MAX(PER_75)  AS PER_75,
       MAX(PER_MAX) AS PER_MAX
FROM (SELECT CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.01 AS INT) THEN DURATION_MACRO_CURR END AS PER_MIN,
             CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.25 AS INT) THEN DURATION_MACRO_CURR END AS PER_25,
             CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.50 AS INT) THEN DURATION_MACRO_CURR END AS PER_50,
             CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.75 AS INT) THEN DURATION_MACRO_CURR END AS PER_75,
             CASE WHEN ROW_NUMBER() OVER(ORDER BY DURATION_MACRO_CURR ASC) = CAST(COUNT(*) OVER() * 0.99 AS INT) THEN DURATION_MACRO_CURR END AS PER_MAX
      FROM PROD_EXP_DL_CVM.PROD_CVM
      WHERE PW_END_DATE =  '2016-10-18'
    ) BASE
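To see which row each CASE branch selects, the rank arithmetic can be emulated outside the database. The following is a minimal Python sketch, not Teradata code: `target_rank` and `pick` are hypothetical helper names, the duration values are made up, and `int()` stands in for `CAST(... AS INT)` on the assumption that the cast truncates the fractional part. It also shows one edge case: with a small population, `COUNT(*) * 0.01` truncates to 0, no `ROW_NUMBER()` equals it, and `PER_MIN` would come back NULL.

```python
from decimal import Decimal


def target_rank(count, fraction):
    """Emulate CAST(COUNT(*) OVER() * fraction AS INT), assuming truncation."""
    return int(Decimal(count) * Decimal(fraction))


def pick(values, fraction):
    """Return the value whose 1-based rank equals the target rank, else None (NULL)."""
    ordered = sorted(values)
    rank = target_rank(len(ordered), fraction)
    # ROW_NUMBER() starts at 1, so a target rank of 0 matches no row.
    return ordered[rank - 1] if 1 <= rank <= len(ordered) else None


# 200 hypothetical duration values: 1, 2, ..., 200
durations = list(range(1, 201))
summary = {f: pick(durations, f) for f in ("0.01", "0.25", "0.50", "0.75", "0.99")}

# Only 50 rows: 50 * 0.01 = 0.5 truncates to rank 0, so nothing matches.
small = list(range(1, 51))
small_min = pick(small, "0.01")
```

If this edge case matters for your data, a true minimum and maximum via MIN() and MAX() avoids depending on the population size.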

Here's the desired output:


Answer 1:


I would do this using conditional aggregation:

select min(DURATION_MACRO_CURR) as min_val,
       min(case when seqnum / 0.25 >= cnt then DURATION_MACRO_CURR end) as percentile_25,
       min(case when seqnum / 0.50 >= cnt then DURATION_MACRO_CURR end) as percentile_50,
       min(case when seqnum / 0.75 >= cnt then DURATION_MACRO_CURR end) as percentile_75,
       max(DURATION_MACRO_CURR) as max_val
from (select pc.*,
             row_number() over (order by DURATION_MACRO_CURR) as seqnum,
             count(*) over () as cnt
      from PROD_EXP_DL_CVM.PROD_CVM pc
      where pc.PW_END_DATE =  '2016-10-18'
     ) pc;
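The conditional aggregation keeps, for each fraction p, every row whose `seqnum / p >= cnt` and then takes the smallest value among them, which is the value at the first rank that reaches p * cnt. Here is a minimal Python sketch of that logic, offered as an emulation and not as Teradata code; `percentile_by_rank` and the sample values are hypothetical.

```python
from fractions import Fraction


def percentile_by_rank(values, fraction):
    """Smallest value whose 1-based rank satisfies seqnum / fraction >= cnt."""
    ordered = sorted(values)
    cnt = len(ordered)
    p = Fraction(fraction)  # exact arithmetic, like a SQL DECIMAL literal
    qualifying = [
        value
        for seqnum, value in enumerate(ordered, start=1)
        if seqnum / p >= cnt
    ]
    return min(qualifying) if qualifying else None


# Ten hypothetical duration values: 10, 20, ..., 100
sample = list(range(10, 101, 10))
p25 = percentile_by_rank(sample, "0.25")  # first rank >= 2.5 is 3
p50 = percentile_by_rank(sample, "0.50")  # first rank >= 5 is 5
p75 = percentile_by_rank(sample, "0.75")  # first rank >= 7.5 is 8
```

Because the last row always satisfies the condition when p is at most 1, this form returns a value for any non-empty input, whereas an exact rank match can miss on small populations.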


Source: https://stackoverflow.com/questions/41703865/how-to-output-different-25th-50th-75th-percentiles-in-single-teradata-query
