Count distinct in window functions

Submitted by 天涯浪子 on 2021-02-10 20:26:49

Question


I am trying to count the distinct values of column b for each c without using group by. I know this can be done with a join. How can I compute count(distinct b) over (partition by c) without resorting to a join? Why is count distinct not supported in window functions? Thank you in advance. Given this data frame:

// Assumes a SparkSession named spark is in scope, as in spark-shell.
import spark.implicits._

val df = Seq(
  ("a1", "b1", "c1"),
  ("a2", "b2", "c1"),
  ("a3", "b3", "c1"),
  ("a31", null, "c1"),
  ("a32", null, "c1"),
  ("a4", "b4", "c11"),
  ("a5", "b5", "c11"),
  ("a6", "b6", "c11"),
  ("a7", "b1", "c2"),
  ("a8", "b1", "c3"),
  ("a9", "b1", "c4"),
  ("a91", "b1", "c5"),
  ("a92", "b1", "c5"),
  ("a93", "b1", "c5"),
  ("a95", "b2", "c6"),
  ("a96", "b2", "c6"),
  ("a97", "b1", "c6"),
  ("a977", null, "c6"),
  ("a98", null, "c8"),
  ("a99", null, "c8"),
  ("a999", null, "c8")
).toDF("a", "b", "c")

Answer 1:


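Some databases do support count(distinct) as a window function. Where it is not available, there are two alternatives. One is the sum of dense ranks. Because dense_rank() ranks NULL as a value of its own, the last term subtracts 1 for partitions in which b contains a NULL, so that the result matches count(distinct b):

select (dense_rank() over (partition by c order by b asc) +
        dense_rank() over (partition by c order by b desc) -
        1 -
        max(case when b is null then 1 else 0 end) over (partition by c)
       ) as count_distinct
from t;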
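
To see why adding the ascending and descending dense ranks works, here is a minimal plain-Scala sketch for one partition (c = "c1" from the question's data). The helper denseRanks is hypothetical and only models how dense_rank() behaves, with None standing in for NULL. Dense ranks count NULL as a value of its own, so the sketch subtracts one when the partition contains a NULL in order to match count(distinct b).

```scala
// Hypothetical helper: dense rank of each value within one partition.
// None (SQL NULL) is ranked as a value of its own, as dense_rank() does.
def denseRanks(values: Seq[Option[String]], descending: Boolean): Seq[Int] = {
  val ordering =
    if (descending) Ordering[Option[String]].reverse else Ordering[Option[String]]
  val rankOf = values.distinct
    .sorted(ordering)
    .zipWithIndex
    .map { case (v, i) => v -> (i + 1) }
    .toMap
  values.map(rankOf)
}

// Column b for partition c = "c1" in the question's data.
val c1: Seq[Option[String]] = Seq(Some("b1"), Some("b2"), Some("b3"), None, None)

val asc = denseRanks(c1, descending = false)
val desc = denseRanks(c1, descending = true)

// 1 if the partition contains a NULL, which count(distinct b) would ignore.
val nullAdjustment = if (c1.contains(None)) 1 else 0

// asc + desc - 1 is the number of distinct ranks; it is the same on every row.
val countDistinct = asc.zip(desc).map { case (a, d) => a + d - 1 - nullAdjustment }
```

For a value at ascending rank r among n distinct values, the descending rank is n - r + 1, so the two ranks always add up to n + 1 regardless of the row.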

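The second uses a subquery that numbers the rows within each (c, b) group and then counts only the first row of each group whose b is not NULL:

select sum(case when seqnum = 1 and b is not null then 1 else 0 end)
         over (partition by c) as count_distinct
from (select t.*, row_number() over (partition by c, b order by b) as seqnum
      from t
     ) t;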
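
The first-row idea can be traced by hand on a small partition. The plain-Scala sketch below models it for c = "c6" from the question's data, assuming the row number is computed within each (c, b) group and that rows with a NULL b (None here) should not be counted. The Rec type and the value names are made up for illustration.

```scala
// Hypothetical record type for the sketch; None stands in for SQL NULL.
final case class Rec(a: String, b: Option[String], c: String)

// Partition c = "c6" from the question's data.
val rows = Seq(
  Rec("a95", Some("b2"), "c6"),
  Rec("a96", Some("b2"), "c6"),
  Rec("a97", Some("b1"), "c6"),
  Rec("a977", None, "c6")
)

// Models row_number() over (partition by c, b): position within each (c, b) group.
val seqnum: Seq[Int] = rows.zipWithIndex.map { case (r, i) =>
  rows.take(i).count(p => p.c == r.c && p.b == r.b) + 1
}

// 1 for the first row of each non-NULL (c, b) group, 0 otherwise.
val flag: Seq[Int] = rows.zip(seqnum).map { case (r, n) =>
  if (n == 1 && r.b.isDefined) 1 else 0
}

// Models sum(flag) over (partition by c): every row sees its partition's total.
val result: Seq[(String, Int)] = rows.map { r =>
  val total = rows.zip(flag).collect { case (p, f) if p.c == r.c => f }.sum
  (r.a, total)
}
```

If the rows were numbered within c alone, only one row per partition would have seqnum = 1 and every total would be 1, which is why the grouping includes b.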



Answer 2:


count of unique column b for each c without doing group by.

A typical SQL workaround is to use a subquery that selects distinct tuples, and then a window count in the outer query. Note that the result has one row per distinct (b, c) pair rather than one row per original row:

-- COUNT(b) rather than COUNT(*), so that a NULL b is not counted
SELECT c, COUNT(b) OVER(PARTITION BY c) cnt
FROM (SELECT DISTINCT b, c FROM mytable) x
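
Since the question starts from a Spark data frame, the same idea could be written with the DataFrame API roughly as follows. This is a sketch, not a tested recipe: it assumes Spark is on the classpath, creates its own local session (in spark-shell one already exists as spark), and uses a small made-up subset of the question's rows named sample. It counts column b rather than rows, so a NULL b is not counted, matching count(distinct b).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count}

// Local session for the sketch only; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("count-distinct-window")
  .getOrCreate()
import spark.implicits._

// A subset of the question's rows (partitions c6 and c8).
val sample = Seq(
  ("a95", "b2", "c6"),
  ("a96", "b2", "c6"),
  ("a97", "b1", "c6"),
  ("a977", null, "c6"),
  ("a98", null, "c8")
).toDF("a", "b", "c")

// SELECT DISTINCT b, c
val distinctPairs = sample.select("b", "c").distinct()

// COUNT(b) OVER (PARTITION BY c); count ignores NULL values of b.
val counted = distinctPairs.withColumn(
  "cnt",
  count(col("b")).over(Window.partitionBy("c"))
)

counted.show()
```

As with the SQL version, the output has one row per distinct (b, c) pair, so column a is no longer available.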


Source: https://stackoverflow.com/questions/58349076/count-distinct-in-window-functions
