The difference between countDistinct and distinct.count

借酒劲吻你 2021-01-15 20:09

Why do I get different outputs for ..agg(countDistinct("member_id") as "count") and ..distinct.count? Is the difference the same as between

3 answers
  •  一整个雨季
    2021-01-15 21:05

    1st command:

    DF.agg(countDistinct("member_id") as "count")
    

    returns the same result as the SQL query select count(distinct member_id) from DF, that is, the number of distinct non-null values in the member_id column alone.

    2nd command:

    DF.distinct.count
    

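    first removes duplicate rows from DF, comparing every column, and then counts the rows that remain. The two results therefore differ whenever DF has more than one column and some rows share a member_id but differ in another column.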
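
To see the difference concretely, here is a minimal sketch, assuming a local Spark session and a made-up two-column DataFrame. The item column, the sample rows, and the names DistinctCountDemo and counts are illustrative and do not come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.countDistinct

object DistinctCountDemo {
  // Returns (countDistinct on member_id, distinct row count, distinct count after select).
  def counts(spark: SparkSession): (Long, Long, Long) = {
    import spark.implicits._

    // Hypothetical data: member "a" has three rows, two of them identical.
    val df = Seq(
      ("a", "book"),
      ("a", "book"),
      ("a", "pen"),
      ("b", "book")
    ).toDF("member_id", "item")

    // Distinct values of a single column: "a" and "b".
    val distinctMembers =
      df.agg(countDistinct("member_id") as "count").first().getLong(0)

    // Distinct whole rows: (a, book), (a, pen), (b, book).
    val distinctRows = df.distinct().count()

    // Projecting to the column first makes distinct operate on member_id only.
    val distinctAfterSelect = df.select("member_id").distinct().count()

    (distinctMembers, distinctRows, distinctAfterSelect)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("distinct-count-demo")
      .getOrCreate()
    try {
      val (members, rows, afterSelect) = counts(spark)
      println(s"countDistinct(member_id)          = $members")
      println(s"distinct.count                    = $rows")
      println(s"select(member_id).distinct.count  = $afterSelect")
    } finally {
      spark.stop()
    }
  }
}
```

With this sample data the first count is 2 and the second is 3. Selecting member_id before calling distinct brings the second form back to 2. One remaining difference: countDistinct skips null values, while distinct keeps a null member_id as a row of its own, so the two can still differ by one when the column contains nulls.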
