Select count distinct using Pig Latin

佛祖请我去吃肉 2021-02-13 01:53

I need help with this Pig script. I am getting only a single record. I am selecting 2 columns and doing a count(distinct) on a third, while also using a WHERE ... LIKE style clause to find

3 answers
  •  渐次进展
    2021-02-13 02:37

    It is cleaner to define this as a macro:

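    -- A: input relation, c: field to count distinct values of.
    DEFINE DISTINCT_COUNT(A, c) RETURNS dist {
      -- Project only the field of interest.
      temp = FOREACH $A GENERATE $c;
      -- 'dist' (no $) is a macro-local alias; '$dist' below is the returned relation.
      dist = DISTINCT temp;
      -- Collapse everything into one group so COUNT sees all distinct rows.
      groupAll = GROUP dist ALL;
      $dist = FOREACH groupAll GENERATE COUNT(dist);
    }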

    Usage:

    X = LOAD 'data' AS (x: int);

    Y = DISTINCT_COUNT(X, x);

    If you need to use it inside a FOREACH instead, the easiest way is something like:

    ...GENERATE COUNT(Distinct(x))...

    Tested on Pig 0.12.
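
    For the per-group case described in the question (two selected columns, a distinct count on a third, and a LIKE-style filter), a nested FOREACH with DISTINCT is one way it might look. This is only a sketch: the file name, the field names a, b, c, and the pattern are placeholders, since the original script is not shown.

    ```pig
    -- Hypothetical input; schema and path are assumptions for illustration.
    rows = LOAD 'data' AS (a: chararray, b: chararray, c: chararray);

    -- Pig has no LIKE operator. MATCHES takes a Java regex that must match
    -- the whole string, so '.*foo.*' plays the role of LIKE '%foo%'.
    filtered = FILTER rows BY a MATCHES '.*foo.*';

    -- One group per (a, b) pair, i.e. the two selected columns.
    grouped = GROUP filtered BY (a, b);

    -- Inside each group, deduplicate c and count what is left.
    result = FOREACH grouped {
        uniq = DISTINCT filtered.c;
        GENERATE FLATTEN(group) AS (a, b), COUNT(uniq) AS distinct_c;
    };
    ```

    Grouping with ALL, as the macro does, always yields a single output row, so getting one record back is expected with that form. Grouping by the selected columns gives one row per (a, b) pair.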
