Adding a grouped, aggregate nunique column to pandas dataframe

暖寄归人 2020-12-19 12:27

I want to add an aggregate, grouped, nunique column to my pandas dataframe but not aggregate the entire dataframe. I'm trying to do this in one line and avoid creating a ne

1 Answer
  • 2020-12-19 13:05
    df.groupby(['track', 'type'])['id'].transform(nunique)
    

    The call above implies that there is a name nunique in the namespace that refers to some function. transform accepts either a function or a string that it knows how to map to a function, and 'nunique' is definitely one of those strings.
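A minimal sketch of that difference, reusing the toy frame from the demo. It assumes no variable called nunique has been defined anywhere, which is why the bare name fails before pandas is even involved.

```python
import pandas as pd

df = pd.DataFrame(dict(
    track=list('11112222'), type=list('AAAABBBB'), id=list('XXYZWWWW')))
grouped = df.groupby(['track', 'type'])['id']

# Bare name: Python tries to look up a variable called nunique and fails.
try:
    grouped.transform(nunique)  # noqa: F821
    bare_name_error = None
except NameError as exc:
    bare_name_error = exc

# String: pandas resolves 'nunique' to its own group-wise implementation.
by_string = grouped.transform('nunique')

# A real callable also works; here the Series method is called once per group.
by_callable = grouped.transform(pd.Series.nunique)
```

Both working forms give the same per-row counts; only the bare, undefined name raises.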

    As pointed out by @root, the methods pandas uses to perform the transformations named by these strings are often optimized and should generally be preferred to passing your own functions. This is true even for passing NumPy functions in some cases.

    For example, transform('sum') should be preferred over transform(sum).
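As a rough illustration of the two styles, the sketch below compares the string alias with a user-supplied function. The numeric column plays is hypothetical, added only because the demo's id column holds strings, and a lambda stands in for "your own function" because how pandas treats the built-in sum differs between versions.

```python
import pandas as pd

# 'plays' is a hypothetical numeric column used only for this illustration.
df = pd.DataFrame(dict(
    track=list('11112222'), type=list('AAAABBBB'),
    plays=[1, 2, 3, 4, 5, 6, 7, 8]))
grouped = df.groupby(['track', 'type'])['plays']

# String alias: dispatched to pandas' built-in group-wise sum.
by_string = grouped.transform('sum')

# User-supplied function: called in Python once for each group.
by_function = grouped.transform(lambda s: s.sum())
```

The results match; the difference is only in how the work is carried out, which tends to matter more as the number of groups grows.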

    Try this instead:

    df.groupby(['track', 'type'])['id'].transform('nunique')
    

    Demo:

    import pandas as pd

    df = pd.DataFrame(dict(
        track=list('11112222'), type=list('AAAABBBB'), id=list('XXYZWWWW')))
    print(df)
    
      id track type
    0  X     1    A
    1  X     1    A
    2  Y     1    A
    3  Z     1    A
    4  W     2    B
    5  W     2    B
    6  W     2    B
    7  W     2    B
    
    df.groupby(['track', 'type'])['id'].transform('nunique')
    
    0    3
    1    3
    2    3
    3    3
    4    1
    5    1
    6    1
    7    1
    Name: id, dtype: int64
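To get the new column the question asks for, the transformed Series can be assigned straight back to the frame in one line, since transform returns a result aligned with the original index. A minimal sketch, where the column name id_nunique is a hypothetical choice:

```python
import pandas as pd

df = pd.DataFrame(dict(
    track=list('11112222'), type=list('AAAABBBB'), id=list('XXYZWWWW')))

# transform keeps the original row index, so the result lines up row by row.
# 'id_nunique' is a hypothetical column name.
df['id_nunique'] = df.groupby(['track', 'type'])['id'].transform('nunique')
```

The frame keeps all eight rows; each row simply gains the unique-id count of its (track, type) group.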
    