df.groupby(…).agg(set) produces a different result than df.groupby(…).agg(lambda x: set(x))

谎友^ 2021-02-07 07:43

While answering this question, it turned out that df.groupby(...).agg(set) and df.groupby(...).agg(lambda x: set(x)) produce different results.

2 answers
  •  情歌与酒
    2021-02-07 08:47

    Perhaps, as @Edchum commented, agg applies Python builtin functions to each group as a whole, treating the group as a mini DataFrame, whereas a user-defined function (such as a lambda) is applied to each column of the group separately. Passing print both ways illustrates this:

    df.groupby('user_id').agg(print,end='\n\n')
    
      class_type instructor  user_id
    0  Krav Maga        Bob        1
    4   Ju-jitsu      Alice        1
    
      class_type instructor  user_id
    1       Yoga      Alice        2
    5  Krav Maga      Alice        2
    
      class_type instructor  user_id
    2   Ju-jitsu        Bob        3
    6     Karate        Bob        3
    
    
    df.groupby('user_id').agg(lambda x : print(x,end='\n\n'))
    
    0    Krav Maga
    4     Ju-jitsu
    Name: class_type, dtype: object
    
    1         Yoga
    5    Krav Maga
    Name: class_type, dtype: object
    
    2    Ju-jitsu
    6      Karate
    Name: class_type, dtype: object
    
    3    Krav Maga
    Name: class_type, dtype: object
    
    ...
    

    This is probably why applying set gave the result mentioned above: when set receives a whole group it iterates over the DataFrame, which yields the column labels rather than the values.
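The underlying difference can be reproduced without relying on how a particular pandas version dispatches agg(set). The sketch below is only an illustration: the sample frame is hypothetical (loosely modeled on the printed output above, not the asker's actual data), and set is called on one group directly. Whether agg(set) still receives the whole group may depend on the pandas version in use.

```python
import pandas as pd

# Hypothetical sample data, loosely modeled on the printed output above.
df = pd.DataFrame({
    "class_type": ["Krav Maga", "Yoga", "Ju-jitsu", "Krav Maga"],
    "instructor": ["Bob", "Alice", "Alice", "Alice"],
    "user_id": [1, 2, 1, 2],
})

# One group, taken as a whole sub-DataFrame (what print received in the
# first example).
group = df[df["user_id"] == 1]

# Iterating a DataFrame yields its column labels, so set() on the whole
# group collects column names rather than cell values.
labels = set(group)

# Iterating a Series yields its values, so set() on a single column
# collects the values of that column (what the lambda received).
values = set(group["class_type"])

print(labels)
print(values)

# Wrapping set in a lambda makes agg call it once per column per group.
per_column = df.groupby("user_id").agg(lambda x: set(x))
print(per_column)
```

If the per-column behavior is what you want, passing the lambda (or calling agg on a selected column) avoids depending on how builtins are handled.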
