Pandas transform() vs apply()


I don't understand why apply and transform return different dtypes when called on the same data frame. The way I explained the two functions to my

2 answers
  •  离开以前
    2021-01-04 02:29

    Just adding another illustrative example with sum as I find it more explicit:

    import numpy as np
    import pandas as pd

    df = (
        pd.DataFrame(np.random.rand(10, 3), columns=['a', 'b', 'c'])
            .assign(a=lambda df: df.a > 0.5)  # turn column a into a boolean group key
    )
    
    Out[70]: 
           a         b         c
    0  False  0.126448  0.487302
    1  False  0.615451  0.735246
    2  False  0.314604  0.585689
    3  False  0.442784  0.626908
    4  False  0.706729  0.508398
    5  False  0.847688  0.300392
    6  False  0.596089  0.414652
    7  False  0.039695  0.965996
    8   True  0.489024  0.161974
    9  False  0.928978  0.332414
    
    df.groupby('a').apply(sum)  # aggregates: one row per group
    
             a         b         c
    a                             
    False  0.0  4.618465  4.956997
    True   1.0  0.489024  0.161974
    
    
    df.groupby('a').transform(sum)  # keeps the original rows: each row gets its group's sum
    
              b         c
    0  4.618465  4.956997
    1  4.618465  4.956997
    2  4.618465  4.956997
    3  4.618465  4.956997
    4  4.618465  4.956997
    5  4.618465  4.956997
    6  4.618465  4.956997
    7  4.618465  4.956997
    8  0.489024  0.161974
    9  4.618465  4.956997
    

    However, when applied to a plain pd.DataFrame rather than a GroupBy object, I was not able to see any difference.
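One way a difference can show up on a plain DataFrame is with a reducing function. The sketch below is only an illustration, assuming a reasonably recent pandas version; the small frame and the variable names are made up for the example and are not from the data above. With an element-wise function the two calls agree, but with a reducer apply returns one value per column while transform is expected to refuse, since its result must keep the shape of the input.

```python
import pandas as pd

# Small deterministic frame (illustrative data, not the random frame above).
df = pd.DataFrame({"b": [1.0, 2.0, 3.0], "c": [10.0, 20.0, 30.0]})

# Element-wise function: apply and transform give the same result.
doubled_apply = df.apply(lambda col: col * 2)
doubled_transform = df.transform(lambda col: col * 2)
agree = doubled_apply.equals(doubled_transform)

# Reducing function: apply collapses each column to a single value ...
reduced = df.apply(lambda col: col.sum())

# ... while transform raises, because the result no longer lines up
# with the original rows.
try:
    df.transform(lambda col: col.sum())
    transform_error = None
except ValueError as exc:
    transform_error = exc
```

So for functions that keep the shape, seeing no difference on a DataFrame is expected; the two only diverge once the function aggregates.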
