adding a grouped-by zscore column to a pandas dataframe

北恋 2021-01-13 07:14

I can insert a column into a dataframe that z-scores another column like this:

df.insert(<loc>, column='ZofA', value=(df['A'] - df['A'].mean()) / df['A'].std())


        
1 Answer
  • 2021-01-13 07:51

    Thanks for the pointer to the documentation. For anyone who is curious, I thought I'd post the solution. First, put the z-score calculation into a lambda:

    zscore = lambda x: (x - x.mean()) / x.std()
    

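    The key ingredient is .transform, which applies the function within each group and returns a result aligned with the original rows, so it can be inserted directly as a column. Just write the insert statement like this: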

    df.insert(<loc>, 'ZofA', df.groupby(['C1', 'C2'])['A'].transform(zscore))
    

    The solution is indeed in the "Group By: split-apply-combine" document. You just have to scroll down about halfway to the "Transformation" section. I ignored the stuff about the date key and just plugged my grouping columns directly into the groupby statement.
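
For readers who want to try this end to end, here is a minimal sketch of the approach above on a toy frame. The data values and the choice of insert position (right after column A) are assumptions made for illustration; only the column names C1, C2, A and ZofA come from the answer.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the answer above.
df = pd.DataFrame({
    "C1": ["x", "x", "x", "y", "y", "y"],
    "C2": ["p", "p", "p", "q", "q", "q"],
    "A":  [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# z-score of a Series; Series.std() defaults to the sample std (ddof=1).
zscore = lambda x: (x - x.mean()) / x.std()

# transform applies zscore per (C1, C2) group and returns a Series
# indexed like df, with one value per original row.
z = df.groupby(["C1", "C2"])["A"].transform(zscore)

# Insert the new column right after A (an arbitrary position for this example).
loc = df.columns.get_loc("A") + 1
df.insert(loc, "ZofA", z)

print(df)
```

Two things to watch for with this lambda: a group with a single row, or a group whose values are all identical, yields NaN, because the sample standard deviation is undefined or zero there. If a population z-score is wanted instead, x.std(ddof=0) can be used in the lambda.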
