Question
My dataframe looks like this:
Plate Sample LogRatio
P1 S1 0.42
P1 S2 0.23
P2 S3 0.41
P3 S4 0.36
P3 S5 0.18
I have calculated the median of each plate (though this is probably not the best way to start):
grouped = df.groupby("Plate")
medianesPlate = grouped["LogRatio"].median()
I want to add a column to my dataframe:
CorrectedLogRatio = LogRatio-median(plate)
I suppose with something like:
df["CorrectedLogRatio"] = LogRatio-median(plate)
to get something like this:
Plate Sample LogRatio CorrectedLogRatio
P1 S1 0.42 0.42-median(P1)
P1 S2 0.23 0.23-median(P1)
P2 S3 0.41 0.41-median(P2)
P3 S4 0.36 0.36-median(P3)
P3 S5 0.18 0.18-median(P3)
But I don't know how to look up each row's median from medianesPlate. I tried some apply and transform functions, but they didn't work. Thanks for any help.
Answer 1:
You can use transform:
df['CorrectedLogRatio'] = df['LogRatio'] - df.groupby('Plate')['LogRatio'].transform('median')
The resulting output:
Plate Sample LogRatio CorrectedLogRatio
0 P1 S1 0.42 0.095
1 P1 S2 0.23 -0.095
2 P2 S3 0.41 0.000
3 P3 S4 0.36 0.090
4 P3 S5 0.18 -0.090
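If you would rather reuse the per-plate medians you already computed, one possible approach is to map the Plate column through that Series. The sketch below rebuilds the sample dataframe from the question and checks that both routes agree. The names corrected_map and corrected_transform are illustrative only, not part of the original post.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Plate": ["P1", "P1", "P2", "P3", "P3"],
    "Sample": ["S1", "S2", "S3", "S4", "S5"],
    "LogRatio": [0.42, 0.23, 0.41, 0.36, 0.18],
})

# One median per plate, indexed by plate name (as computed in the question).
medianesPlate = df.groupby("Plate")["LogRatio"].median()

# Route 1: look up each row's plate median by mapping Plate through the Series.
corrected_map = df["LogRatio"] - df["Plate"].map(medianesPlate)

# Route 2: transform returns the group median aligned to the original rows.
corrected_transform = (
    df["LogRatio"] - df.groupby("Plate")["LogRatio"].transform("median")
)

df["CorrectedLogRatio"] = corrected_transform
print(df)
```

The difference is that median() returns one value per plate, while transform('median') returns one value per row, so it can be subtracted from the LogRatio column directly.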
Source: https://stackoverflow.com/questions/40532303/pandas-groupby-and-correct-with-median-in-new-column