Question
I am struggling with a MultiIndex DataFrame in pandas.
Suppose I have a df like this:
                count        day
group name
A     Anna         10     Monday
      Beatrice     15    Tuesday
B     Beatrice     15  Wednesday
      Cecilia      20   Thursday
What I need is to find the row with the maximum name in each group and remove it from the DataFrame.
The final df would look like:
                count        day
group name
A     Anna         10     Monday
B     Beatrice     15  Wednesday
Does anyone have an idea how to do this? I am running out of ideas...
Thanks in advance!
EDIT:
What if the original dataframe is:
                count        day
group name
A     Anna         10     Monday
      Beatrice     15    Tuesday
B     Beatrice     20  Wednesday
      Cecilia      15   Thursday
and the final df needs to be:
                count        day
group name
A     Anna         10     Monday
B     Beatrice     20  Wednesday
Answer 1:
UPDATE (for the EDIT, where the row with the maximum name in each group is removed):
In [386]: idx = (df.reset_index('name')
                   .groupby('group')['name']
                   .max()
                   .reset_index()
                   .values.tolist())
In [387]: df.loc[df.index.difference(idx)]
Out[387]:
                count        day
group name
A     Anna         10     Monday
B     Beatrice     20  Wednesday
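For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It assumes the DataFrame from the EDIT is built as below (the construction is my guess at the setup, not something given in the question), and it builds the labels to remove as an explicit MultiIndex of tuples instead of a list of lists.

```python
import pandas as pd

# Assumed construction of the DataFrame shown in the EDIT of the question.
df = pd.DataFrame(
    {
        "group": ["A", "A", "B", "B"],
        "name": ["Anna", "Beatrice", "Beatrice", "Cecilia"],
        "count": [10, 15, 20, 15],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    }
).set_index(["group", "name"])

# Largest (alphabetically last) name per group, as a Series indexed by group.
max_name = df.reset_index("name").groupby("group")["name"].max()

# (group, name) labels of the rows to remove.
to_drop = pd.MultiIndex.from_tuples(
    list(max_name.items()), names=list(df.index.names)
)

# Keep every index label that is not in to_drop.
# Note that Index.difference returns the remaining labels sorted.
result = df.loc[df.index.difference(to_drop)]
print(result)
```

Because `difference` sorts the remaining labels, the row order of `result` can differ from `df` when the original index is not sorted.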
Original answer (removes the row with the maximum count in each group):
In [326]: df.loc[df.index.difference(df.groupby('group')['count'].idxmax())]
Out[326]:
                count        day
group name
A     Anna         10     Monday
B     Beatrice     15  Wednesday
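A runnable sketch of the count-based variant, assuming the first DataFrame from the question is constructed as below (again my assumption about the setup). It also shows one possible alternative, a boolean mask built with `transform('max')`; the variable names are only illustrative.

```python
import pandas as pd

# Assumed construction of the first DataFrame in the question.
df = pd.DataFrame(
    {
        "group": ["A", "A", "B", "B"],
        "name": ["Anna", "Beatrice", "Beatrice", "Cecilia"],
        "count": [10, 15, 15, 20],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    }
).set_index(["group", "name"])

# idxmax gives one (group, name) label per group:
# the first row that holds the maximum count.
max_labels = df.groupby(level="group")["count"].idxmax()

# Keep rows whose label is not one of those; original row order is preserved.
by_idxmax = df[~df.index.isin(max_labels.tolist())]

# Alternative: compare each count with its group's maximum.
# Unlike idxmax, this removes every row tied for the maximum.
group_max = df.groupby(level="group")["count"].transform("max")
by_mask = df[df["count"] < group_max]

print(by_idxmax)
print(by_mask)
```

On this data both variants give the same rows. They differ only when a group has several rows sharing the maximum count: `idxmax` removes one of them, the mask removes all of them.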
PS: most probably there is a better way to do this...
Source: https://stackoverflow.com/questions/49669129/python-multiindex-dataframe-remove-maximum