Question
I have a DataFrame with a MultiIndex like this:
In [5]: df
Out[5]:
                 a   b
lvl0 lvl1 lvl2
A0   B0   C0     0   1
          C1     2   3
          C2     4   5
          C3     6   7
     B1   C0     8   9
          C1    10  11
          C2    12  13
          C3    14  15
A1   B0   C0    16  47
          C1    18  49
          C2    20  41
          C3    22  43
     B1   C0    24  25
          C1    26  27
          C2    28  29
          C3    30  31
A2   B0   C0    32  33
          C1    34  35
          C2    36  37
          C3    38  39
     B1   C0    40  41
          C1    42  43
          C2    44  45
          C3    46  47
I want to get one particular lvl1 group within each lvl0 index. In this case, I want to choose the group that contains the maximum value of column b, so the result should look like this:
                 a   b
lvl0 lvl1 lvl2
A0   B1   C0     8   9
          C1    10  11
          C2    12  13
          C3    14  15
A1   B0   C0    16  47
          C1    18  49
          C2    20  41
          C3    22  43
A2   B1   C0    40  41
          C1    42  43
          C2    44  45
          C3    46  47
Is there an indexing method like df[(('A0','B1'),('A1','B0'),('A2','B1')),:]? I have tried my best. Thanks for any help.
Answer 1:
You can use:
df1 = df.reset_index(level=2, drop=True)
mask = df1.index.isin(df1.groupby(level=[0])['b'].idxmax())
df = df[mask]
print (df)
                 a   b
lvl0 lvl1 lvl2
A0   B1   C0     8   9
          C1    10  11
          C2    12  13
          C3    14  15
A1   B0   C0    16  47
          C1    18  49
          C2    20  41
          C3    22  43
A2   B1   C0    40  41
          C1    42  43
          C2    44  45
          C3    46  47
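To try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and wraps the same steps in a function. The helper names build_sample and select_max_groups are made up for illustration, and the code assumes a three-level index whose last level is the one to drop, as in the question.

```python
import numpy as np
import pandas as pd


def build_sample():
    """Rebuild the 24-row sample frame shown in the question."""
    index = pd.MultiIndex.from_product(
        [["A0", "A1", "A2"], ["B0", "B1"], ["C0", "C1", "C2", "C3"]],
        names=["lvl0", "lvl1", "lvl2"],
    )
    a = np.arange(0, 48, 2)
    b = a + 1
    # The (A1, B0) rows in the question carry larger b values.
    b[8:12] = [47, 49, 41, 43]
    return pd.DataFrame({"a": a, "b": b}, index=index)


def select_max_groups(df, column="b"):
    """Keep, per lvl0 value, the lvl1 group holding the max of `column`."""
    # Drop the third level so that labels are (lvl0, lvl1) pairs.
    df1 = df.reset_index(level=2, drop=True)
    # One (lvl0, lvl1) tuple per lvl0 group.
    idx = df1.groupby(level=0)[column].idxmax()
    # Flag every row whose pair was picked, then filter the original frame.
    mask = df1.index.isin(idx.tolist())
    return df[mask]


result = select_max_groups(build_sample())
print(result)
```

Returning a new frame instead of reassigning df keeps the original data available for further checks.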
Explanation:
First drop the third level of the MultiIndex with reset_index, then use groupby with idxmax to get the index labels of the maximum values of column b within each lvl0 group:
df1 = df.reset_index(level=2, drop=True)
idx = df1.groupby(level=[0])['b'].idxmax()
print (idx)
lvl0
A0    (A0, B1)
A1    (A1, B0)
A2    (A2, B1)
Name: b, dtype: object
Then create a boolean mask by comparing the two-level index against these labels with isin:
print (df1.index.isin(idx))
[False False False False True True True True True True True True
False False False False False False False False True True True True]
Finally, filter the original DataFrame by boolean indexing:
df = df[df1.index.isin(idx)]
print (df)
                 a   b
lvl0 lvl1 lvl2
A0   B1   C0     8   9
          C1    10  11
          C2    12  13
          C3    14  15
A1   B0   C0    16  47
          C1    18  49
          C2    20  41
          C3    22  43
A2   B1   C0    40  41
          C1    42  43
          C2    44  45
          C3    46  47
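If the (lvl0, lvl1) pairs are already known, as in the df[(('A0','B1'),('A1','B0'),('A2','B1')),:] attempt from the question, the same kind of mask can probably be built from a hand-written list of tuples. The sketch below is one way to do that, not the only one. It rebuilds the sample data so that it runs on its own, and the variable names pairs and selected are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
index = pd.MultiIndex.from_product(
    [["A0", "A1", "A2"], ["B0", "B1"], ["C0", "C1", "C2", "C3"]],
    names=["lvl0", "lvl1", "lvl2"],
)
a = np.arange(0, 48, 2)
b = a + 1
b[8:12] = [47, 49, 41, 43]  # the (A1, B0) rows from the question
df = pd.DataFrame({"a": a, "b": b}, index=index)

# Hand-written (lvl0, lvl1) pairs to keep.
pairs = [("A0", "B1"), ("A1", "B0"), ("A2", "B1")]

# Drop lvl2 from the index only (df itself is untouched) and test membership.
mask = df.index.droplevel("lvl2").isin(pairs)
selected = df[mask]
print(selected)
```

Here droplevel works on the index alone, so no intermediate DataFrame is needed.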
Source: https://stackoverflow.com/questions/48090565/get-special-group-in-pandas-multiindex