Question
Given a MultiIndex DataFrame:
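import numpy as np
import pandas as pd

# Two-level index: level 'one' holds the group labels, level 'two' the sub-labels
mux = pd.MultiIndex.from_arrays([
    list('aaaabbbbbccdddddd'),
    list('tuvwlmnopxyfghijk')
], names=['one', 'two'])
df = pd.DataFrame({'col': np.arange(len(mux))}, mux)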
df
         col
one two
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8
c   x      9
    y     10
d   f     11
    g     12
    h     13
    i     14
    j     15
    k     16
Is it possible to keep only the rows that belong to the first i values of the 0th level of the index?
For i = 2, my expected output is:
         col
one two
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8
Note that only rows pertaining to a and b are retained, everything else is dropped. I hope the problem is clear, but if it isn't, please feel free to ask for clarifications.
I tried:
idx = pd.IndexSlice
df.iloc[(idx[:2], slice(None))]
But that gives me only the first two rows of the entire df, not all rows for the first two values of the 0th level.
Answer 1:
One way to go about this is to take the first two values of the 0th level and then index into the original DataFrame with those:
df.loc[df.index.levels[0][:2].values]
         col
one two
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8
As mentioned in the comments, this only works for the 0th level and not the 1st. There may be a more generalizable solution that would work with other levels.
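One possible way to generalize to any level is sketched below. It is not part of the original answer: the helper name keep_first_values and its parameters are made up for illustration. It builds a boolean mask from get_level_values and isin instead of indexing with .loc. It also takes the labels in order of appearance, whereas index.levels is, as far as I know, stored sorted and can keep labels that are no longer present after filtering.

```python
import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_arrays(
    [list("aaaabbbbbccdddddd"), list("tuvwlmnopxyfghijk")],
    names=["one", "two"],
)
df = pd.DataFrame({"col": np.arange(len(mux))}, index=mux)


def keep_first_values(frame, i, level=0):
    """Hypothetical helper: keep rows whose label at `level` is among
    the first `i` distinct labels, in order of appearance."""
    labels = frame.index.get_level_values(level)
    first = labels.unique()[:i]          # first i distinct labels at this level
    return frame[labels.isin(first)]     # boolean mask over all rows


result = keep_first_values(df, 2)                    # rows for 'a' and 'b'
by_second = keep_first_values(df, 3, level="two")    # rows for 't', 'u', 'v'
```

Because the mask is computed per row, the same call works for level 'two' (or any other level name or position), which the .loc approach above does not handle.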
Source: https://stackoverflow.com/questions/46900483/filtering-rows-by-a-particular-index-level-in-a-multiindex-dataframe