Filtering rows by a particular index level in a MultiIndex dataframe

Submitted by 允我心安 on 2019-12-23 17:02:16

Question


Given a MultiIndex dataframe:

import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_arrays([
    list('aaaabbbbbccdddddd'),
    list('tuvwlmnopxyfghijk')
], names=['one', 'two'])

df = pd.DataFrame({'col': np.arange(len(mux))}, mux)

df

         col
one two     
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8
c   x      9
    y     10
d   f     11
    g     12
    h     13
    i     14
    j     15
    k     16

Is it possible to keep only the rows corresponding to the first i values of the 0th level of the dataframe?

For i = 2, my expected output is:

         col
one two     
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8

Note that only rows pertaining to a and b are retained, everything else is dropped. I hope the problem is clear, but if it isn't, please feel free to ask for clarifications.

I tried:

idx = pd.IndexSlice
df.iloc[(idx[:2], slice(None))]

But that gives me only the first two rows in the entire df, not all rows of the first two values in the 0th level.


Answer 1:


One way to go about this is to return the index values for the 0th level and then index into the original data frame with those:

df.loc[df.index.levels[0][:2].values]

         col
one two     
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8
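
One caveat with this approach: `index.levels` stores the sorted set of labels the MultiIndex knows about, which is not necessarily the set of labels still present in the rows. The sketch below is an illustration under that assumption, not part of the original answer; the name `filtered` is hypothetical.

```python
import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_arrays([
    list('aaaabbbbbccdddddd'),
    list('tuvwlmnopxyfghijk')
], names=['one', 'two'])
df = pd.DataFrame({'col': np.arange(len(mux))}, mux)

# Drop every row under 'a' with a boolean mask (hypothetical example).
filtered = df[df.index.get_level_values('one') != 'a']

# The level still lists 'a', even though no row uses it any more.
stale_levels = filtered.index.levels[0].tolist()

# remove_unused_levels() rebuilds the levels from the labels actually in use.
clean_levels = filtered.index.remove_unused_levels().levels[0].tolist()
```

If the frame has been filtered before, calling `remove_unused_levels()` first (or reading the labels with `get_level_values(0).unique()`) avoids slicing labels that no longer occur.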

As mentioned in the comments, this only works for the 0th level and not the 1st. There may be a more generalizable solution that works with other levels.
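
One possible generalization, offered as a minimal sketch rather than the answer's own method, is to build a boolean mask from `get_level_values` and `isin`. The helper name `keep_first_values` is hypothetical, and "first" is assumed here to mean order of appearance in the index, not sorted order.

```python
import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_arrays([
    list('aaaabbbbbccdddddd'),
    list('tuvwlmnopxyfghijk')
], names=['one', 'two'])
df = pd.DataFrame({'col': np.arange(len(mux))}, mux)


def keep_first_values(frame, level, i):
    """Keep rows whose label on `level` is among the first `i` distinct labels.

    Hypothetical helper: `level` may be a level name or position, and
    "first" means order of appearance in the index.
    """
    labels = frame.index.get_level_values(level)
    first = labels.unique()[:i]          # distinct labels, appearance order
    return frame[labels.isin(first)]     # boolean mask over all rows


# Level 0 ('one'): keeps every row under 'a' and 'b'.
result = keep_first_values(df, 'one', 2)

# Level 1 ('two'): keeps the rows labelled 't', 'u', 'v'.
by_two = keep_first_values(df, 'two', 3)
```

Because the mask is computed from the labels of a single level, the same call works for any level, at the cost of scanning the whole index once.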



Source: https://stackoverflow.com/questions/46900483/filtering-rows-by-a-particular-index-level-in-a-multiindex-dataframe
