Selecting rows from a Pandas dataframe with a compound (hierarchical) index

后端 未结 3 989
[愿得一人]
[愿得一人] 2020-12-07 12:05

I\'m suspicious that this is trivial, but I yet to discover the incantation that will let me select rows from a Pandas dataframe based on the values of a hierarchical key. S

相关标签:
3条回答
  • 2020-12-07 12:08

    In Python 0.19.0 there is a new suggested approach, which is explained here1. I believe the clearest example they give is the following, in which they slice from a four-level indexing. This is how the dataframe is made:

    In [46]: def mklbl(prefix,n):
       ....:     return ["%s%s" % (prefix,i)  for i in range(n)]
       ....: 
    
    In [47]: miindex = pd.MultiIndex.from_product([mklbl('A',4),
       ....:                                       mklbl('B',2),
       ....:                                       mklbl('C',4),
       ....:                                       mklbl('D',2)])
       ....: 
    
    In [48]: micolumns = pd.MultiIndex.from_tuples([('a','foo'),('a','bar'),
       ....:                                        ('b','foo'),('b','bah')],
       ....:                                       names=['lvl0', 'lvl1'])
       ....: 
    
    In [49]: dfmi = pd.DataFrame(np.arange(len(miindex)*len(micolumns)).reshape((len(miindex),len(micolumns))),
       ....:                     index=miindex,
       ....:                     columns=micolumns).sort_index().sort_index(axis=1)
       ....: 
    
    In [50]: dfmi
    Out[50]: 
    lvl0           a         b     
    lvl1         bar  foo  bah  foo
    A0 B0 C0 D0    1    0    3    2
             D1    5    4    7    6
          C1 D0    9    8   11   10
             D1   13   12   15   14
          C2 D0   17   16   19   18
             D1   21   20   23   22
          C3 D0   25   24   27   26
    ...          ...  ...  ...  ...
    A3 B1 C0 D1  229  228  231  230
          C1 D0  233  232  235  234
             D1  237  236  239  238
          C2 D0  241  240  243  242
             D1  245  244  247  246
          C3 D0  249  248  251  250
             D1  253  252  255  254
    

    And this is how they select the different rows:

    In [51]: dfmi.loc[(slice('A1','A3'),slice(None), ['C1','C3']),:]
    Out[51]: 
    lvl0           a         b     
    lvl1         bar  foo  bah  foo
    A1 B0 C1 D0   73   72   75   74
             D1   77   76   79   78
          C3 D0   89   88   91   90
             D1   93   92   95   94
       B1 C1 D0  105  104  107  106
             D1  109  108  111  110
          C3 D0  121  120  123  122
    ...          ...  ...  ...  ...
    A3 B0 C1 D1  205  204  207  206
          C3 D0  217  216  219  218
             D1  221  220  223  222
       B1 C1 D0  233  232  235  234
             D1  237  236  239  238
          C3 D0  249  248  251  250
             D1  253  252  255  254
    

    So quite simply, in df.loc[(indices),:], you specify the indices that you want to select per level, from highest level to lowest. If you do not want to make a selection of the lowest level(s) of indices, you can omit specifying them. If you don't want to make a slice between other specified levels, you add slice(None). Both cases are shown in the example, where level D is omitted and level B is specified between A and C.

    0 讨论(0)
  • 2020-12-07 12:21

    Syntax like the following will work:

    df.ix['a']
    df.ix['a'].ix['c']
    

    since group1 and group2 are indices. Please forgive my previous attempt!

    To get at the second index only, I think you have to swap indices:

    df.swaplevel(0,1).ix['c']
    

    But I'm sure Wes will correct me if I'm wrong.

    0 讨论(0)
  • 2020-12-07 12:32

    Try using xs to be very precise:

    In [5]: df.xs('a', level=0)
    Out[5]: 
            value1  value2
    group2                
    c          1.1     7.1
    c          2.0     8.0
    d          3.0     9.0
    
    In [6]: df.xs('c', level='group2')
    Out[6]: 
            value1  value2
    group1                
    a          1.1     7.1
    a          2.0     8.0
    
    0 讨论(0)
提交回复
热议问题