Group by index + column in pandas

北荒 2021-02-02 05:27

I have a dataframe that has the columns

  1. user_id
  2. item_bought

Here user_id is the index of the df. I want to group by both user_id and item_b

4 Answers
  •  故里飘歌
    2021-02-02 06:13

    From pandas version 0.20.1 onward this is simpler:

    Strings passed to DataFrame.groupby() as the by parameter may now reference either column names or index level names

    import numpy as np
    import pandas as pd

    # Two index levels, named 'first' and 'second'
    arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]

    index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])

    # 'A' and 'B' are regular columns
    df = pd.DataFrame({'A': [1, 1, 1, 1, 2, 2, 3, 3],
                       'B': np.arange(8)}, index=index)

    print(df)
    
                  A  B
    first second      
    bar   one     1  0
          two     1  1
    baz   one     1  2
          two     1  3
    foo   one     2  4
          two     2  5
    qux   one     3  6
          two     3  7
    
    # 'second' is an index level name, 'A' is a column name
    print(df.groupby(['second', 'A']).sum())
              B
    second A   
    one    1  2
           2  4
           3  6
    two    1  4
           2  5
           3  7
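
Applied to the layout in the question, the same idea might look like the sketch below. The data is made up for illustration, and it assumes the index is actually named user_id (if the index has no name, groupby cannot refer to it by string). The reset_index and pd.Grouper variants are alternatives that should also work on pandas versions before 0.20.

```python
import pandas as pd

# Hypothetical data: user_id is the (named) index, item_bought is a column.
df = pd.DataFrame(
    {"item_bought": ["apple", "apple", "pear", "pear", "pear"]},
    index=pd.Index([1, 1, 1, 2, 2], name="user_id"),
)

# pandas >= 0.20: mix the index level name and the column name directly.
counts = df.groupby(["user_id", "item_bought"]).size()

# Alternative 1: turn the index into a regular column first.
counts_reset = df.reset_index().groupby(["user_id", "item_bought"]).size()

# Alternative 2: refer to the index level explicitly with pd.Grouper.
counts_grouper = df.groupby([pd.Grouper(level="user_id"), "item_bought"]).size()

print(counts)
```

Here size() counts rows per (user_id, item_bought) pair; any other aggregation, such as sum() on further columns, can be used in its place.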
    
