Group by index + column in pandas

北荒 2021-02-02 05:27

I have a dataframe that has the columns

  1. user_id
  2. item_bought

Here user_id is the index of the df. I want to group by both user_id and item_b

4 Answers
  •  故里飘歌
    2021-02-02 06:11

    I had the same problem: I imported a bunch of data and wanted to group by a field that was the index. I didn't have a multi-index or any of that jazz, and neither do you.

    I figured the problem was that the field I wanted was the index, so at first I just reset the index, but that left me with a useless index field that I didn't want. So now I do the following (two levels of grouping):

    grouped = df.reset_index().groupby(by=['Field1','Field2'])
    

    Then I can use 'grouped' in a bunch of ways for different reports:

    grouped[['Field3','Field4']].agg(['mean', 'std'])
    

    (which was what I wanted, giving me the Field3 and Field4 averages and standard deviations, grouped by Field1 (the index) and Field2).
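
    A minimal sketch of that two-level grouping, for anyone who wants to run it. The data below is made up for illustration (only the names Field1 to Field4 come from the text above), and the aggregations are passed as the strings 'mean' and 'std', which pandas also accepts:

```python
import pandas as pd

# Hypothetical data: Field1 is the index, Field2..Field4 are ordinary columns.
df = pd.DataFrame(
    {
        "Field2": ["x", "x", "y", "y"],
        "Field3": [1.0, 3.0, 5.0, 9.0],
        "Field4": [10.0, 20.0, 30.0, 50.0],
    },
    index=pd.Index(["a", "a", "b", "b"], name="Field1"),
)

# reset_index() turns Field1 into a regular column so it can be a group key.
grouped = df.reset_index().groupby(by=["Field1", "Field2"])

# One row per (Field1, Field2) pair; one (column, statistic) pair per output column.
report = grouped[["Field3", "Field4"]].agg(["mean", "std"])
print(report)
```

    The result has a two-level row index (Field1, Field2) and two-level columns such as ('Field3', 'mean'), so a single cell can be read with report.loc[('a', 'x'), ('Field3', 'mean')].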

    For you, if you just want a count of items per user in one simple line using groupby, the code could be:

    df.reset_index().groupby(by=['user_id']).count()
    

    If you want to do more things then you can (like me) create 'grouped' and then use that. As a beginner, I find it easier to follow that way.

    Please note that reset_index is not 'in place' by default: it returns a new dataframe, so it will not mess up your original one.
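
    If you want the count per user and per item, as the question asks, the same pattern might look like the sketch below. The purchase data is invented for illustration, and size() is just one way to count rows per group:

```python
import pandas as pd

# Hypothetical purchases; user_id is the index, as in the question.
df = pd.DataFrame(
    {"item_bought": ["apple", "apple", "pear", "apple"]},
    index=pd.Index([1, 1, 1, 2], name="user_id"),
)

# Number of rows for each (user_id, item_bought) pair.
counts = df.reset_index().groupby(by=["user_id", "item_bought"]).size()
print(counts)

# The original frame is untouched: user_id is still its index.
print(df.index.name)
```

    Here counts is a Series indexed by (user_id, item_bought), so counts.loc[(1, 'apple')] gives the number of times user 1 bought an apple.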
