Group by index + column in pandas

北荒 2021-02-02 05:27

I have a dataframe that has the columns

user_id
item_bought

Here user_id is the index of the df. I want to group by both user_id and item_b

4 answers

故里飘歌 (original poster)

2021-02-02 06:11
I had the same problem: I imported a bunch of data and wanted to group by a field that was the index. I didn't have a multi-index or any of that jazz, and neither do you.

I figured the problem is that the field I want is the index, so at first I just reset the index - but this gives me a useless index field that I don't want. So now I do the following (two levels of grouping):
```
grouped = df.reset_index().groupby(by=['Field1','Field2'])
```
Then I can use 'grouped' in a bunch of ways for different reports:
```
grouped[['Field3','Field4']].agg(['mean', 'std'])
```
(This was what I wanted: the mean and standard deviation of Field3 and Field4, grouped by Field1 (the index) and Field2.)
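To make the two-level grouping concrete, here is a minimal sketch on a toy frame. The data values are made up for illustration, and the string names "mean" and "std" are used as the aggregations, on the assumption that the goal is the per-group mean and sample standard deviation.

```python
import pandas as pd

# Toy data (values invented for illustration); Field1 is the index,
# as in the situation described above.
df = pd.DataFrame(
    {
        "Field2": ["a", "a", "b", "b"],
        "Field3": [1.0, 3.0, 5.0, 7.0],
        "Field4": [10.0, 20.0, 30.0, 50.0],
    },
    index=pd.Index(["x", "x", "y", "y"], name="Field1"),
)

# Turn the index into an ordinary column, then group on both fields.
grouped = df.reset_index().groupby(by=["Field1", "Field2"])

# One row per (Field1, Field2) pair; columns are (field, statistic) pairs.
summary = grouped[["Field3", "Field4"]].agg(["mean", "std"])
print(summary)
```

The result has a two-level column index, so a single statistic can be picked out with something like `summary[("Field3", "mean")]`.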

For you, if you just want to do the count of items per user, in one simple line using groupby, the code could be
```
df.reset_index().groupby(by=['user_id']).count()
```
If you want to do more things then you can (like me) create 'grouped' and then use that. As a beginner, I find it easier to follow that way.

Please note that "reset_index" is not 'in place' by default: it returns a new dataframe, so it will not mess up your original one.
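For the question as asked (grouping by both the index and a column), a sketch along the same lines might look like this. It assumes the second grouping column is item_bought from the column list in the question; the sample rows are invented for illustration.

```python
import pandas as pd

# Toy purchases (invented); user_id is the index, item_bought a column.
df = pd.DataFrame(
    {"item_bought": ["apple", "apple", "pear", "apple"]},
    index=pd.Index([1, 1, 1, 2], name="user_id"),
)

# Number of rows for each (user_id, item_bought) pair.
pair_counts = df.reset_index().groupby(by=["user_id", "item_bought"]).size()

# Recent pandas versions also accept an index level name in `by`,
# so the reset can be skipped there.
direct_counts = df.groupby(["user_id", "item_bought"]).size()

# reset_index() returned a new frame; the original is unchanged.
index_name_after = df.index.name
columns_after = list(df.columns)

print(pair_counts)
```

Using `size()` counts rows per group, whereas `count()` counts non-null values in each remaining column, so the two can differ when the data has missing values.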