How to keep original index of a DataFrame after groupby 2 columns?

后端 未结 3 1174
时光说笑
时光说笑 2021-02-13 03:26

Is there any way I can retain the original index of my large dataframe after I perform a groupby? The reason I need to this is because I need to do an inner merge back to my ori

3条回答
  •  梦毁少年i
    2021-02-13 04:05

    You can elevate your index to a column via reset_index. Then aggregate your index to a tuple via agg, together with your count aggregation.

    Below is a minimal example.

    import pandas as pd, numpy as np
    
    df = pd.DataFrame(np.random.randint(0, 4, (50, 5)),
                      index=np.random.randint(0, 4, 50))
    
    df = df.reset_index()
    
    res = df.groupby([0, 1]).agg({2: 'count', 'index': lambda x: tuple(x)}).reset_index()
    
    #     0  1  2            index
    # 0   0  0  4     (2, 0, 0, 2)
    # 1   0  1  4     (0, 3, 1, 1)
    # 2   0  2  1             (1,)
    # 3   0  3  1             (3,)
    # 4   1  0  4     (1, 2, 1, 3)
    # 5   1  1  2           (1, 3)
    # 6   1  2  4     (2, 1, 2, 2)
    # 7   1  3  1             (2,)
    # 8   2  0  5  (0, 3, 0, 2, 2)
    # 9   2  1  2           (0, 2)
    # 10  2  2  5  (1, 1, 3, 3, 2)
    # 11  2  3  2           (0, 1)
    # 12  3  0  4     (0, 3, 3, 3)
    # 13  3  1  4     (1, 3, 0, 1)
    # 14  3  2  3        (3, 2, 1)
    # 15  3  3  4     (3, 3, 2, 1)
    

提交回复
热议问题