Sum values of columns starting with the same string in a pandas DataFrame

小鲜肉 2021-01-12 00:47

I have a dataframe with about 100 columns that looks like this:

   Id  Economics-1  English-107  English-2  History-3  Economics-zz  Economics-2  \\
0  56            


        
2 Answers
  •  悲&欢浪女
    2021-01-12 01:15

    I'd suggest a different approach: transpose the DataFrame, group its rows (your original columns) by their prefix, sum, and transpose back.

    Consider the following:

    import pandas as pd

    df = pd.DataFrame({
        'a_a': [1, 2, 3, 4],
        'a_b': [2, 3, 4, 5],
        'b_a': [1, 2, 3, 4],
        'b_b': [2, 3, 4, 5],
    })
    

    Now

    [s.split('_')[0] for s in df.T.index.values]
    

    gives the prefix of each column, here ['a', 'a', 'b', 'b']. So

    >>> df.T.groupby([s.split('_')[0] for s in df.T.index.values]).sum().T
        a   b
    0   3   3
    1   5   5
    2   7   7
    3   9   9
    

    does what you want.

    In your case, split on the '-' character instead of '_'.
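
As a rough sketch of how the answer above might carry over to the question's layout: the column names below are taken from the question, but the values are made up for illustration, since the original table is cut off. It also assumes `Id` should be kept out of the sums, so it is moved into the index first.

```python
import pandas as pd

# Column names follow the question; the values are invented for illustration.
df = pd.DataFrame({
    "Id": [56, 11],
    "Economics-1": [1, 0],
    "English-107": [1, 0],
    "English-2": [0, 1],
    "History-3": [0, 2],
    "Economics-zz": [3, 1],
    "Economics-2": [2, 0],
})

# Assumption: Id identifies the row and should not be summed,
# so move it into the index before grouping.
scores = df.set_index("Id")

# Prefix of each column: everything before the first '-'.
prefixes = [name.split("-")[0] for name in scores.columns]

# Transpose, group the rows by prefix, sum, and transpose back.
result = scores.T.groupby(prefixes).sum().T
print(result)
```

With these made-up values, `result` has one column per prefix (`Economics`, `English`, `History`) and one row per `Id`.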
