Grouping a dataframe by X columns


無奈伤痛

I have a dataframe and I'd like to apply a function to every 2 columns (or every 3; the group size is variable).

For example with the following DataFrame, I'd like to a


1 answer

栀梦

2020-12-22 04:07

groupby can work along axis=1 (the columns) as well, and it accepts a sequence of group labels, one label per column. If your column labels are consecutive integers, as in your example, it's trivial:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(6, 6))
>>> df
          0         1         2         3         4         5
0  1.705550 -0.757193 -0.636333  2.097570 -1.064751  0.450812
1  0.575623 -0.385987  0.105516  0.820795 -0.464069  0.728609
2  0.776840 -0.173348  0.878534  0.995937  0.094515  0.098853
3  0.326854  1.297625  2.232534  1.004719 -0.440271  1.548430
4  0.483211 -1.182175 -0.012520 -1.766317 -0.895284 -0.695300
5  0.523011 -1.653557  1.022042  1.201774 -1.118465  1.400537
>>> df.groupby(df.columns//2, axis=1).mean()
          0         1         2
0  0.474179  0.730618 -0.306970
1  0.094818  0.463155  0.132270
2  0.301746  0.937235  0.096684
3  0.812239  1.618627  0.554080
4 -0.349482 -0.889419 -0.795292
5 -0.565273  1.111908  0.141036

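(This works because the column labels are the integers 0 through 5, so df.columns//2 gives Int64Index([0, 0, 1, 1, 2, 2], dtype='int64'): columns that share a label land in the same group.)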
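To see how the integer-division trick extends to wider groups, here is a small sketch using plain NumPy. The names `n_cols`, `positions`, `labels_by_2` and `labels_by_3` are made up for illustration and are not part of the original answer.

```python
import numpy as np

n_cols = 6
positions = np.arange(n_cols)    # array([0, 1, 2, 3, 4, 5])

# Integer division by the group size gives one label per column.
labels_by_2 = positions // 2     # array([0, 0, 1, 1, 2, 2])
labels_by_3 = positions // 3     # array([0, 0, 0, 1, 1, 1])
```

Changing the divisor is all it takes to go from pairs of columns to triples.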

Even if we're not so fortunate (string column names, say), we can still build the appropriate groups ourselves from the column positions:

>>> df.groupby(np.arange(df.columns.size)//2, axis=1).mean()
          0         1         2
0  0.474179  0.730618 -0.306970
1  0.094818  0.463155  0.132270
2  0.301746  0.937235  0.096684
3  0.812239  1.618627  0.554080
4 -0.349482 -0.889419 -0.795292
5 -0.565273  1.111908  0.141036
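Since the question asks for a variable group size and an arbitrary function, the same idea could be wrapped in a small helper. This is only a sketch: `apply_by_column_blocks` and its parameters are hypothetical names, and the sample data is invented. It groups the transposed frame instead of passing axis=1, because, as far as I know, recent pandas releases (2.1 and later) deprecate the axis argument of groupby; transposing should behave the same on older versions too.

```python
import numpy as np
import pandas as pd


def apply_by_column_blocks(df, n, func="mean"):
    """Aggregate every block of n consecutive columns (hypothetical helper)."""
    # Positional labels: 0 for the first n columns, 1 for the next n, and so on.
    labels = np.arange(df.shape[1]) // n
    # Group the rows of the transposed frame, then transpose back;
    # this is equivalent to grouping the columns of the original frame.
    return df.T.groupby(labels).agg(func).T


# Made-up data with string column names, so df.columns // 2 would not work.
df = pd.DataFrame(
    [[1.0, 3.0, 10.0, 20.0, 5.0, 7.0],
     [2.0, 4.0, 30.0, 50.0, 6.0, 8.0]],
    columns=list("abcdef"),
)

means_of_pairs = apply_by_column_blocks(df, 2)           # result columns 0, 1, 2
sums_of_triples = apply_by_column_blocks(df, 3, "sum")   # result columns 0, 1
```

Note that transposing a frame with mixed dtypes upcasts the values to object, so this sketch is best suited to all-numeric data like the example above.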
