I have a DataFrame and I'd like to apply a function to every 2 columns (or every 3; the group size varies).
For example, with the following DataFrame, I'd like to a
groupby can work on axis=1 as well, and can accept a sequence of group labels. If your columns are convenient ranges like in your example, it's trivial:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(6*6).reshape(6,6))
>>> df
          0         1         2         3         4         5
0  1.705550 -0.757193 -0.636333  2.097570 -1.064751  0.450812
1  0.575623 -0.385987  0.105516  0.820795 -0.464069  0.728609
2  0.776840 -0.173348  0.878534  0.995937  0.094515  0.098853
3  0.326854  1.297625  2.232534  1.004719 -0.440271  1.548430
4  0.483211 -1.182175 -0.012520 -1.766317 -0.895284 -0.695300
5  0.523011 -1.653557  1.022042  1.201774 -1.118465  1.400537
>>> df.groupby(df.columns//2, axis=1).mean()
          0         1         2
0  0.474179  0.730618 -0.306970
1  0.094818  0.463155  0.132270
2  0.301746  0.937235  0.096684
3  0.812239  1.618627  0.554080
4 -0.349482 -0.889419 -0.795292
5 -0.565273  1.111908  0.141036
(This works because df.columns//2 gives Int64Index([0, 0, 1, 1, 2, 2], dtype='int64').)
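Since the question mentions that the group size may be 2 or 3, here is a small sketch of how the divisor controls the grouping. It assumes the default integer column labels 0..5, represented here by pd.RangeIndex(6) as a stand-in for df.columns; the variable names are illustrative only.

```python
import pandas as pd

# Stand-in for df.columns: a frame built from a bare 6-column array
# gets the default integer column labels 0..5.
columns = pd.RangeIndex(6)

# Floor division maps each column label to a group label; columns that
# share a group label are aggregated together by groupby.
labels_by_two = (columns // 2).tolist()
labels_by_three = (columns // 3).tolist()

print(labels_by_two)    # [0, 0, 1, 1, 2, 2]
print(labels_by_three)  # [0, 0, 0, 1, 1, 1]
```

Changing the divisor is all it takes to switch between pairs and triples of columns.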
Even if we're not so fortunate (the column labels are strings, say), we can still build the appropriate group labels ourselves from the column positions:
>>> df.groupby(np.arange(df.columns.size)//2, axis=1).mean()
          0         1         2
0  0.474179  0.730618 -0.306970
1  0.094818  0.463155  0.132270
2  0.301746  0.937235  0.096684
3  0.812239  1.618627  0.554080
4 -0.349482 -0.889419 -0.795292
5 -0.565273  1.111908  0.141036
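Recent pandas releases deprecate passing axis=1 to groupby, so the same positional grouping can be sketched by transposing, grouping the rows, and transposing back. The helper name aggregate_column_chunks and its parameters are hypothetical, not part of pandas, and the small frame with string column names is made up for illustration.

```python
import numpy as np
import pandas as pd


def aggregate_column_chunks(frame, chunk_size, func="mean"):
    """Aggregate every `chunk_size` consecutive columns (hypothetical helper)."""
    # One integer label per column position: 0, 0, 1, 1, ... for chunk_size=2.
    labels = np.arange(frame.columns.size) // chunk_size
    # Transpose so the columns become rows, group those rows by label,
    # aggregate, and transpose back. No axis=1 argument is needed.
    return frame.T.groupby(labels).agg(func).T


# Illustrative data: two rows, six columns with non-numeric labels.
df = pd.DataFrame(np.arange(12.0).reshape(2, 6), columns=list("abcdef"))

pairs = aggregate_column_chunks(df, 2)           # mean of (a,b), (c,d), (e,f)
triples = aggregate_column_chunks(df, 3, "sum")  # sum of (a,b,c), (d,e,f)
print(pairs)
print(triples)
```

This works on column positions, so the labels themselves do not matter. One caveat: transposing a frame with mixed column dtypes produces object columns, so this sketch is best suited to all-numeric data.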