Summing up more than two dataframes with the same indexes in Pandas

Submitted by 青春壹個敷衍的年華 on 2019-12-20 02:26:11

Question


I want to add the values of four DataFrames that share the same index in pandas. With two DataFrames, df1 and df2, we can write:

df1.add(df2)

and for 3 dataframes:

df3.add(df2.add(df1))

I wonder if there is a more general way to do so in Python.


Answer 1:


Option 1
Use sum

sum([df1, df2, df3, df4])

Option 2
Use reduce

from functools import reduce

reduce(pd.DataFrame.add, [df1, df2, df3, df4])
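
Because reduce accepts any two-argument callable, the same pattern can pass extra arguments to DataFrame.add. The sketch below is an illustration only: the frames df_a, df_b, df_c and the helper add_filling_zero are hypothetical names, and it assumes the indexes overlap only partly, which goes beyond the question's "same indexes" setting. It uses the fill_value parameter of DataFrame.add so that a label missing from one frame counts as 0 rather than producing NaN.

```python
from functools import reduce

import pandas as pd

# Hypothetical frames whose indexes only partly overlap (not from the question).
df_a = pd.DataFrame({"x": [1, 2]}, index=[0, 1])
df_b = pd.DataFrame({"x": [10, 20]}, index=[1, 2])
df_c = pd.DataFrame({"x": [100, 200]}, index=[0, 2])


def add_filling_zero(left, right):
    # fill_value=0 substitutes 0 where only one of the two frames has a value
    return left.add(right, fill_value=0)


total = reduce(add_filling_zero, [df_a, df_b, df_c])
print(total)
```

When the indexes really are identical, plain pd.DataFrame.add as shown in Option 2 is enough.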

Option 3
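Use pd.concat and pd.DataFrame.sum with level=1
This only works if the dataframe indices have a single level; with more levels we'd have to get a little more cute to make it work. I recommend the other options. Note that the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in 2.0; on those versions, .groupby(level=1).sum() is the documented replacement.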

pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)
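
For newer pandas versions, where sum no longer accepts level, one possible way to write Option 3 is with groupby. This is a rough sketch, not the answer's original code: the helper name sum_by_original_index and the sample frames are hypothetical. It groups on every index level except the outer key added by pd.concat, so it should also cope with multi-level indices. Note that groupby sorts the group labels by default, so row order may differ from the input.

```python
import pandas as pd


def sum_by_original_index(frames):
    """Stack the frames under an extra outer index level, then sum per original label."""
    stacked = pd.concat(dict(enumerate(frames)))
    # Level 0 is the key added by concat; every remaining level is the original index.
    original_levels = list(range(1, stacked.index.nlevels))
    if len(original_levels) == 1:
        return stacked.groupby(level=1).sum()
    return stacked.groupby(level=original_levels).sum()


# Hypothetical real-valued frame, used only for illustration.
df_real = pd.DataFrame([[1, -1], [2, -2]])
single_level_total = sum_by_original_index([df_real] * 4)
print(single_level_total)

# The same helper with a two-level row index.
idx = pd.MultiIndex.from_product([["a", "b"], [0, 1]])
df_multi = pd.DataFrame({"x": [1, 2, 3, 4]}, index=idx)
multi_level_total = sum_by_original_index([df_multi, df_multi])
print(multi_level_total)
```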

Setup

import pandas as pd

df = pd.DataFrame([[1, -1], [complex(0, 1), complex(0, -1)]])
df1, df2, df3, df4 = [df] * 4

Demo

sum([df1, df2, df3, df4])

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

from functools import reduce

reduce(pd.DataFrame.add, [df1, df2, df3, df4])

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j
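
The agreement seen in the demo can also be checked programmatically. The following is a small sketch on hypothetical integer frames (the list frames is made up for illustration); it compares the built-in sum and reduce against the nested .add chain from the question using pd.testing.assert_frame_equal, which raises if the results differ.

```python
from functools import reduce

import pandas as pd

# Hypothetical integer frames sharing one index (illustrative data only).
frames = [pd.DataFrame({"a": [i, 2 * i], "b": [-i, 0]}) for i in range(1, 5)]

# The nested form from the question, extended to four frames
chained = frames[3].add(frames[2].add(frames[1].add(frames[0])))
via_sum = sum(frames)  # built-in sum starts from 0, and 0 + DataFrame is valid
via_reduce = reduce(pd.DataFrame.add, frames)

# Both raise AssertionError if the frames differ
pd.testing.assert_frame_equal(via_sum, chained)
pd.testing.assert_frame_equal(via_reduce, chained)
print(via_sum)
```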

Timing

small data

%timeit sum([df1, df2, df3, df4])
%timeit reduce(pd.DataFrame.add, [df1, df2, df3, df4])
%timeit pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

1000 loops, best of 3: 591 µs per loop
1000 loops, best of 3: 456 µs per loop
100 loops, best of 3: 3.61 ms per loop

larger data

df = pd.DataFrame([[1, -1], [complex(0, 1), complex(0, -1)]])
df = pd.concat([df] * 1000, ignore_index=True)
df = pd.concat([df] * 100, axis=1, ignore_index=True)
df1, df2, df3, df4 = [df] * 4

%timeit sum([df1, df2, df3, df4])
%timeit reduce(pd.DataFrame.add, [df1, df2, df3, df4])
%timeit pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

100 loops, best of 3: 3.94 ms per loop
100 loops, best of 3: 2.9 ms per loop
1 loop, best of 3: 1min per loop
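
The %timeit lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The data below is hypothetical and smaller than the original benchmark, and only the two recommended options are timed, so the numbers it prints are not comparable to the figures quoted above.

```python
import timeit
from functools import reduce

import pandas as pd

# Hypothetical float data; the shape is smaller than the original benchmark.
df = pd.DataFrame([[1.0, -1.0], [2.0, -2.0]])
df = pd.concat([df] * 100, ignore_index=True)
df = pd.concat([df] * 10, axis=1, ignore_index=True)
frames = [df] * 4

candidates = {
    "sum": lambda: sum(frames),
    "reduce": lambda: reduce(pd.DataFrame.add, frames),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 100 calls each, reported per call
    best = min(timeit.repeat(func, number=100, repeat=3)) / 100
    timings[name] = best
    print(f"{name}: {best * 1e6:.1f} µs per call")
```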


Source: https://stackoverflow.com/questions/44973981/summing-up-more-than-two-dataframes-with-the-same-indexes-in-pandas
