Averaging data from multiple data files in Python with pandas

I have 30 CSV data files from 30 replicate runs of an experiment. I am using pandas' read_csv() function to read the data into a list of DataFrames. I wo

3 Answers

予麋鹿

2021-01-14 12:11

Check it out:

In [14]: glued = pd.concat([x, y], axis=1, keys=['x', 'y'])

In [15]: glued
Out[15]: 
          x                             y                    
          A         B         C         A         B         C
0 -0.264438 -1.026059 -0.619500  1.923135  0.135355 -0.285491
1  0.927272  0.302904 -0.032399 -0.208940  0.642432 -0.764902
2 -0.264273 -0.386314 -0.217601  1.477419 -1.659804 -0.431375
3 -0.871858 -0.348382  1.100491 -1.191664  0.152576  0.935773

In [16]: glued.swaplevel(0, 1, axis=1).sort_index(axis=1)
Out[16]: 
          A                   B                   C          
          x         y         x         y         x         y
0 -0.264438  1.923135 -1.026059  0.135355 -0.619500 -0.285491
1  0.927272 -0.208940  0.302904  0.642432 -0.032399 -0.764902
2 -0.264273  1.477419 -0.386314 -1.659804 -0.217601 -0.431375
3 -0.871858 -1.191664 -0.348382  0.152576  1.100491  0.935773

In [17]: glued = glued.swaplevel(0, 1, axis=1).sort_index(axis=1)

In [18]: glued
Out[18]: 
          A                   B                   C          
          x         y         x         y         x         y
0 -0.264438  1.923135 -1.026059  0.135355 -0.619500 -0.285491
1  0.927272 -0.208940  0.302904  0.642432 -0.032399 -0.764902
2 -0.264273  1.477419 -0.386314 -1.659804 -0.217601 -0.431375
3 -0.871858 -1.191664 -0.348382  0.152576  1.100491  0.935773

For the record, swapping the levels and re-sorting the columns is not necessary; it only makes the output easier to read.

Then you can average across the replicates within each column:

In [19]: glued.groupby(level=0, axis=1).mean()
Out[19]: 
          A         B         C
0  0.829349 -0.445352 -0.452496
1  0.359166  0.472668 -0.398650
2  0.606573 -1.023059 -0.324488
3 -1.031761 -0.097903  1.018132
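
The same idea extends from two frames to a whole list of replicates. The following is a minimal sketch, not the answer author's code: it assumes all frames share the same index and column names, and the names `frames` and `mean_df` are illustrative. Because recent pandas releases deprecate `axis=1` in `groupby`, the sketch transposes, groups on the column-name level, and transposes back.

```python
import pandas as pd

# The two example frames shown in this thread
x = pd.DataFrame({
    "A": [-0.264438, 0.927272, -0.264273, -0.871858],
    "B": [-1.026059, 0.302904, -0.386314, -0.348382],
    "C": [-0.619500, -0.032399, -0.217601, 1.100491],
})
y = pd.DataFrame({
    "A": [1.923135, -0.208940, 1.477419, -1.191664],
    "B": [0.135355, 0.642432, -1.659804, 0.152576],
    "C": [-0.285491, -0.764902, -0.431375, 0.935773],
})

# Hypothetical list of replicates; in practice this would hold all 30 frames
frames = [x, y]

# Glue side by side: column level 0 is the replicate number,
# column level 1 is the original column name
glued = pd.concat(frames, axis=1, keys=range(len(frames)))

# Transpose so the column labels become the row index, average over
# replicates for each original column name, then transpose back
mean_df = glued.T.groupby(level=1).mean().T

print(mean_df)
```

The result has one column per original column name and one row per original row, matching the averaged table above.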


青春惊慌失措

2021-01-14 12:15

Have a look at the pandas.concat() function. When you read in your files, you can use concat to join the resulting DataFrames into one, and then average them with ordinary pandas operations, for example by grouping on the row index and taking the mean.

To use it, just pass it a list of the DataFrames you want joined together:

>>> x
          A         B         C
0 -0.264438 -1.026059 -0.619500
1  0.927272  0.302904 -0.032399
2 -0.264273 -0.386314 -0.217601
3 -0.871858 -0.348382  1.100491
>>> y
          A         B         C
0  1.923135  0.135355 -0.285491
1 -0.208940  0.642432 -0.764902
2  1.477419 -1.659804 -0.431375
3 -1.191664  0.152576  0.935773
>>> pandas.concat([x, y])
          A         B         C
0 -0.264438 -1.026059 -0.619500
1  0.927272  0.302904 -0.032399
2 -0.264273 -0.386314 -0.217601
3 -0.871858 -0.348382  1.100491
0  1.923135  0.135355 -0.285491
1 -0.208940  0.642432 -0.764902
2  1.477419 -1.659804 -0.431375
3 -1.191664  0.152576  0.935773
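
Since the stacked result repeats each index label once per replicate, grouping on the index averages the replicates row by row. Here is a small sketch of that step under stated assumptions: the CSV contents are made-up in-memory stand-ins for the real files, and the names `csv_texts`, `frames`, `stacked`, and `mean_df` are illustrative.

```python
import io

import pandas as pd

# In-memory stand-ins for replicate CSV files (made-up values for illustration)
csv_texts = [
    "A,B\n1.0,10.0\n2.0,20.0\n",
    "A,B\n3.0,30.0\n4.0,40.0\n",
]

# Read each "file" into its own DataFrame, as the question describes
frames = [pd.read_csv(io.StringIO(text)) for text in csv_texts]

# Stack the frames vertically; index labels 0, 1 now appear once per replicate
stacked = pd.concat(frames)

# Group rows that share an index label and average them
mean_df = stacked.groupby(level=0).mean()

print(mean_df)
```

With real files, the `io.StringIO(text)` argument would be replaced by each file's path. The same grouped object also supports other statistics, such as `.std()`, if the spread across replicates is of interest.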


我寻月下人不归

2021-01-14 12:27
I figured out one way to do it.

pandas DataFrames can be added together with the DataFrame.add() function: http://pandas.sourceforge.net/generated/pandas.DataFrame.add.html

So I can add the DataFrames together then divide by the number of DataFrames, e.g.:
```
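# Start from the first DataFrame; .add() returns a new object,
# so the frames in DataFrameList are not modified
avgDataFrame = DataFrameList[0]

# Add the remaining DataFrames element-wise
for df in DataFrameList[1:]:
    avgDataFrame = avgDataFrame.add(df)

# Divide the element-wise sum by the number of DataFrames to get the mean
avgDataFrame = avgDataFrame / len(DataFrameList)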
```
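
The loop above can also be wrapped in a small reusable function. This is only a sketch: `average_frames` is a hypothetical helper name, the input values are toy data, and it assumes every frame has the same index and columns. One caveat worth knowing is that `DataFrame.add()` without `fill_value` yields NaN wherever either operand is missing, so a single missing cell makes that cell NaN in the average.

```python
from functools import reduce

import pandas as pd


def average_frames(frames):
    """Element-wise mean of equally shaped DataFrames (hypothetical helper)."""
    if not frames:
        raise ValueError("need at least one DataFrame")
    # Fold the list into a running element-wise sum with DataFrame.add()
    total = reduce(lambda acc, df: acc.add(df), frames)
    return total / len(frames)


# Toy replicates, made up for illustration
runs = [
    pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]}),
    pd.DataFrame({"A": [3.0, 4.0], "B": [5.0, 6.0]}),
    pd.DataFrame({"A": [5.0, 6.0], "B": [7.0, 8.0]}),
]

avg = average_frames(runs)
print(avg)
```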