Averaging data from multiple data files in Python with pandas

南笙 2021-01-14 11:32

I have 30 CSV data files from 30 replicate runs of an experiment I ran. I am using pandas' read_csv() function to read the data into a list of DataFrames. I wo

3 Answers
  • 2021-01-14 12:11

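    Concatenate the DataFrames side by side, giving each one a key (here x and y are two DataFrames with the same columns A, B, C):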

    In [14]: glued = pd.concat([x, y], axis=1, keys=['x', 'y'])
    
    In [15]: glued
    Out[15]: 
              x                             y                    
              A         B         C         A         B         C
    0 -0.264438 -1.026059 -0.619500  1.923135  0.135355 -0.285491
    1  0.927272  0.302904 -0.032399 -0.208940  0.642432 -0.764902
    2 -0.264273 -0.386314 -0.217601  1.477419 -1.659804 -0.431375
    3 -0.871858 -0.348382  1.100491 -1.191664  0.152576  0.935773
    
    In [16]: glued.swaplevel(0, 1, axis=1).sort_index(axis=1)
    Out[16]: 
              A                   B                   C          
              x         y         x         y         x         y
    0 -0.264438  1.923135 -1.026059  0.135355 -0.619500 -0.285491
    1  0.927272 -0.208940  0.302904  0.642432 -0.032399 -0.764902
    2 -0.264273  1.477419 -0.386314 -1.659804 -0.217601 -0.431375
    3 -0.871858 -1.191664 -0.348382  0.152576  1.100491  0.935773
    
    In [17]: glued = glued.swaplevel(0, 1, axis=1).sort_index(axis=1)
    
    In [18]: glued
    Out[18]: 
              A                   B                   C          
              x         y         x         y         x         y
    0 -0.264438  1.923135 -1.026059  0.135355 -0.619500 -0.285491
    1  0.927272 -0.208940  0.302904  0.642432 -0.032399 -0.764902
    2 -0.264273  1.477419 -0.386314 -1.659804 -0.217601 -0.431375
    3 -0.871858 -1.191664 -0.348382  0.152576  1.100491  0.935773
    

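    For the record, swapping the levels and sorting was only for visual purposes. It does put the column names on level 0, though; without the swap, the column names stay on level 1 and any grouping should use level=1.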

    Then you can average across the DataFrames by grouping the columns on their top level:

    In [19]: glued.groupby(level=0, axis=1).mean()
    Out[19]: 
              A         B         C
    0  0.829349 -0.445352 -0.452496
    1  0.359166  0.472668 -0.398650
    2  0.606573 -1.023059 -0.324488
    3 -1.031761 -0.097903  1.018132
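
The same idea extends from two DataFrames to a whole list of replicate runs. The following is only a sketch under assumptions: `frames` is a hypothetical stand-in for the list read from the CSV files, filled here with random data, and the grouping is done on the transposed frame because recent pandas releases deprecate `groupby(..., axis=1)`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the DataFrames read from the replicate files
rng = np.random.default_rng(0)
frames = [
    pd.DataFrame(rng.standard_normal((4, 3)), columns=list("ABC"))
    for _ in range(5)
]

# One outer column level per run (the keys); original column names on the inner level
glued = pd.concat(frames, axis=1, keys=range(len(frames)))

# Transpose, group rows by the inner level (the column name), average, transpose back
avg = glued.T.groupby(level=1).mean().T
```

Here `avg` has the same shape and column names as each individual run, with every cell holding the mean over the runs.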
    
  • 2021-01-14 12:15

    Have a look at the pandas.concat() function. When you read in your files, you can use concat to join the resulting DataFrames into one, then average them, for example by grouping on the row index and taking the mean.

    To use it, just pass it a list of the DataFrames you want joined together:

    >>> x
              A         B         C
    0 -0.264438 -1.026059 -0.619500
    1  0.927272  0.302904 -0.032399
    2 -0.264273 -0.386314 -0.217601
    3 -0.871858 -0.348382  1.100491
    >>> y
              A         B         C
    0  1.923135  0.135355 -0.285491
    1 -0.208940  0.642432 -0.764902
    2  1.477419 -1.659804 -0.431375
    3 -1.191664  0.152576  0.935773
    >>> pandas.concat([x, y])
              A         B         C
    0 -0.264438 -1.026059 -0.619500
    1  0.927272  0.302904 -0.032399
    2 -0.264273 -0.386314 -0.217601
    3 -0.871858 -0.348382  1.100491
    0  1.923135  0.135355 -0.285491
    1 -0.208940  0.642432 -0.764902
    2  1.477419 -1.659804 -0.431375
    3 -1.191664  0.152576  0.935773
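
Once the frames are stacked this way, rows from different runs share the same index label, so one way to average them is to group on the index. A minimal sketch, assuming every run uses the same row labels; the small `x` and `y` below are made-up values, not the data shown above:

```python
import pandas as pd

# Toy frames standing in for two replicate runs (illustrative values only)
x = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})
y = pd.DataFrame({"A": [3.0, 6.0], "B": [5.0, 8.0]})

# Stack the runs vertically; index labels 0 and 1 each appear once per run
stacked = pd.concat([x, y])

# Group rows that share an index label and average them
avg = stacked.groupby(level=0).mean()
```

With a list of DataFrames read via `read_csv()`, the same two calls apply by passing that list to `pd.concat`.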
    
  • 2021-01-14 12:27

    I figured out one way to do it.

    pandas DataFrames can be added together with the DataFrame.add() function: http://pandas.sourceforge.net/generated/pandas.DataFrame.add.html

    So I can add the DataFrames together then divide by the number of DataFrames, e.g.:

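    # Start from the first DataFrame; .add() returns a new object, so the list is left untouched
    avgDataFrame = DataFrameList[0]
    
    # Add the remaining DataFrames element-wise (aligned on index and columns)
    for df in DataFrameList[1:]:
        avgDataFrame = avgDataFrame.add(df)
    
    # Divide the running sum by the number of DataFrames to get the mean
    avgDataFrame = avgDataFrame / len(DataFrameList)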
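
The running sum can also be written with `functools.reduce`. This is just a compact sketch of the same sum-then-divide approach; `frames` is a hypothetical list with made-up values. One caveat worth keeping in mind: a NaN in any single frame makes the summed cell NaN, whereas `mean()` skips NaN by default.

```python
from functools import reduce

import pandas as pd

# Hypothetical list of replicate DataFrames (illustrative values only)
frames = [
    pd.DataFrame({"A": [1.0, 2.0]}),
    pd.DataFrame({"A": [3.0, 4.0]}),
    pd.DataFrame({"A": [5.0, 9.0]}),
]

# Fold the list into a single element-wise sum, then divide by the count
total = reduce(lambda left, right: left.add(right), frames)
avg = total / len(frames)
```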
    