Add column to the end of Pandas DataFrame containing average of previous data

深忆病人 2021-02-12 14:04

I have a DataFrame ave_data that contains the following:

ave_data

Time        F7           F8            F9  
00:00:00    43.005593    -56.509746           


        
4 Answers
  • 2021-02-12 14:31

    @LaangeHaare, or anyone else who is curious: I just tested it, and the copy step in the accepted answer seems unnecessary (maybe I am missing something...).

    So you could simplify it to:

    df['average'] = df.mean(numeric_only=True, axis=1)
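
    For anyone who wants to try this without the original data, here is a minimal sketch. The sample frame is an assumption, built from the first two rows of output shown elsewhere in this thread; only the last assignment is the line from this answer.

```python
import pandas as pd

# Hypothetical sample data, modelled on the values shown in this thread.
df = pd.DataFrame({
    "Time": pd.to_datetime(["2015-07-29 00:00:00", "2015-07-29 01:00:00"]),
    "F7": [43.005593, 55.114918],
    "F8": [-56.509746, -59.173852],
    "F9": [25.271271, 31.849262],
})

# Row-wise mean over the numeric columns only; the "Time" column is skipped.
# Assigning to a new column name appends it after the existing columns.
df["average"] = df.mean(numeric_only=True, axis=1)
print(df)
```

    Note that this modifies df itself, so it only suits cases where the original frame does not need to stay unchanged.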
    

    I would have simply added this as a comment but don't have the reputation

  • 2021-02-12 14:36

    df.assign is specifically for this purpose. It returns a copy to avoid changing the original dataframe and/or raising SettingWithCopyWarning. It works as follows:

    data_with_ave = ave_data.assign(average = ave_data.mean(axis=1, numeric_only=True))
    

    This function can also create multiple columns at the same time:

    data_with_ave = ave_data.assign(
                        average = ave_data.mean(axis=1, numeric_only=True),
                        median = ave_data.median(axis=1, numeric_only=True)
    )
    

    As of pandas 0.23 (running on Python 3.6 or later), you can even reference a column created earlier in the same call to build another:

    data_with_ave = ave_data.assign(
                        average = ave_data.mean(axis=1, numeric_only=True),
                        isLarge = lambda df: df['average'] > 10
    )
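
    A self-contained version of the snippet above, as a sketch: the contents of ave_data are assumed here (values borrowed from the output shown in another answer), and the threshold of 10 is the one used above.

```python
import pandas as pd

# Hypothetical stand-in for ave_data.
ave_data = pd.DataFrame({
    "Time": ["00:00:00", "01:00:00", "02:00:00"],
    "F7": [43.005593, 55.114918, 63.990762],
    "F8": [-56.509746, -59.173852, -64.699492],
    "F9": [25.271271, 31.849262, 52.426017],
})

data_with_ave = ave_data.assign(
    average=ave_data.mean(axis=1, numeric_only=True),
    # The lambda receives the intermediate frame, which already holds "average".
    isLarge=lambda df: df["average"] > 10,
)

print(data_with_ave[["average", "isLarge"]])
# assign returned a new frame, so the original has no "average" column.
print("average" in ave_data.columns)
```

    The lambda is what makes the dependent column work: a plain expression such as ave_data['average'] > 10 would be evaluated against the original frame, where that column does not exist yet.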
    
  • 2021-02-12 14:50

    You can take a copy of your df using copy() and then call mean with axis=1 and numeric_only=True, so that the mean is calculated row-wise and non-numeric columns are ignored. Assigning the result to a new column name, as below, always adds the column at the end:

    In [68]:
    
    summary_ave_data = df.copy()
    summary_ave_data['average'] = summary_ave_data.mean(numeric_only=True, axis=1)
    summary_ave_data
    Out[68]:
                     Time         F7         F8         F9    average
    0 2015-07-29 00:00:00  43.005593 -56.509746  25.271271   3.922373
    1 2015-07-29 01:00:00  55.114918 -59.173852  31.849262   9.263443
    2 2015-07-29 02:00:00  63.990762 -64.699492  52.426017  17.239096
    
  • 2021-02-12 14:52

    In the common case where you want to average only specific columns, you can use:

    df['average'] = df[['F7','F8']].mean(axis=1)
    

    where axis=1 means the operation is row-wise (the values of the selected columns in each row are averaged into the 'average' column).

    Then you may want to sort by this column:

    df.sort_values(by='average',ascending=False, inplace=True)
    

    where inplace=True sorts the DataFrame itself instead of returning a sorted copy.
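
    Putting the two steps together, as a rough sketch: the sample values are assumed (taken from the output in another answer), and this version leaves out inplace=True to show the copy-returning behaviour instead.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "F7": [43.005593, 55.114918, 63.990762],
    "F8": [-56.509746, -59.173852, -64.699492],
    "F9": [25.271271, 31.849262, 52.426017],
})

# Average only F7 and F8 for each row; F9 does not contribute.
df["average"] = df[["F7", "F8"]].mean(axis=1)

# Without inplace=True, sort_values returns a new, sorted DataFrame
# and leaves df in its original row order.
sorted_df = df.sort_values(by="average", ascending=False)
print(sorted_df)
```

    Whether to sort in place or keep the returned copy is mostly a matter of taste; the copy is handy when the original order is still needed later.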
