I have a DataFrame ave_data
that contains the following:
ave_data
Time F7 F8 F9
00:00:00 43.005593 -56.509746
@LaangeHaare or anyone else who is curious, I just tested it and the copy part of the accepted answer seems unnecessary (maybe I am missing something...)
so you could simplify this with:
df['average'] = df.mean(numeric_only=True, axis=1)
I would have simply added this as a comment but don't have the reputation
df.assign
is specifically for this purpose. It returns a copy to avoid changing the original dataframe and/or raising SettingWithCopyWarning
. It works as follows:
data_with_ave = ave_data.assign(average = ave_data.mean(axis=1, numeric_only=True))
This function can also create multiple columns at the same time:
data_with_ave = ave_data.assign(
average = ave_data.mean(axis=1, numeric_only=True),
median = ave_data.median(axis=1, numeric_only=True)
)
As of pandas 0.36, you can even reference a column just created to create another:
data_with_ave = ave_data.assign(
average = ave_data.mean(axis=1, numeric_only=True),
isLarge = lambda df: df['average'] > 10
)
You can take a copy of your df using copy()
and then just call mean
and pass params axis=1
and numeric_only=True
so that the mean is calculated row-wise and to ignore non-numeric columns, when you do the following the column is always added at the end:
In [68]:
summary_ave_data = df.copy()
summary_ave_data['average'] = summary_ave_data.mean(numeric_only=True, axis=1)
summary_ave_data
Out[68]:
Time F7 F8 F9 average
0 2015-07-29 00:00:00 43.005593 -56.509746 25.271271 3.922373
1 2015-07-29 01:00:00 55.114918 -59.173852 31.849262 9.263443
2 2015-07-29 02:00:00 63.990762 -64.699492 52.426017 17.239096
In common case if you would like to use specific columns, you can use:
df['average'] = df[['F7','F8']].mean(axis=1)
where axis=1 stands for rowwise action (using column values for each row to calculate the mean in 'average' column)
Then you may want to sort by this column:
df.sort_values(by='average',ascending=False, inplace=True)
where inplace=True stands for applying action to dataframe instead of calculating on the copy.