Question
I want to get the average GDP of each country across the years. The columns 2006, 2007, ..., 2015 contain the GDP numbers. My code returns an error saying that mean(axis=1) needs at least 1 variable and that 1 has been assigned to it, which is weird. I also find it weird that we are using mean instead of avg, but I couldn't find an avg function for groupby.
Here is my code:
Top15 = ANSWER
Top15 = Top15[['Country', '2006', '2007', '2008', '2009', '2010',
               '2011', '2012', '2013', '2014', '2015']]
return Top15.groupby('Country').agg({"avg": np.mean(axis=1)})
Answer 1:
GroupBy is not necessary here as you are performing a calculation rather than an aggregation. You can just use pd.DataFrame.mean. Here's a minimal example:
df = pd.DataFrame({'Country': ['UK', 'US'],
                   '2006': [1, 2],
                   '2007': [3, 4],
                   '2008': [5, 6]})
df['mean'] = df[['2006', '2007', '2008']].mean(1)
print(df)
   2006  2007  2008 Country  mean
0     1     3     5      UK   3.0
1     2     4     6      US   4.0
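If you would rather not list every year column by hand, one possible variation is to move Country into the index first, so that only the numeric year columns are left to average. This is only a sketch on made-up sample data; the name avg_gdp is chosen here for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical sample data; the layout mirrors the question (one row per country).
df = pd.DataFrame({'Country': ['UK', 'US'],
                   '2006': [1, 2],
                   '2007': [3, 4],
                   '2008': [5, 6]})

# With Country as the index, every remaining column is a year column,
# so mean(axis=1) averages across the years for each country.
avg_gdp = df.set_index('Country').mean(axis=1)
print(avg_gdp)
```

The result is a Series indexed by country, which avoids adding a helper column to the original frame.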
Answer 2:
Use mean()
Top15 = ANSWER
Top15 = Top15[['Country', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']]
return Top15.groupby('Country').mean()
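Note that groupby('Country').mean() returns one mean per year column for each country. If a single number per country across all years is wanted, a further mean(axis=1) can be chained on. A small sketch, assuming made-up data with two rows per country so the grouping is visible (per_year and across_years are illustrative names):

```python
import pandas as pd

# Hypothetical data: two rows per country.
df = pd.DataFrame({'Country': ['UK', 'UK', 'US', 'US'],
                   '2006': [1.0, 3.0, 2.0, 4.0],
                   '2007': [3.0, 5.0, 4.0, 6.0]})

# One row per country, one column per year.
per_year = df.groupby('Country').mean()

# Collapse the year columns into a single average per country.
across_years = per_year.mean(axis=1)

print(per_year)
print(across_years)
```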
Answer 3:
There are multiple problems with your code:
- .agg with a dict maps input columns to aggregation type, like .agg({'2016': 'mean'})
- np.mean(axis=1) tries to evaluate something, but you did not provide an input. .agg({'2016': lambda x: np.mean(x)}) would work
- the easiest way would be Top15.groupby('Country').mean() (read it as "group by Country and for each group calculate the mean (avg)")
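To make these points concrete, here is a minimal sketch on made-up data. The frame, the '2006' column and the variable names are assumptions for illustration only. It shows the two dict forms of .agg giving the same result, and that np.mean(axis=1) fails on its own because it is called without any data.

```python
import numpy as np
import pandas as pd

# Hypothetical data; 'UK' appears twice so the aggregation does real work.
df = pd.DataFrame({'Country': ['UK', 'UK', 'US'],
                   '2006': [1.0, 3.0, 2.0],
                   '2007': [3.0, 5.0, 4.0]})

grouped = df.groupby('Country')

# The dict maps an input column to the aggregation applied to it.
by_name = grouped.agg({'2006': 'mean'})
by_func = grouped.agg({'2006': lambda x: np.mean(x)})

# np.mean(axis=1) is evaluated immediately, with no array passed in,
# so it raises before .agg is ever called.
try:
    np.mean(axis=1)
    raised = False
except TypeError:
    raised = True

print(by_name)
print(raised)
```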
Source: https://stackoverflow.com/questions/51826751/average-of-dataframe-columns