Average of Dataframe columns

妖精的绣舞 submitted on 2021-01-01 03:00:58

Question


I want to get the average GDP of each country across the years. The columns 2006, 2007, ..., 2015 contain the GDP numbers. My code returns an error saying that mean(axis=1) needs at least 1 argument, even though 1 has been assigned to it, which is weird. I also find it odd that we use mean instead of avg, but I couldn't find an avg function for groupby.

Here is my code:

    Top15 = ANSWER
    Top15 = Top15[['Country', '2006', '2007', '2008', '2009', '2010', 
    '2011', '2012', '2013', '2014', '2015']]
    return Top15.groupby('Country').agg({"avg": np.mean(axis=1)})

Answer 1:


GroupBy is not necessary here because you are computing one value per row rather than aggregating several rows into one. You can just use pd.DataFrame.mean along axis 1. Here's a minimal example:

import pandas as pd

df = pd.DataFrame({'Country': ['UK', 'US'],
                   '2006': [1, 2],
                   '2007': [3, 4],
                   '2008': [5, 6]})

# axis=1 averages across the selected columns, giving one value per row
df['mean'] = df[['2006', '2007', '2008']].mean(axis=1)

print(df)

   2006  2007  2008 Country  mean
0     1     3     5      UK   3.0
1     2     4     6      US   4.0
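
If the result should be a single Series indexed by country, as the function in the question seems to want, the same row-wise mean can be sketched as below. The GDP values and the names top15, year_cols, and avg_gdp are made up for illustration and do not come from the original data.

```python
import pandas as pd

# Hypothetical sample data standing in for the real Top15 frame.
top15 = pd.DataFrame({
    'Country': ['UK', 'US'],
    '2006': [1.0, 2.0],
    '2007': [3.0, 4.0],
    '2008': [5.0, 6.0],
})

# Only the year columns take part in the average.
year_cols = ['2006', '2007', '2008']

# Use Country as the index, then average across the year columns (axis=1).
avg_gdp = top15.set_index('Country')[year_cols].mean(axis=1).rename('avg_gdp')

print(avg_gdp)
```

Because Country becomes the index before the mean is taken, each value in the result is labelled with its country and no groupby is involved.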



Answer 2:


Use mean() directly on the grouped DataFrame:

Top15 = ANSWER
Top15 = Top15[['Country', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']]    
return Top15.groupby('Country').mean()
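
Note that groupby('Country').mean() keeps one column per year. A minimal sketch of what it returns, and of one possible way to reduce it to a single number per country, might look like this. The data is invented, with a repeated country so that the grouping has something to do, and the names per_year and overall are illustrative only.

```python
import pandas as pd

# Hypothetical data; 'UK' appears twice so the groupby actually combines rows.
df = pd.DataFrame({
    'Country': ['UK', 'UK', 'US'],
    '2006': [1.0, 3.0, 2.0],
    '2007': [3.0, 5.0, 4.0],
})

# Mean of every year column within each country: still one column per year.
per_year = df.groupby('Country').mean()

# Averaging across the year columns then gives one value per country.
overall = per_year.mean(axis=1)

print(per_year)
print(overall)
```

When every country occupies exactly one row, as in the question, the groupby step leaves the numbers unchanged and only moves Country into the index.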



Answer 3:


There are multiple problems with your code:

  1. .agg with a dict maps input columns to aggregation functions, like .agg({'2016': 'mean'}). The keys are names of existing columns, not names for the output.
  2. np.mean(axis=1) calls np.mean right away, but you did not pass it any data, hence the error. Passing a function instead, as in .agg({'2016': lambda x: np.mean(x)}), would work.
  3. The easiest way is Top15.groupby('Country').mean() (read it as "group by Country and, for each group, calculate the mean (average)").
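
The first two points can be tried out with a small sketch. The frame below is invented, and '2006' stands in for whichever year column is of interest.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    'Country': ['UK', 'UK', 'US'],
    '2006': [1.0, 3.0, 2.0],
    '2007': [3.0, 5.0, 4.0],
})
grouped = df.groupby('Country')

# Dict form: key is an existing column, value names the aggregation.
by_name = grouped.agg({'2006': 'mean'})

# A callable also works; it receives each group's column as a Series.
by_callable = grouped.agg({'2006': lambda x: np.mean(x)})

# Calling np.mean with no data fails before .agg is ever reached.
try:
    np.mean(axis=1)
except TypeError as exc:
    error_message = str(exc)

print(by_name)
print(by_callable)
print(error_message)
```

Both .agg calls produce the same per-country means for the chosen column, while the bare np.mean(axis=1) call raises a TypeError on its own.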


Source: https://stackoverflow.com/questions/51826751/average-of-dataframe-columns
