Mean, median, and mode of a list of values (SCORE) for a given ZIP code, for every year

春和景丽 2021-01-29 04:57

I want to find the mean, median, and mode of the values for each year, given a specific ZIP code. How can I achieve this? I have already read the data from a CSV file and converted it to a JSON file.

2 Answers
  • 2021-01-29 05:16

    Use scipy.stats.mstats:

    In [2295]: df.DATE = pd.to_datetime(df.DATE).dt.year
    
    In [2291]: import scipy.stats.mstats as mstats
    
    In [2313]: def mode(x):
          ...:     return mstats.mode(x, axis=None)[0]
          ...: 
    
    In [2314]: df.groupby(['DATE', 'ZipCodes']).agg(["mean", "median", mode])
    Out[2314]: 
                  SCORE            
                   mean median mode
    DATE ZipCodes                  
    2017 44        88.0   88.0   88
         55        90.0   90.0   90
         66        92.5   92.5   90
         77        96.0   96.0   96
    2018 33        90.0   90.0   90
         55        92.0   92.0   92
         66        97.0   97.0   97
    2019 55        96.0   96.0   96
         77        90.0   90.0   90
    
  • 2021-01-29 05:32

    You could use groupby to group the data by date and ZIP code, and then use .agg to apply the mean, median, and mode to each group. The code would look as follows:

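    # "mode" is not a groupby aggregation name, so take the first mode of each group explicitly.
    groupedData = df.groupby(["DATE", "Zip codes"]).agg({"Score": ["mean", "median", lambda s: s.mode().iloc[0]]})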
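
For reference, here is a minimal self-contained sketch of the grouping approach described above, using only pandas. The sample rows are made up for illustration, the column names DATE, ZipCodes, and SCORE are assumed from the first answer's output, and the helpers first_mode and yearly_stats are hypothetical names. When a group has several equally frequent values, this sketch simply picks the smallest one.

```python
import pandas as pd

# Hypothetical sample data; replace with the DataFrame read from your CSV/JSON.
df = pd.DataFrame({
    "DATE": ["2017-01-05", "2017-06-10", "2017-09-01", "2018-03-02", "2018-07-15"],
    "ZipCodes": [66, 66, 55, 55, 55],
    "SCORE": [90, 95, 90, 92, 92],
})


def first_mode(s: pd.Series):
    """Return the most frequent value; ties resolve to the smallest value."""
    return s.mode().iloc[0]


def yearly_stats(frame: pd.DataFrame) -> pd.DataFrame:
    """Mean, median, and mode of SCORE per (year, ZIP code)."""
    out = frame.copy()
    # Reduce each date to its year so that rows group per year.
    out["YEAR"] = pd.to_datetime(out["DATE"]).dt.year
    return out.groupby(["YEAR", "ZipCodes"])["SCORE"].agg(
        mean="mean", median="median", mode=first_mode
    )


stats = yearly_stats(df)

# Keep only one ZIP code; the result is indexed by year.
zip_55 = stats.xs(55, level="ZipCodes")
print(zip_55)
```

Selecting with xs after grouping keeps the statistics for every ZIP code available in stats; filtering the rows first (for example df[df["ZipCodes"] == 55]) and grouping by year alone would work as well.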
    