Question
I want to find the mean, median, and mode of the scores for each year, given a specific ZIP code. How can I achieve this? I have already read the data from a CSV file, converted it to JSON, and loaded it into a DataFrame. The following table is only a sample; the actual data set is larger.
Answer 1:
Use scipy.stats.mstats:
In [2291]: import scipy.stats.mstats as mstats
In [2295]: df.DATE = pd.to_datetime(df.DATE).dt.year
In [2313]: def mode(x):
      ...:     return mstats.mode(x, axis=None)[0]
      ...:
In [2314]: df.groupby(['DATE', 'ZipCodes']).agg(["mean", "median", mode])
Out[2314]:
              SCORE
               mean median mode
DATE ZipCodes
2017 44        88.0   88.0   88
     55        90.0   90.0   90
     66        92.5   92.5   90
     77        96.0   96.0   96
2018 33        90.0   90.0   90
     55        92.0   92.0   92
     66        97.0   97.0   97
2019 55        96.0   96.0   96
     77        90.0   90.0   90
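A group can have more than one most-frequent value, and a single "mode" column then has to pick one of them. The row for 2017 and ZIP code 66 reports a mode of 90 next to a mean and median of 92.5, which is what a tie between two scores such as 90 and 95 would look like (the underlying scores are not shown, so this is only a guess). A minimal pandas-only sketch of that tie behaviour, using made-up scores rather than the question's data:

```python
import pandas as pd

# Hypothetical scores for one (year, ZIP code) group; not taken from the question's data.
scores = pd.Series([90, 95])

# Series.mode() returns every most-frequent value, sorted in ascending order.
all_modes = scores.mode().tolist()

# Taking the first entry gives a single value per group (the smallest one on a tie).
first_mode = scores.mode().iloc[0]

print(all_modes)
print(first_mode)
```

If ties matter for the analysis, keeping the full list from Series.mode() may be preferable to collapsing it to one number.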
Answer 2:
You could use groupby to group the data by date and ZIP code, and then use the .agg function to apply the mean, median, and mode to each group. The code would look as follows:
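# "mode" is not a built-in groupby aggregation name in pandas, so a callable is passed instead
groupedData = df.groupby(["DATE", "Zip codes"]).agg({"Score": ["mean", "median", lambda s: s.mode().iloc[0]]})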
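The question asks for the statistics of one specific ZIP code, so one option is to filter the rows first and then group by year only. The sketch below assumes the column names DATE, ZipCodes and SCORE seen in the output of Answer 1; the sample rows and the helper yearly_stats_for_zip are hypothetical and only serve as illustration.

```python
import pandas as pd

# Hypothetical sample data; column names follow the output shown in Answer 1.
df = pd.DataFrame({
    "DATE": ["2017-01-05", "2017-06-10", "2017-09-01",
             "2018-03-15", "2018-07-20", "2017-02-02"],
    "ZipCodes": [55, 55, 55, 55, 55, 66],
    "SCORE": [90, 90, 96, 92, 92, 88],
})

def mode(s):
    """Return the smallest most-frequent value of a Series."""
    return s.mode().iloc[0]

def yearly_stats_for_zip(frame, zip_code):
    """Mean, median and mode of SCORE per year for one ZIP code (hypothetical helper)."""
    # Keep only the rows for the requested ZIP code.
    subset = frame[frame["ZipCodes"] == zip_code]
    # Derive the year from the date column without modifying the input frame.
    year = pd.to_datetime(subset["DATE"]).dt.year.rename("YEAR")
    # Aggregate the scores per year; the custom function's name becomes the column label.
    return subset.groupby(year)["SCORE"].agg(["mean", "median", mode])

stats = yearly_stats_for_zip(df, 55)
print(stats)
```

Filtering before grouping keeps the result indexed by year alone, which may be easier to read than a two-level index when only one ZIP code is of interest.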
Source: https://stackoverflow.com/questions/65439932/mean-median-and-mode-of-a-list-of-values-score-given-a-certain-zip-code-for