Question
I have a dataframe that records the number of observations at different locations for different years. I am trying to make a bar plot that shows the total number of observations at each location for each year. For each location, I want the totals for the different years to be shown in different colors. My approach is to first group by location and then calculate the total number of observations for each location group. (I don't think I need to change the index to date, as I am grouping by location.) I am not able to achieve this using the following code. Help will be much appreciated.
fig, ax = plt.subplots(figsize=(40,15))
date=df['date']
value=df['value']
df.date = pd.to_datetime(df.date)
year_start=2015
year_stop = 2019
#ax=plt.gca()
for year in range(year_start, year_stop+1):
    ax=plt.gca()
    m=df.groupby(['location']).agg({'value': ['count']})
    plt.ylim(0,45000)
    m.plot(kind='bar', legend = False, figsize=(30,15), fontsize = 30)
    #ax.tick_params(axis='both', which='major', labelsize=25)
    plt.ylabel('Number of observations - O3', fontsize = 30, fontweight = 'bold')
    plt.legend(loc='upper right', prop={'size': 7})
fig_title='Diurnal_'+place
plt.savefig(fig_title, format='png',dpi=500, bbox_inches="tight")
print ('saved=', fig_title)
plt.show()
The head of the dataframe looks like this:
date_utc date parameter \
212580 {utc=2020-01-05T05:45:00.000Z 2020-01-05T11:15:00+05:30 o3
212581 {utc=2020-01-05T05:45:00.000Z 2020-01-05T11:15:00+05:30 o3
212582 {utc=2020-01-05T05:45:00.000Z 2020-01-05T11:15:00+05:30 o3
212583 {utc=2020-01-05T05:45:00.000Z 2020-01-05T11:15:00+05:30 o3
212584 {utc=2020-01-05T05:45:00.000Z 2020-01-05T11:15:00+05:30 o3
location value unit city \
212580 ICRISAT Patancheru, Hyderabad - TSPCB 37.7 µg/m³ Hyderabad
212581 Bollaram Industrial Area, Hyderabad - TSPCB 39.5 µg/m³ Hyderabad
212582 IDA Pashamylaram, Hyderabad - TSPCB 17.8 µg/m³ Hyderabad
212583 Sanathnagar, Hyderabad - TSPCB 56.6 µg/m³ Hyderabad
212584 Zoo Park, Hyderabad - TSPCB 24.5 µg/m³ Hyderabad
Answer 1:
Since I could not fully reproduce your example, I built a toy example based on my understanding. Please tell me if I misunderstood something. Here is my code:
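import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# Toy data: one row per observation
df = pd.DataFrame([['Mumbai', 2017, 10], ['Mumbai', 2017, 12], ['Mumbai', 2018, 20], ['Mumbai', 2018, 23], ['Abu Dhabi', 2017, 30], ['Abu Dhabi', 2018, 25]], columns=['Place', 'Year', 'Amount'])
# Count the observations for each (Place, Year) pair
df_grouped = df.groupby(['Place', 'Year']).agg({'Amount': 'count'}).reset_index()
# One group of bars per place, one colored bar per year
sns.barplot(x='Place', y='Amount', hue='Year', data=df_grouped)
plt.show()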
This code draws a bar plot with the locations on the x-axis and the number of observations on the y-axis. Within each location, every unique year gets its own bar in a separate color.
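To apply the same idea to a dataframe shaped like the one in the question, the year first has to be derived from the date column. The following is only a minimal sketch under stated assumptions: the sample rows, the station names and the helper count_by_location_and_year are hypothetical, and it assumes the date strings all carry the same UTC offset, as in the printed head.

```python
import pandas as pd

# Hypothetical sample rows that mimic the question's columns (not real data).
df = pd.DataFrame({
    "date": [
        "2018-03-01T10:00:00+05:30",
        "2018-07-15T11:15:00+05:30",
        "2019-01-05T11:15:00+05:30",
        "2018-03-01T10:00:00+05:30",
        "2019-06-20T09:30:00+05:30",
        "2019-06-20T10:30:00+05:30",
    ],
    "location": ["Station A", "Station A", "Station A",
                 "Station B", "Station B", "Station B"],
    "value": [56.6, 40.1, 37.7, 24.5, 17.8, 39.5],
})


def count_by_location_and_year(frame):
    """Return a table with one row per location and one column per year."""
    # Parse the timestamps and keep only the (local) calendar year.
    years = pd.to_datetime(frame["date"]).dt.year.rename("year")
    # Count non-null values per (location, year), then spread years into columns.
    return (
        frame.groupby(["location", years])["value"]
        .count()
        .unstack(fill_value=0)
    )


counts = count_by_location_and_year(df)
print(counts)
```

With the table in this wide shape, counts.plot(kind='bar') should draw one group of bars per location and one colored bar per year, so no explicit loop over the years is needed.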
Source: https://stackoverflow.com/questions/59868638/iteration-over-years-to-plot-different-group-values-as-bar-plot-in-pandas