Iteration over years to plot different group values as bar plot in pandas

Submitted by 依然范特西╮ on 2021-01-29 20:30:45

Question


I have a dataframe that records the number of observations at different locations for different years. I am trying to make a bar plot that shows the total number of observations at each location for each year, with the years for a given location shown in different colors. My approach is to group by location first and then calculate the total number of observations for each location group. (I don't think I need to change the index to date, as I am grouping by location.) I am not able to achieve this with the following code. Help will be much appreciated.

fig, ax = plt.subplots(figsize=(40,15))
date=df['date']
value=df['value']
df.date = pd.to_datetime(df.date)


year_start=2015
year_stop = 2019
#ax=plt.gca()

for year in range(year_start, year_stop+1):
    ax=plt.gca()
    m=df.groupby(['location']).agg({'value': ['count']})


    plt.ylim(0,45000)
    m.plot(kind='bar', legend = False, figsize=(30,15), fontsize = 30)
    #ax.tick_params(axis='both', which='major', labelsize=25)
    plt.ylabel('Number of observations - O3', fontsize = 30, fontweight = 'bold')    

    plt.legend(loc='upper right', prop={'size': 7})
    fig_title='Diurnal_'+place
    plt.savefig(fig_title, format='png',dpi=500, bbox_inches="tight")

    print ('saved=', fig_title)
    plt.show()


The first few rows of the dataframe look like this:
                             date_utc                       date parameter  \
    212580  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3   
    212581  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3   
    212582  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3   
    212583  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3   
    212584  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3   

                                               location  value   unit       city  \
    212580        ICRISAT Patancheru, Mumbai - TSPCB   37.7  µg/m³  Hyderabad   
    212581  Bollaram Industrial Area, Surat - TSPCB   39.5  µg/m³  Hyderabad   
    212582          IDA Pashamylaram, Surat - TSPCB   17.8  µg/m³  Hyderabad   
    212583               Sanathnagar, Hyderabad - TSPCB   56.6  µg/m³  Hyderabad   
    212584                  Zoo Park, Hyderabad - TSPCB   24.5  µg/m³  Hyderabad   

Answer 1:


Since I was not able to fully reproduce your example, I built a toy example from what I understood. Please tell me if I misunderstood something. Here is my code:

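import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data: one row per observation
df = pd.DataFrame(
    [['Mumbai', 2017, 10], ['Mumbai', 2017, 12], ['Mumbai', 2018, 20],
     ['Mumbai', 2018, 23], ['Abu Dhabi', 2017, 30], ['Abu Dhabi', 2018, 25]],
    columns=['Place', 'Year', 'Amount'])

# Count the observations for each (Place, Year) pair
df_grouped = df.groupby(['Place', 'Year']).agg({'Amount': 'count'}).reset_index()

# One group of bars per place, one colored bar per year
sns.barplot(x='Place', y='Amount', hue='Year', data=df_grouped)
plt.show()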

This code produces a bar plot with each location on the x-axis and its observation count on the y-axis. Within each location, every unique year gets its own bar in a different color.
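The same grouping can probably be done with pandas plotting alone, without a loop over years. The sketch below is not part of the original answer. It assumes the question's column names (date, location, value), uses made-up toy rows, and assumes every date string carries the same UTC offset so that pd.to_datetime parses the column in one pass. The helper name count_by_location_and_year is hypothetical.

```python
import pandas as pd

# Toy rows with the question's column names; the values are invented.
df = pd.DataFrame({
    "date": [
        "2017-03-01T10:00:00+05:30", "2017-06-01T10:00:00+05:30",
        "2018-03-01T10:00:00+05:30", "2017-03-01T10:00:00+05:30",
        "2018-03-01T10:00:00+05:30", "2018-06-01T10:00:00+05:30",
    ],
    "location": ["Sanathnagar", "Sanathnagar", "Sanathnagar",
                 "Zoo Park", "Zoo Park", "Zoo Park"],
    "value": [37.7, 39.5, 17.8, 56.6, 24.5, 30.1],
})


def count_by_location_and_year(frame):
    """Return observation counts: one row per location, one column per year."""
    # Extract the year from the date column (assumes a single UTC offset).
    years = pd.to_datetime(frame["date"]).dt.year.rename("year")
    # Group by location and year, count non-null values, then pivot years to columns.
    return frame.groupby(["location", years])["value"].count().unstack(fill_value=0)


counts = count_by_location_and_year(df)

# Each column of `counts` becomes one colored bar per location.
ax = counts.plot(kind="bar", figsize=(10, 5))
ax.set_ylabel("Number of observations")
ax.legend(title="year")
# Call matplotlib.pyplot.show() or ax.figure.savefig(...) to display or save the figure.
```

Because unstack moves the years into columns, a single plot call draws all years side by side for each location, which is what the loop in the question was trying to build one year at a time.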



Source: https://stackoverflow.com/questions/59868638/iteration-over-years-to-plot-different-group-values-as-bar-plot-in-pandas
