matplotlib dataframe 2 column [dates, non-numerical-data] stacked bar chart defining attributes

守給你的承諾、 submitted on 2020-06-17 13:19:08

Question


Background:

I have managed to create the following graph, but I am having difficulty with some of its elements.

Disclaimer:

The graph below is what I want to achieve, but I would like to incorporate the changes listed in my question into it. If there is an alternative way to obtain a stacked bar chart that shows all the dates, please feel free to share the code with me.

Question:

How can I define the following:

  • Make the bars wider
  • Make the y-axis ticks integers
  • Change the date format of the x-axis to %a %d/%b/%y
  • Define the chart size (400 by 800); it is a little small, and I think the dates are getting cut off
  • Add the title "this is my chart" to the chart
  • Add the labels "this is x axis" and "this is y-axis" to the x and y axes

MWE:

import datetime as dt
import mysql.connector
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


mycursor.execute(query)
data = mycursor.fetchall()

df = pd.DataFrame(data, columns=['date', 'Operation'])

df['date'] = pd.to_datetime(df.date)

all_dates = pd.date_range('2020-05-01','2020-05-31', freq='D').date

(pd.crosstab(df.date,df.Operation)
.reindex(all_dates)
.plot.bar(stacked=True, color=COLOR_LIST)
)

filename = "\\TEST_month_of_{}.png".format("May").lower()
plt.savefig(CURRENT_DIRECTORY + filename)
print("\n\nGenerated: {}".format(CURRENT_DIRECTORY + filename))

Data set:

print(df) yields the following:

date          Operation
2020-05-07        A
2020-05-08        B
2020-05-08        A
2020-05-12        A
2020-05-12        A
2020-05-12        B
2020-05-13        C
2020-05-13        A
2020-05-13        B
2020-05-14        A
2020-05-19        B
2020-05-21        A
2020-05-25        A
2020-05-26        B
2020-05-26        C
2020-05-26        A
2020-05-26        A
2020-05-29        A

Answer 1:


As for the date format, I could not get the requested format to work because of locale differences. To make the bars wider, the approach used here is to reduce the number of bars: dates with no data are dropped, so each remaining bar gets more of the figure width.

import pandas as pd
import numpy as np
import io

data = '''
date Operation
2020-05-07 A
2020-05-08 B
2020-05-08 A
2020-05-12 A
2020-05-12 A
2020-05-12 B
2020-05-13 C
2020-05-13 A
2020-05-13 B
2020-05-14 A
2020-05-19 B
2020-05-21 A
2020-05-25 A
2020-05-26 B
2020-05-26 C
2020-05-26 A
2020-05-26 A
2020-05-29 A
'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+') # raw string avoids an invalid-escape warning
df['date'] = pd.to_datetime(df['date'])
all_dates = pd.date_range('2020-05-01','2020-05-31', freq='D').date
df2 = pd.crosstab(df.date,df.Operation).reindex(all_dates)

import matplotlib.pyplot as plt

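fig, ax = plt.subplots(figsize=(4,8),dpi=100) # Define the chart size (400 by 800 pixels)

df2.dropna(inplace=True) # Drop dates with no data so the remaining bars are wider

df2.plot.bar(stacked=True, ax=ax) # Draw on the sized axes; without ax=, pandas opens a new default-size figure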

ax.set_title('this is my chart') # Add a this is my chart title to the chart

ax.set_xlabel('this is x-axis') # Add labels (this is x axis, this is y-axis) to the x & y axis ?
ax.set_ylabel('this is y-axis')

start, end = ax.get_ylim()
ax.yaxis.set_ticks(np.arange(start, end, 1)) # Make the y-axis integers
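
If every date in the month should stay on the chart, the following is a minimal sketch of one alternative, not the answerer's method. It assumes that "400 by 800" means 800 pixels wide by 400 pixels tall (swap the figsize values for a portrait chart), uses only a few rows of the question's data, and relies on the default C locale for the English day and month abbreviations. The `width` argument of `plot.bar` sets how much of each slot a bar fills, `strftime` builds the tick labels, and `MaxNLocator(integer=True)` restricts the y-axis to integer ticks.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MaxNLocator

# A few rows taken from the question's data set
data = """date Operation
2020-05-07 A
2020-05-08 B
2020-05-08 A
2020-05-12 A
2020-05-12 A
2020-05-12 B
2020-05-13 C
"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+", parse_dates=["date"])

# Keep every day of the month; days without data become all-zero rows
all_dates = pd.date_range("2020-05-01", "2020-05-31", freq="D")
counts = pd.crosstab(df["date"], df["Operation"]).reindex(all_dates, fill_value=0)

# Assumed orientation: 8 x 4 inches at 100 dpi gives 800 x 400 pixels
fig, ax = plt.subplots(figsize=(8, 4), dpi=100)

# width is the fraction of each slot that a bar fills (pandas defaults to 0.5)
counts.plot.bar(stacked=True, width=0.9, ax=ax)

# Bar charts place bars at positions 0..n-1, so relabel the ticks with formatted dates
ax.set_xticklabels(counts.index.strftime("%a %d/%b/%y"), rotation=90)

# Only allow integer tick values on the y-axis
ax.yaxis.set_major_locator(MaxNLocator(integer=True))

ax.set_title("this is my chart")
ax.set_xlabel("this is x axis")
ax.set_ylabel("this is y-axis")

fig.tight_layout()  # helps keep the rotated date labels from being cut off
```

To write the image to disk, call `fig.savefig(...)` with a path of your choice after `tight_layout()`.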



Source: https://stackoverflow.com/questions/62205274/matplotlib-dataframe-2-column-dates-non-numerical-data-stacked-bar-chart-defi