Question
I have the following dataframe:
meteo = [["January", 9.2, 13.6, 4.7, 37, 70],
["February",9.9, 14.3, 5.4, 35, 70],
["March", 11.8, 16.1, 7.4, 36, 70],
["April", 13.7, 18.0, 9.4, 40, 69],
["May", 16.9, 21.1, 12.8, 47, 70],
["June", 20.9, 24.9, 16.8, 30, 68],
["July", 23.9, 28.0, 19.8, 21, 67],
["August", 24.4, 28.5, 20.2, 62, 68],
["September", 21.7, 26.0, 17.4, 81, 70],
["October", 17.8, 22.1, 13.5, 91, 73],
["November", 13.0, 17.3, 8.6, 59, 71],
["December", 10.0, 14.3, 5.7, 40, 69]]
import pandas as pd
import matplotlib.pyplot as plt
# Create dataframe with above data
df = pd.DataFrame(meteo)
# Drop useless column
df.drop(0, inplace = True, axis = 1)
# Rename columns
df.rename(columns = {1: "Temp_media_anual_mes", 2: "Temp_máxima_media", 3: "Temp_mínima_media", 4: "Media_lluvias_mensual", 5:"Humedad_media_rel"}, inplace = True)
df["mes"] = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
Now I would like to plot a grouped bar chart with 5 bars per month. I have tried this, but I have a problem with the spacing between the bars:
# Setting the positions and width for the bars
pos = list(range(len(df.mes)))
width = 0.25
# Plotting the bars
fig, ax = plt.subplots(figsize=(16,10))
# Create a bar with pre_score data,
# in position pos,
plt.bar(pos,
#using df['pre_score'] data,
df['Temp_media_anual_mes'],
# of width
width,
# with alpha 0.5
alpha=0.5,
# with color
color='red')
# with label the first value in first_name
#label=df['first_name'][0])
# Create a bar with mid_score data,
# in position pos + some width buffer,
plt.bar([p + width for p in pos],
#using df['mid_score'] data,
df['Temp_máxima_media'],
# of width
width,
# with alpha 0.5
alpha=0.5,
# with color
color='green')
# with label the second value in first_name
#label=df['first_name'][1])
# Create a bar with post_score data,
# in position pos + some width buffer,
plt.bar([p + width*2 for p in pos],
#using df['post_score'] data,
df['Temp_mínima_media'],
# of width
width,
# with alpha 0.5
alpha=0.5,
# with color
color='blue')
# with label the third value in first_name
#label=df['first_name'][2])
plt.bar([p + width*2 for p in pos],
#using df['post_score'] data,
df['Media_lluvias_mensual'],
# of width
width,
# with alpha 0.5
alpha=0.5,
# with color
color='orange')
# with label the third value in first_name
#label=df['first_name'][2])
plt.bar([p + width*2 for p in pos],
#using df['post_score'] data,
df['Humedad_media_rel'],
# of width
width,
# with alpha 0.5
alpha=0.5,
# with color
color='purple')
# with label the third value in first_name
#label=df['first_name'][2])
# Set the y axis label
ax.set_ylabel('Amount')
# Set the chart's title
ax.set_title('Rain and temperature')
# Set the position of the x ticks
ax.set_xticks([p + 1.5 * width for p in pos])
# Set the labels for the x ticks
ax.set_xticklabels(df['mes'])
# Setting the x-axis and y-axis limits
plt.xlim(min(pos)-width, max(pos)+width*4)
plt.ylim([0, max(df['Temp_media_anual_mes'] + df['Temp_máxima_media'] + df['Temp_mínima_media'] + df["Media_lluvias_mensual"] + df["Humedad_media_rel"])] )
plt.grid()
plt.show()
This is the plot I'm getting:
As you can see, it shows 3 separate bars per month, and the third one is actually 3 bars drawn one behind another. I know the issue is in the spacing between the bars, but I don't know how to fix it. Could someone point me in the right direction, please?
EDIT:
I would also like to display, above each bar, the measurement unit of the plotted value. These are:
- Celsius degrees for temperatures
- mm for precipitation amounts
- % for relative humidity
Thank you very much in advance
Answer 1:
Here is some code that places the bars side by side and centers the month name under each group.
Note that the original calculation for ylim was wrong: it takes the maximum of the row-wise sum of all five columns, whereas the upper limit should be the largest single value across the columns, plus some headroom. I also added some text with the units above the columns. I tried to find some suitable colors: reddish yellow for the temperatures, blue for rain, and bluish green for humidity.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
meteo = [["January", 9.2, 13.6, 4.7, 37, 70],
["February", 9.9, 14.3, 5.4, 35, 70],
["March", 11.8, 16.1, 7.4, 36, 70],
["April", 13.7, 18.0, 9.4, 40, 69],
["May", 16.9, 21.1, 12.8, 47, 70],
["June", 20.9, 24.9, 16.8, 30, 68],
["July", 23.9, 28.0, 19.8, 21, 67],
["August", 24.4, 28.5, 20.2, 62, 68],
["September", 21.7, 26.0, 17.4, 81, 70],
["October", 17.8, 22.1, 13.5, 91, 73],
["November", 13.0, 17.3, 8.6, 59, 71],
["December", 10.0, 14.3, 5.7, 40, 69]]
df = pd.DataFrame(meteo)
#df.rename(columns = {0:"mes", 1: "Temp. media mes", 2: "Temp. máxima media", 3: "Temp. mínima media", 4: "Media lluvias mensual", 5:"Humedad media rel"}, inplace = True)
df.rename(columns = {0:"month", 1: "Mean monthly temperature", 2: "Max. monthly temperature", 3: "Min. monthly temperature", 4: "Mean monthly rainfall", 5:"Mean relative humidity"}, inplace = True)
# Setting the positions and width for the bars
pos = list(range(len(df)))
num_col = len(df.columns) - 1
width = 0.95 / num_col
fig, ax = plt.subplots(figsize=(16,10))
bar_colors = ['#feb24c', '#f03b20', '#ffeda0', '#43a2ca', '#a8ddb5']
bar_labels = df.columns[1:]
for i, (colname, color, lbl) in enumerate(zip(df.columns[1:], bar_colors, bar_labels)):
    delta_p = 0.125 + width*i
    plt.bar([p + delta_p for p in pos],
            df[colname], width, color=color, label=lbl)
    for j in range(len(df)):
        ax.annotate("°C" if i < 3 else "mm" if i == 3 else "%",
                    xy=(pos[j] + delta_p, df[colname][j] + 1),
                    ha='center')
ax.set_ylabel('Amount')
ax.set_title('Temperatures, Rain and Humidity')
ax.set_xticks(pos)
def update_ticks(x, pos):
    return df['month'][pos]
ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(ticker.FuncFormatter(update_ticks))
ax.xaxis.set_minor_locator(ticker.FixedLocator([p+0.5 for p in pos]))
for tick in ax.xaxis.get_minor_ticks():
    tick.tick1line.set_markersize(0)
    tick.tick2line.set_markersize(0)
    tick.label1.set_horizontalalignment('center')
plt.xlim(min(pos), max(pos)+1)
plt.ylim([0, 10+max([max(df[colname]) for colname in df.columns[1:]])])
plt.legend()
plt.grid()
plt.show()
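The spacing and y-limit logic used above can be isolated into two small helpers. This is only a sketch and not part of the original answer: the names `bar_offsets` and `y_upper_limit` and the parameters `group_width` and `headroom` are hypothetical, chosen for illustration. Unlike the hard-coded 0.125 shift above, the helper centers the group exactly inside each month slot of width 1.

```python
def bar_offsets(num_series, group_width=0.95):
    """Return (width, offsets) for one group of bars.

    width   -- width of a single bar
    offsets -- center of each series' bar, measured from the left edge of a
               slot of width 1, with the whole group centered in the slot
    """
    width = group_width / num_series
    margin = (1 - group_width) / 2  # empty space on each side of the group
    offsets = [margin + width * (i + 0.5) for i in range(num_series)]
    return width, offsets


def y_upper_limit(columns, headroom=10):
    """Largest single value over all columns, plus room for the unit labels."""
    return max(max(col) for col in columns) + headroom


# Five series per month: every series gets its own, distinct offset,
# so no bar is drawn behind another one.
width, offsets = bar_offsets(5)

# Two months of three of the columns from the data above (hypothetical subset).
sample = {
    "Max. monthly temperature": [13.6, 28.5],
    "Mean monthly rainfall": [37, 91],
    "Mean relative humidity": [70, 73],
}
top = y_upper_limit(sample.values())  # 91 + 10
```

With these helpers, the bar for series i in month p would be drawn at `p + offsets[i]` with the returned `width`, and `top` would be passed as the upper bound to `plt.ylim`.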
Answer 2:
You don't need that many plot calls. You can do it in one go.
>>> ax = df.plot.bar(x='mes', y=list(df.columns[:5]))
>>> plt.show()
Regarding displaying a value above each bar, you can refer to this post, where I have explained how to add text to the top of a histogram. You can do the same for the bar plot:
How can I add the counts to the histogram plot?
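As a rough illustration of combining the single plot call with labels above the bars, here is a minimal sketch. It is not necessarily the approach from the linked post: it assumes that pandas creates one bar container per plotted column (available as `ax.containers`), and it uses a three-month subset of the question's data to stay short. The `units` list is an assumption derived from the EDIT in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Three-month subset of the question's data, with the question's column names.
df = pd.DataFrame({
    "mes": ["January", "February", "March"],
    "Temp_media_anual_mes": [9.2, 9.9, 11.8],
    "Temp_máxima_media": [13.6, 14.3, 16.1],
    "Temp_mínima_media": [4.7, 5.4, 7.4],
    "Media_lluvias_mensual": [37, 35, 36],
    "Humedad_media_rel": [70, 70, 70],
})

# One unit per plotted column, in column order (assumed from the question's EDIT).
units = ["°C", "°C", "°C", "mm", "%"]

value_columns = [c for c in df.columns if c != "mes"]
ax = df.plot.bar(x="mes", y=value_columns, figsize=(16, 10))

# One container per column; each container holds one rectangle per month.
for container, unit in zip(ax.containers, units):
    for rect in container:
        ax.annotate(
            unit,
            xy=(rect.get_x() + rect.get_width() / 2, rect.get_height()),
            xytext=(0, 3),  # shift the text 3 points above the bar top
            textcoords="offset points",
            ha="center",
            va="bottom",
        )
```

Call `plt.show()` afterwards to display the figure. On matplotlib 3.4 or newer, `ax.bar_label(container, labels=[unit] * len(container))` should give a similar result with less code.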
Source: https://stackoverflow.com/questions/59066811/how-can-a-plot-a-5-grouped-bars-bar-chart-in-matplotlib