Question
I have a dataframe that looks like this:
I have used a barplot to represent the subscribers for each row. This is what I did:
data = channels.sort_values('subscribers', ascending=False).head(5)
chart = sns.barplot(x = 'name', y='subscribers',data=data)
chart.set_xticklabels(chart.get_xticklabels(), rotation=90)
for p in chart.patches:
    chart.annotate("{:,.2f}".format(p.get_height()), (p.get_x() + p.get_width() / 2., p.get_height()), ha='center', va='center', xytext=(0, 10), textcoords='offset points')
Now I want to show the 'video_count' for each user on this same plot. The goal is to compare how the number of subscribers relates to the number of videos. How can I depict this on the chart?
Answer 1:
- The data needs to be converted to a long format using .stack.
- Because of the scale of the values, 'log' is used for the yscale.
- All of the categories in 'cats' are included for the example.
  - Select only the desired columns before stacking, or use dfl = dfl[dfl.cats.isin(['sub', 'vc'])] to filter for the desired 'cats'.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# setup dataframe
data = {'vc': [76, 47, 140, 106, 246], 'tv': [29645400, 28770702, 50234486, 30704017, 272551386], 'sub': [66100, 15900, 44500, 37000, 76700], 'name': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)
# convert to long
dfl = df.set_index('name').stack().reset_index().rename(columns={'level_1': 'cats', 0: 'values'}).sort_values('values', ascending=False).reset_index(drop=True)
# plot
chart = sns.barplot(x='name', y='values', data=dfl, hue='cats')
chart.set_xticklabels(chart.get_xticklabels(), rotation=90)
plt.yscale('log')
for p in chart.patches:
    chart.annotate("{:,.2f}".format(p.get_height()), (p.get_x() + p.get_width() / 2., p.get_height()), ha='center', va='center', xytext=(0, 10), textcoords='offset points')
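The two ways of restricting the plot to subscribers and video count, as mentioned in the list above, might look roughly like the sketch below. It reuses the answer's example data; the names dfl_selected, dfl_all and dfl_filtered are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# same example data as in the answer
data = {'vc': [76, 47, 140, 106, 246],
        'tv': [29645400, 28770702, 50234486, 30704017, 272551386],
        'sub': [66100, 15900, 44500, 37000, 76700],
        'name': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)

# Option 1: select only the desired columns before stacking
dfl_selected = (df[['name', 'sub', 'vc']]
                .set_index('name')
                .stack()
                .reset_index()
                .rename(columns={'level_1': 'cats', 0: 'values'}))

# Option 2: stack every column, then keep only the desired categories
dfl_all = (df.set_index('name')
             .stack()
             .reset_index()
             .rename(columns={'level_1': 'cats', 0: 'values'}))
dfl_filtered = dfl_all[dfl_all.cats.isin(['sub', 'vc'])].reset_index(drop=True)

print(dfl_filtered)
```

Either frame can then be passed as data to the sns.barplot call shown above with hue='cats', so that each name gets one bar for 'sub' and one for 'vc'.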
Source: https://stackoverflow.com/questions/63220741/how-to-include-multiple-data-columns-in-a-seaborn-barplot