Suppose I have a DataFrame created like this:
import pandas as pd
s1 = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])
s2 = pd.Series(['a', 'f',
Recreating the dataframe:
import pandas as pd
s1 = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])
s2 = pd.Series(['a', 'f', 'a', 'd', 'a', 'f', 'f'])
d = pd.DataFrame({'s1': s1, 's2': s2})
To get the histogram with subplots as desired:
d.apply(pd.value_counts).plot(kind='bar', subplots=True)
The OP mentioned pd.value_counts in the question. I think the missing piece is just that there is no reason to "manually" create the desired bar plot.
The output from d.apply(pd.value_counts) is a pandas DataFrame. We can plot its values like any other DataFrame, and passing subplots=True gives us one bar chart per column, which is what we want.
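To make the two steps explicit, here is a minimal sketch that separates the counting from the plotting. It uses pd.Series.value_counts, the method form of pd.value_counts, because as far as I know recent pandas releases deprecate the top-level function. The helper name plot_counts is my own and not part of pandas, and it assumes matplotlib is installed.

```python
import pandas as pd

s1 = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])
s2 = pd.Series(['a', 'f', 'a', 'd', 'a', 'f', 'f'])
d = pd.DataFrame({'s1': s1, 's2': s2})

# Count the values in each column; labels missing from a column become NaN.
counts = d.apply(pd.Series.value_counts)


def plot_counts(counts):
    """Hypothetical helper: one bar chart per column (needs matplotlib)."""
    return counts.plot(kind='bar', subplots=True)
```

Calling plot_counts(counts) should then draw one subplot for s1 and one for s2, with the string labels on the x axis.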
You can use pd.value_counts (value_counts is also a Series method):
In [20]: d.apply(pd.value_counts)
Out[20]:
    s1   s2
a    3    3
b    2  NaN
c    1  NaN
d  NaN    1
f  NaN    3
and then plot the resulting DataFrame.
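Since value_counts is also a Series method, the same table can be built column by column. This is only a sketch: the names per_column, table and filled are mine, and replacing the NaN entries with 0 is an optional extra step I am assuming you might want before plotting, not something the answer requires.

```python
import pandas as pd

s1 = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])
s2 = pd.Series(['a', 'f', 'a', 'd', 'a', 'f', 'f'])

# Call the Series method on each column separately.
per_column = {'s1': s1.value_counts(), 's2': s2.value_counts()}

# Line the counts up side by side; the dict keys become the column names.
table = pd.concat(per_column, axis=1)

# Optional (my assumption): treat "label never seen" as a count of 0.
filled = table.fillna(0).astype(int)
```

Either table or filled can then be plotted with .plot(kind='bar', subplots=True).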
I would shove the Series into a collections.Counter (you might need to convert it to a list first). I am not a pandas expert, but I think you should be able to fold the Counter object back into a Series, indexed by the strings, and use that to make your plots.
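A rough sketch of that Counter round trip, assuming a plain Series of strings with no missing values. The variable names are mine, and the conversion to a list is done explicitly to stay on the safe side.

```python
from collections import Counter

import pandas as pd

s1 = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])

# Count occurrences of each string with the standard library.
counter = Counter(s1.tolist())

# A Counter is a dict subclass, so its keys become the index of the new Series.
counts = pd.Series(counter)
```

counts.plot(kind='bar') would then draw the bars, given matplotlib is available.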
This is not working because it is (rightly) raising errors when it tries to guess where the bin edges should be, which simply makes no sense with strings.