I'm making some crosstabs with pandas:
a = np.array(['foo', 'foo', 'foo', 'bar', 'bar', 'foo', 'foo'], dtype=object)
b = np.array(['one', 'one
The crosstab function has a parameter called dropna which is set to True by default. This parameter defines whether empty columns (such as the one-shiny column) should be displayed or not.
I tried calling the function like this:
pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'], dropna=False)
and this is what I got:
b    one         two
c   dull shiny  dull shiny
a
bar    1     0     1     0
foo    2     0     1     2
Hope that was still helpful.
I don't think there is a way to do this, and crosstab calls pivot_table in the source, which doesn't seem to offer this either. I raised it as an issue here.
A hacky workaround (which may or may not be the same as you were already using...):
from itertools import product
ct = pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
a_x_b = list(product(np.unique(b), np.unique(c)))
a_x_b = pd.MultiIndex.from_tuples(a_x_b)
In [15]: ct.reindex(columns=a_x_b).fillna(0)  # reindex_axis was removed in pandas 1.0
Out[15]:
     one         two
    dull shiny  dull shiny
a
bar    1     0     1     0
foo    2     0     1     2
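For reference, the same workaround can be sketched as a self-contained script on a recent pandas. The sample arrays below are hypothetical (they are not the arrays from the question) and are chosen so that no row combines 'one' with 'shiny'. The names all_cols and full are made up for illustration, and MultiIndex.from_product stands in for the itertools.product step.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: no row has b == 'one' together with c == 'shiny'.
a = np.array(['foo', 'foo', 'bar', 'bar', 'foo'], dtype=object)
b = np.array(['one', 'one', 'two', 'one', 'two'], dtype=object)
c = np.array(['dull', 'dull', 'dull', 'dull', 'shiny'], dtype=object)

# Plain crosstab: only the observed (b, c) combinations become columns.
ct = pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])

# Every (b, c) combination, observed or not.
all_cols = pd.MultiIndex.from_product(
    [np.unique(b), np.unique(c)], names=['b', 'c']
)

# Reindex the columns; combinations that never occurred are filled with 0.
full = ct.reindex(columns=all_cols, fill_value=0)
print(full)
```

Passing fill_value=0 to reindex avoids the separate fillna(0) call, and naming the levels in from_product keeps the 'b' and 'c' labels on the column index.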
If product is too slow, here is a numpy implementation of it.
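As a rough idea of what a numpy-based Cartesian product can look like, here is a minimal sketch built on np.meshgrid. It is not the linked implementation; the function name cartesian_product and the sample labels are assumptions made for illustration.

```python
import numpy as np

def cartesian_product(x, y):
    """Return every (x_i, y_j) pair as a row of a 2-column array.

    Hypothetical helper: x varies slowest, matching the order of
    itertools.product(x, y).
    """
    xx, yy = np.meshgrid(x, y, indexing="ij")
    return np.column_stack([xx.ravel(), yy.ravel()])

pairs = cartesian_product(np.array(["one", "two"]), np.array(["dull", "shiny"]))
print(pairs)
```

The rows of pairs can then be turned into column labels, for example with pd.MultiIndex.from_arrays(pairs.T).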