Missing data in pandas.crosstab

太阳男子 2021-02-03 13:35

I'm making some crosstabs with pandas:

a = np.array(['foo', 'foo', 'foo', 'bar', 'bar', 'foo', 'foo'], dtype=object)
b = np.array(['one', 'one'


        
2 Answers
  •  野趣味 (original poster)
    2021-02-03 14:02

    I don't think there is a way to do this, and crosstab calls pivot_table in the source, which doesn't seem to offer this either. I raised it as an issue here.

    A hacky workaround (which may or may not be the same as you were already using...):

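    from itertools import product
    import numpy as np
    import pandas as pd

    ct = pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
    # All (b, c) label pairs, including pairs that never occur in the data
    a_x_b = list(product(np.unique(b), np.unique(c)))
    a_x_b = pd.MultiIndex.from_tuples(a_x_b)

    # reindex_axis was removed in pandas 1.0; reindex(columns=...) replaces it
    In [15]: ct.reindex(columns=a_x_b, fill_value=0)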
    Out[15]:
          one          two
         dull  shiny  dull  shiny
    a
    bar     1      0     1      0
    foo     2      0     1      2
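
For anyone who wants to run the workaround end to end, here is a minimal self-contained sketch. Only `a` comes from the question; the question's `b` and `c` arrays are cut off, so the values below are assumptions chosen to reproduce the counts in the table above. The name `all_cols` is also just illustrative.

```python
from itertools import product

import numpy as np
import pandas as pd

# `a` is taken from the question; `b` and `c` are ASSUMED sample data,
# picked so that the counts match the table shown in the answer.
a = np.array(['foo', 'foo', 'foo', 'bar', 'bar', 'foo', 'foo'], dtype=object)
b = np.array(['one', 'one', 'two', 'one', 'two', 'two', 'two'], dtype=object)
c = np.array(['dull', 'dull', 'dull', 'dull', 'dull', 'shiny', 'shiny'], dtype=object)

# Plain crosstab: only (b, c) pairs that occur in the data get a column.
ct = pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])

# Every (b, c) combination, including ones that never occur.
all_cols = pd.MultiIndex.from_tuples(
    list(product(np.unique(b), np.unique(c))), names=['b', 'c']
)

# Reindex the columns against the full product; absent pairs are filled with 0.
full = ct.reindex(columns=all_cols, fill_value=0)
print(full)
```

Passing `fill_value=0` to `reindex` keeps the counts as integers, whereas reindexing first and calling `fillna(0)` afterwards would turn the newly added column into floats.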
    

    If product is too slow, here is a numpy implementation of it.
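
The linked NumPy implementation is not reproduced here. As a possible alternative, pandas can build the same column index directly with `pd.MultiIndex.from_product`, which skips the intermediate Python list of tuples. This is a sketch with assumed sample labels, not a benchmark, so whether it is faster for a given dataset would need to be measured.

```python
import numpy as np
import pandas as pd

# ASSUMED sample labels, for illustration only (not from the original post).
b = np.array(['one', 'one', 'two', 'one', 'two', 'two', 'two'], dtype=object)
c = np.array(['dull', 'dull', 'dull', 'dull', 'dull', 'shiny', 'shiny'], dtype=object)

# Cartesian product of the unique labels, built by pandas itself.
all_cols = pd.MultiIndex.from_product(
    [np.unique(b), np.unique(c)], names=['b', 'c']
)
print(list(all_cols))
```

The resulting index can be passed to `DataFrame.reindex(columns=...)` in the same way as one built with `from_tuples`.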
