Pandas: add crosstab totals

有刺的猬 2020-12-20 20:53

How can I add to my crosstab an additional row and an additional column for the totals?

df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.rand

3 Answers
  • 2020-12-20 21:51

    pandas.crosstab already has a margins option, which does exactly what you want.

    > df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.random.randint(0,2,100)})
    > pd.crosstab(df.A, df.B, margins=True)
    B     0   1  All
    A               
    0    26  21   47
    1    25  28   53
    All  51  49  100
    

    With margins=True, the resulting frequency table gains an "All" column holding the row totals and an "All" row holding the column totals; the bottom-right cell is the grand total.
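As a self-contained version of the snippet above, here is a minimal sketch. The seeded generator `rng` and the label "Total" are assumptions added for illustration (the seed only makes the table reproducible); `margins_name` is the crosstab parameter that renames the default "All" label.

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption, not in the original) so reruns give the same table.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})

# margins=True appends an "All" row and an "All" column with the totals.
ct = pd.crosstab(df["A"], df["B"], margins=True)

# The bottom-right cell is the grand total, i.e. the number of rows in df.
grand_total = ct.loc["All", "All"]

# margins_name changes the label used for the totals row and column.
ct_total = pd.crosstab(df["A"], df["B"], margins=True, margins_name="Total")
print(ct_total)
```

Note that the "All" label is a string, while the other row and column labels here are integers, so use bracket or .loc indexing to read the totals.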

  • 2020-12-20 21:56

    This is because attribute-style column access (ct.name) does not work with integer column names. Use standard bracket indexing instead:

    In [122]: ct["Total"] = ct[0] + ct[1]
    
    In [123]: ct
    Out[123]:
    B   0   1  Total
    A
    0  26  24     50
    1  30  20     50
    

    See the warnings at the end of this section in the docs: http://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

    When you want to work with the rows, you can use .loc:

    In [126]: ct.loc["Total"] = ct.loc[0] + ct.loc[1]
    

    In this case ct.loc["Total"] is equivalent to ct.loc["Total", :].
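The manual approach above can be generalized so it does not depend on there being exactly two categories. This is a rough sketch, not the answerer's code: it swaps the explicit additions for `DataFrame.sum`, and the seeded `rng` is an assumption added for reproducibility.

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption) so the example is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})
ct = pd.crosstab(df["A"], df["B"])

# Add the column of row totals first, using bracket indexing
# because the existing column labels are integers.
ct["Total"] = ct.sum(axis=1)

# Then add the row of column totals. Since "Total" is already a column,
# this row also sums it, which fills in the grand total.
ct.loc["Total"] = ct.sum(axis=0)
print(ct)
```

The order matters: adding the row first and the column second gives the same table here, but doing the column first makes it explicit that the corner cell is the sum of the row totals.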

  • 2020-12-20 21:56

    Pass margins=True to crosstab. That should do the job!
