Pandas: add crosstab totals


How can I add to my crosstab an additional row and an additional column for the totals?

df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.rand...


                      
3 Answers

自闭症患者, 2020-12-20 21:51

In fact, pandas.crosstab already provides a margins option that does exactly what you want.

> import numpy as np
> import pandas as pd
> df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.random.randint(0,2,100)})
> pd.crosstab(df.A, df.B, margins=True)
B     0   1  All
A               
0    26  21   47
1    25  28   53
All  51  49  100


Setting margins=True adds an "All" row and an "All" column to the resulting frequency table, holding the row and column totals.
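
As a small illustration on fixed data (so the counts are reproducible), the sketch below also passes margins_name, the crosstab parameter that renames the default "All" label. The toy values and the "Total" label are made up for this example.

```python
import pandas as pd

# Small fixed data set (made up for illustration) so the counts are reproducible.
df = pd.DataFrame({
    "A": [0, 0, 0, 1, 1, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 1, 1, 1],
})

# margins=True appends a totals row and a totals column;
# margins_name replaces the default "All" label with "Total".
ct = pd.crosstab(df["A"], df["B"], margins=True, margins_name="Total")
print(ct)
```

The bottom-right cell, ct.loc["Total", "Total"], then holds the overall number of observations.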
                                                                        
                                                        
            
            
              
                

一生所求, 2020-12-20 21:56

This is because attribute-like column access does not work with integer column names such as 0 and 1. Use standard indexing instead:

In [122]: ct["Total"] = ct[0] + ct[1]

In [123]: ct
Out[123]:
B   0   1  Total
A
0  26  24     50
1  30  20     50


See the warnings at the end of this section in the docs: http://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

When you want to work with the rows, you can use .loc:

In [126]: ct.loc["Total"] = ct.loc[0] + ct.loc[1]


In this case ct.loc["Total"] is equivalent to ct.loc["Total", :], i.e. it addresses every column of the "Total" row.
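
A more general variant of the same idea, sketched here on made-up data, uses sum so that it does not depend on the labels being exactly 0 and 1. The data values are assumptions for illustration only.

```python
import pandas as pd

# Small fixed data set (made up for illustration).
df = pd.DataFrame({
    "A": [0, 0, 0, 1, 1, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 1, 1, 1],
})
ct = pd.crosstab(df["A"], df["B"])

# Add the totals column first: sum across the columns of each row.
ct["Total"] = ct.sum(axis=1)

# Then add the totals row. Because the "Total" column already exists,
# summing it puts the grand total in the bottom-right cell.
ct.loc["Total"] = ct.sum(axis=0)
print(ct)
```

The order matters: adding the column before the row is what fills in the grand total without a separate step.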
                                                                        
                                                        
            
            
              
                

孤独总比滥情好, 2020-12-20 21:56

Pass margins=True to crosstab for this. That should do the job!
                                                                        
                                                        
            
            
              
                