Summarising by a group variable in R

后端未结


I have a data frame as follows:

 head(newStormObject)
     FATALITIES   INJURIES    PROPVALDMG CROPVALDMG      EVTYPE     total
 1           0          15    2.5


                      


      
      
        
2 Answers

        
                         				            
            
           
            
                              
                
              
              
                
不思量自难忘°, 2021-01-27 05:45
              
            
            
                                                                       
We can first replace 'total' with its per-group sum using mutate, then keep only the first row of each 'EVTYPE' group with slice, so that all the other columns take their first value.

library(dplyr)
df1 %>% 
   group_by(EVTYPE) %>% 
   mutate(total = sum(total)) %>%
   slice(1L) %>%
   arrange(desc(total))
#      FATALITIES INJURIES PROPVALDMG CROPVALDMG    EVTYPE total
#       <int>    <int>      <dbl>      <int>     <chr> <int>
#1          0       15     250000          0   TORNADO    21
#2          0        0          0          0      HAIL    12
#3          0        0          0          0 TSTM WIND     1


NOTE: Based on the example data, the 'total' for 'EVTYPE' "HAIL" is 12.
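As an alternative sketch, each group can be collapsed directly with summarise instead of mutate followed by slice. This assumes dplyr 1.0.0 or later (for across), and the data frame below is made-up toy data for illustration only, not the asker's real data set.

```r
library(dplyr)

# Hypothetical toy data, chosen so the group totals match the output above
df1 <- data.frame(
  FATALITIES = c(0L, 0L, 0L, 0L, 0L),
  INJURIES   = c(15L, 0L, 2L, 0L, 0L),
  PROPVALDMG = c(250000, 0, 2500, 0, 0),
  CROPVALDMG = c(0L, 0L, 0L, 0L, 0L),
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "HAIL", "TSTM WIND"),
  total      = c(15L, 7L, 6L, 5L, 1L),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(EVTYPE) %>%
  # first value of every non-grouping column except 'total',
  # plus the per-group sum of 'total'
  summarise(across(-total, first), total = sum(total), .groups = "drop") %>%
  arrange(desc(total))

print(res)
```

Unlike the mutate and slice version, this returns an ungrouped result and puts the grouping column EVTYPE first; add select(names(df1)) if the original column order matters.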
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
礼貌的吻别, 2021-01-27 06:02
              
            
            
                                                                       
Here is a base R solution that returns the same values (in a slightly different order):

merge(df[!duplicated(df$EVTYPE), -length(df)],
      aggregate(total ~ EVTYPE, data=df, sum), by="EVTYPE")
     EVTYPE FATALITIES INJURIES PROPVALDMG CROPVALDMG total
1      HAIL          0        0          0          0    12
2   TORNADO          0       15     250000          0    21
3 TSTM_WIND          0        0          0          0     1


duplicated selects the first observation of each EVTYPE level, and aggregate calculates the sum of total within each level. The two results are then merged on EVTYPE.

The rows are sorted by EVTYPE because merge sorts its result on the by columns by default (sort = TRUE), which here means alphabetically. The columns are in a slightly different order from the desired output because merge puts the by variables at the front of the resulting data set. To restore the original column order, index the result with the names of the original data.frame.

merge(df[!duplicated(df$EVTYPE), -length(df)],
      aggregate(total ~ EVTYPE, data=df, sum), by="EVTYPE")[, names(df)]
  FATALITIES INJURIES PROPVALDMG CROPVALDMG    EVTYPE total
1          0        0          0          0      HAIL    12
2          0       15     250000          0   TORNADO    21
3          0        0          0          0 TSTM_WIND     1
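If the rows should also come out sorted by descending total, as in the dplyr answer, one more order step is enough. Below is a minimal self-contained sketch; the data frame is made-up toy data (not the asker's real data set), and the intermediate names first_rows and totals are only for illustration.

```r
# Hypothetical toy data, chosen so the group totals match the output above
df <- data.frame(
  FATALITIES = c(0L, 0L, 0L, 0L, 0L),
  INJURIES   = c(15L, 0L, 2L, 0L, 0L),
  PROPVALDMG = c(250000, 0, 2500, 0, 0),
  CROPVALDMG = c(0L, 0L, 0L, 0L, 0L),
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "HAIL", "TSTM_WIND"),
  total      = c(15L, 7L, 6L, 5L, 1L)
)

# First row of each EVTYPE, without 'total'
# (dropped by name here; -length(df) works when 'total' is the last column)
first_rows <- df[!duplicated(df$EVTYPE), names(df) != "total"]

# Sum of 'total' within each EVTYPE
totals <- aggregate(total ~ EVTYPE, data = df, FUN = sum)

# Merge, restore the original column order, then sort by descending total
res <- merge(first_rows, totals, by = "EVTYPE")[, names(df)]
res <- res[order(res$total, decreasing = TRUE), ]
rownames(res) <- NULL

print(res)
```

Resetting the row names is optional; it only keeps the printed row labels consecutive after reordering.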

                                                                        
                                                        
            
            
              
                