converting output of R's “by” command to data frame

前端未结


I'm trying to use R's by command to get column means for subsets of a data frame. For example, consider this data frame:

> z = data.frame(


                      
3 Answers

长情又很酷 (2021-01-18 06:25)

Dealing with by output can be really annoying. I found a way to extract what you want as a data frame, with no extra packages needed.

So, if you do this:

aux <- by(z[,2:5],z$labels,colMeans)


You can then turn it into a data frame like this:

  aux_df <- as.data.frame(t(sapply(aux, identity)))


aux is a list holding one vector of column means per label. sapply binds those vectors into a matrix with one column per label, which I then transpose and pass to as.data.frame.

I hope that helps.
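
To see why a conversion step is needed at all, it can help to inspect what by returns. The following is a minimal sketch, not part of the original answer. The data frame z is a stand-in (an assumption, since the question's definition is cut off), chosen so that its group means match the tables shown in the answers.

```r
# Stand-in data (an assumption): chosen to reproduce the group means in the answers.
z <- data.frame(
  labels = c("a", "a", "b", "c", "c"),
  data = matrix(1:20, nrow = 5)   # becomes columns data.1 ... data.4
)

aux <- by(z[, 2:5], z$labels, colMeans)

class(aux)     # "by"
is.list(aux)   # TRUE: one element per group
dim(aux)       # 3: a one-dimensional array, not a rows-by-columns table
names(aux)     # "a" "b" "c"
aux[[1]]       # named numeric vector of column means for the first group
```

Because each element is a plain named vector, any approach that binds those vectors together row by row yields a matrix that as.data.frame can take.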
                                                                        
                                                        
            
            
              
                

感情败类 (2021-01-18 06:29)

You can use ddply from the plyr package:

library(plyr)
ddply(z, .(labels), numcolwise(mean))
  labels data.1 data.2 data.3 data.4
1      a    1.5    6.5   11.5   16.5
2      b    3.0    8.0   13.0   18.0
3      c    4.5    9.5   14.5   19.5


Or aggregate from stats:

aggregate(z[,-1], by=list(z$labels), mean)
  Group.1 data.1 data.2 data.3 data.4
1       a    1.5    6.5   11.5   16.5
2       b    3.0    8.0   13.0   18.0
3       c    4.5    9.5   14.5   19.5


Or dcast from the reshape2 package:

library(reshape2)
dcast(melt(z, id.vars = "labels"), labels ~ variable, mean)


Using sapply:

t(sapply(split(z[,-1], z$labels), colMeans))
  data.1 data.2 data.3 data.4
a    1.5    6.5   11.5   16.5
b    3.0    8.0   13.0   18.0
c    4.5    9.5   14.5   19.5
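
One practical difference between the base R variants above is the type of the result. The sketch below is only an illustration and uses a stand-in z (an assumption, picked to reproduce the means shown). It checks that aggregate and the sapply route produce the same numbers while returning different structures. The name labels given to the grouping list is also my choice, to avoid the default Group.1 column name.

```r
# Stand-in data (an assumption): chosen to reproduce the group means shown above.
z <- data.frame(
  labels = c("a", "a", "b", "c", "c"),
  data = matrix(1:20, nrow = 5)
)

by_aggregate <- aggregate(z[, -1], by = list(labels = z$labels), FUN = mean)
by_sapply    <- t(sapply(split(z[, -1], z$labels), colMeans))

# aggregate() returns a data frame with the group as a regular column;
# the sapply() route returns a matrix with the group in the row names.
is.data.frame(by_aggregate)  # TRUE
is.matrix(by_sapply)         # TRUE

# The numbers agree once names are stripped from both sides.
same_values <- all.equal(unname(as.matrix(by_aggregate[, -1])), unname(by_sapply))
same_values                  # TRUE
```

If a data frame is the goal, aggregate gets there directly, whereas the sapply result still needs as.data.frame.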

                                                                        
                                                        
            
            
              
                

予麋鹿 (2021-01-18 06:39)

The output of by is a list, so you can use do.call to rbind its elements and then convert the result to a data frame:

as.data.frame(do.call("rbind",by(z[,2:5],z$labels,colMeans)))
  data.1 data.2 data.3 data.4
a    1.5    6.5   11.5   16.5
b    3.0    8.0   13.0   18.0
c    4.5    9.5   14.5   19.5
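
The result above keeps the group labels only as row names. If they are needed as an ordinary column, one possible follow-up is sketched here. It is not part of the original answer, and z is a stand-in data frame (an assumption) built to match the means shown.

```r
# Stand-in data (an assumption): chosen to reproduce the group means shown above.
z <- data.frame(
  labels = c("a", "a", "b", "c", "c"),
  data = matrix(1:20, nrow = 5)
)

# Bind the per-group vectors into a matrix, then convert to a data frame.
means  <- do.call(rbind, by(z[, 2:5], z$labels, colMeans))
result <- as.data.frame(means)

# Move the row names into a real column and reset the row names.
result <- cbind(labels = rownames(result), result)
rownames(result) <- NULL

result
```

After this, result has a labels column followed by data.1 through data.4, which is the same shape that ddply and aggregate return.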

                                                                        
                                                        
            
            
              
                