Group values with identical ID into columns without summarizing them in R


I have a data frame that looks like this, but with many more proteins:

Protein      z
  Irak4  -2.46
  Irak4  -0.13
    Itk  -0.49
    Itk   4.22
    Itk  -0.51


                      
2 answers

Answer by 后悔当初 (2021-01-21 20:43):

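library(data.table)

# df is the sample data frame (it is defined as DF under "data" further down).
# rowid(Protein) numbers the rows within each Protein (1, 2, ...); using it as
# the row key means no values have to be aggregated.
dcast(setDT(df), rowid(Protein) ~ Protein, value.var = 'z')

#    Protein Irak4   Itk  Ras
# 1:       1 -2.46 -0.49 1.53
# 2:       2 -0.13  4.22   NA
# 3:       3    NA -0.51   NA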


In base R you can do:

a <- unstack(df, z ~ Protein)                        # z values, one element per Protein
data.frame(sapply(a, `length<-`, max(lengths(a))))   # pad each element with NA to the longest
#   Irak4   Itk  Ras
# 1 -2.46 -0.49 1.53
# 2 -0.13  4.22   NA
# 3    NA -0.51   NA


Or using reshape:

reshape(transform(df, gr = ave(z, Protein, FUN = seq_along)),
        v.names = 'z', timevar = 'Protein', idvar = 'gr', direction = 'wide')
#   gr z.Irak4 z.Itk z.Ras
# 1  1   -2.46 -0.49  1.53
# 2  2   -0.13  4.22    NA
# 5  3      NA -0.51    NA

                                                                        
                                                        
            
            
              
                
Answer by 我寻月下人不归 (2021-01-21 21:06):

Here is an option with the tidyverse:

library(tidyverse)
DF %>% 
  group_by(Protein) %>% 
  mutate(idx = row_number()) %>% 
  spread(Protein, z) %>% 
  select(-idx)
# A tibble: 3 x 3
#   Irak4   Itk   Ras
#   <dbl> <dbl> <dbl>
#1  -2.46 -0.49  1.53
#2  -0.13  4.22 NA   
#3  NA    -0.51 NA 


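Before we spread the data, we need to create unique identifiers: idx numbers the rows within each Protein, and without it spread() fails because several rows share the same Protein key.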
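
As a hedged aside: in current tidyr releases spread() is marked as superseded in favour of pivot_wider(). A minimal sketch of the same idea with pivot_wider() could look like this, assuming dplyr and tidyr are installed and using the sample data from this answer (the name `wide` is only illustrative):

```r
library(dplyr)
library(tidyr)

# Sample data, same values as in the "data" section of this answer.
DF <- data.frame(
  Protein = c("Irak4", "Irak4", "Itk", "Itk", "Itk", "Ras"),
  z = c(-2.46, -0.13, -0.49, 4.22, -0.51, 1.53)
)

wide <- DF %>%
  group_by(Protein) %>%
  mutate(idx = row_number()) %>%   # 1, 2, ... within each Protein
  ungroup() %>%
  pivot_wider(names_from = Protein, values_from = z) %>%  # idx identifies each output row
  select(-idx)

print(wide)
```

The within-group index plays the same role as before: it is the only column left to identify an output row, so each protein's first value lands in row 1, its second in row 2, and so on, with NA where a protein has fewer values.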



In base R you could use unstack first, which gives you a named list of vectors containing the values of the z column for each Protein.

Use lapply to iterate over that list and pad each vector with NAs using the `length<-` function, so that all vectors in the list have the same length. Then we can call data.frame.

lst <- unstack(DF, z ~ Protein)
data.frame(lapply(lst, `length<-`, max(lengths(lst))))
#  Irak4   Itk  Ras
#1 -2.46 -0.49 1.53
#2 -0.13  4.22   NA
#3    NA -0.51   NA
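
To make the padding step more concrete, here is a small sketch of what `length<-` does to a single vector, using base R only. The helper name pad_to is hypothetical and introduced only for illustration; it is not part of base R.

```r
# Values for one protein, taken from the example data.
x <- c(-2.46, -0.13)

# Calling the replacement function directly returns a padded copy;
# it has the same effect as running `length(x) <- 3` on a copy of x.
padded <- `length<-`(x, 3)
print(padded)

# pad_to() is a hypothetical helper that spells out the same operation.
pad_to <- function(v, n) {
  length(v) <- n  # extending a vector fills the new slots with NA
  v
}

# A list shaped like the unstack() result for the sample data.
lst <- list(Irak4 = c(-2.46, -0.13), Itk = c(-0.49, 4.22, -0.51), Ras = 1.53)
res <- data.frame(lapply(lst, pad_to, n = max(lengths(lst))))
print(res)
```

One caveat worth checking against ?unstack: when every group has the same number of values, unstack() returns a data frame rather than a list. lapply and lengths still work on it column by column, so the padding step should simply leave the columns unchanged in that case.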


data

DF <- structure(list(Protein = c("Irak4", "Irak4", "Itk", "Itk", "Itk", 
"Ras"), z = c(-2.46, -0.13, -0.49, 4.22, -0.51, 1.53)), .Names = c("Protein", 
"z"), class = "data.frame", row.names = c(NA, -6L))

                                                                        
                                                        
            
            
              
                