Converting long to wide format

id <- c(1:8, 1:8)
age1 <- c(7.5, 6.7, 8.6, 9.5, 8.7, 6.3, 9, 5)
age2 <- age1 + round(runif(1, 1, 3), 1)
age <- c(age1, age2)

tanner <- sample(1:2, 16, replace = TRUE)

d


                      

3 Answers

走了就别回头了
2021-01-24 02:43

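We can use dcast to convert from 'long' to 'wide', with min as the fun.aggregate. Here I converted the 'data.frame' to a 'data.table' with setDT(df), because the dcast method from data.table is fast. Where an id has no rows for a given tanner value, min is applied to an empty vector and returns Inf.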

library(data.table)
res <- dcast(setDT(df), id ~ paste('age', tanner, sep = '.'), value.var = 'age', fun.aggregate = min)
res
#   id age.1 age.2
#1:  1  10.0   7.5
#2:  2   6.7   Inf
#3:  3  11.1   8.6
#4:  4   Inf   9.5
#5:  5   8.7  11.2
#6:  6   6.3   8.8
#7:  7   9.0   Inf
#8:  8   5.0   Inf


If we want to change the 'Inf' values to 'NA':

res[, (2:3) := lapply(.SD, function(x) replace(x, is.infinite(x), NA)),
    .SDcols = 2:3]
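
As a small base R sketch of why 'Inf' shows up at all and what the replacement step does: min over an empty vector returns Inf, and replace with is.infinite turns such values into NA. The vector x below is a hypothetical stand-in for one column of 'res', not data from the question.

```r
# min() of an empty numeric vector is Inf (R also emits a warning,
# suppressed here so the example runs quietly).
empty_min <- suppressWarnings(min(numeric(0)))
is.infinite(empty_min)

# Toy vector (hypothetical values) standing in for one column of 'res'.
x <- c(10.0, Inf, 8.6)

# Same replacement the answer applies to each column via lapply(.SD, ...).
cleaned <- replace(x, is.infinite(x), NA)
cleaned
```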

                                                                        
                                                        
            
            
              
                
野趣味
2021-01-24 02:44

A little dplyr and tidyr does the trick here. Arrange by age so that the lowest age appears first, filter out the duplicated id/tanner pairs, then use tidyr::spread.

df<-
data.frame(
  id = c(1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8)
  ,age = c(7.5, 6.7, 8.6, 9.5, 8.7, 6.3, 9.0, 5.0,10.0, 9.2,11.1,12.0,11.2, 8.8,11.5, 7.5)
  ,tanner = c(2,1,2,2,1,1,1,1,1,1,1,2,2,2,1,1)
)

library(dplyr)
library(tidyr)

wide <- 
df %>%
  arrange(age) %>%
  filter(!duplicated(paste(id, tanner))) %>%
  spread(tanner, age)

colnames(wide) <- c('id', 'tanner1', 'tanner2')
wide

#   id tanner1 tanner2
#    1    10.0     7.5
#    2     6.7      NA
#    3    11.1     8.6
#    4      NA     9.5
#    5     8.7    11.2
#    6     6.3     8.8
#    7     9.0      NA
#    8     5.0      NA
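
For comparison, here is a sketch of the same reshaping with tidyr::pivot_wider, which newer tidyr versions recommend over spread. This is an alternative to the answer's code, not the answer's own method. It assumes tidyr 1.1.0 or later (for names_sort and for passing a bare function to values_fn), and the name wide2 is only illustrative.

```r
library(tidyr)

# Same data as in the answer above.
df <- data.frame(
  id     = c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8),
  age    = c(7.5, 6.7, 8.6, 9.5, 8.7, 6.3, 9.0, 5.0,
             10.0, 9.2, 11.1, 12.0, 11.2, 8.8, 11.5, 7.5),
  tanner = c(2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1)
)

# values_fn = min collapses duplicated id/tanner pairs to the lowest age;
# id/tanner combinations with no rows are filled with NA by default.
wide2 <- pivot_wider(
  df,
  names_from   = tanner,
  values_from  = age,
  names_prefix = "tanner",
  names_sort   = TRUE,   # put tanner1 before tanner2
  values_fn    = min
)
wide2
```

Because the aggregation happens inside pivot_wider, the arrange and filter steps are not needed in this variant.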

                                                                        
                                                        
            
            
              
                
生来不讨喜
2021-01-24 02:58

Aggregate, then reshape (using a copied-and-pasted version of your df rather than your code, since the two don't match):

reshape(
  aggregate(age ~ ., data=df, FUN=min),
  idvar="id", timevar="tanner", direction="wide"
)

#   id age.1 age.2
#1   1  10.0   7.5
#2   2   6.7    NA
#3   3  11.1   8.6
#4   5   8.7  11.2
#5   6   6.3   8.8
#6   7   9.0    NA
#7   8   5.0    NA
#10  4    NA   9.5
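
To make the two steps easier to follow, the sketch below runs them separately on the df values used in the dplyr answer (assumed here to be the same copied-and-pasted data). The names mins and wide3 are illustrative, and the final sort is an optional addition that puts id 4 back in order and resets the row names.

```r
# Data as given in the dplyr answer; assumed to match the pasted df.
df <- data.frame(
  id     = c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8),
  age    = c(7.5, 6.7, 8.6, 9.5, 8.7, 6.3, 9.0, 5.0,
             10.0, 9.2, 11.1, 12.0, 11.2, 8.8, 11.5, 7.5),
  tanner = c(2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1)
)

# Step 1: one row per id/tanner pair, keeping the lowest age.
mins <- aggregate(age ~ id + tanner, data = df, FUN = min)

# Step 2: spread the tanner values into age.1 / age.2 columns.
wide3 <- reshape(mins, idvar = "id", timevar = "tanner", direction = "wide")

# Optional tidy-up: order by id and drop the leftover row names.
wide3 <- wide3[order(wide3$id), ]
rownames(wide3) <- NULL
wide3
```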

                                                                        
                                                        
            
            
              
                