Simple lookup to insert values in an R data frame

This is a seemingly simple R question, but I don't see an exact answer here. I have a data frame (alldata) that looks like this:

Case     zip     market
1


                      
4 answers

挽巷, 2021-02-07 16:47

Here's the dplyr way of doing it:

library(tidyverse)
alldata %>%
  select(-market) %>%
  left_join(zipcodes, by="zip")


On my machine, this performs roughly the same as the lookup function from qdapTools.
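
A minimal sketch of the join above on toy data, to show what happens to a zip that has no entry in the lookup table. The zip codes and market names here are made up, and the column layout of zipcodes (market, zip) is an assumption. Only dplyr is loaded, since that is all the join needs.

```r
library(dplyr)

# Toy data (hypothetical values, not from the question)
alldata <- data.frame(
  Case = 1:3,
  zip = c("44488", "99999", "44485"),
  market = NA_character_,
  stringsAsFactors = FALSE
)
zipcodes <- data.frame(
  market = c("NE", "NE"),
  zip = c("44485", "44488"),
  stringsAsFactors = FALSE
)

# A zip that appears twice in the lookup table would duplicate rows in the
# joined result, so check for that first.
stopifnot(anyDuplicated(zipcodes$zip) == 0)

result <- alldata %>%
  select(-market) %>%              # drop the empty market column
  left_join(zipcodes, by = "zip")  # keep every row of alldata, in order

print(result)
```

The unmatched zip ("99999") stays in the result with market set to NA, and the rows keep the order they had in alldata.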
                                                                        
                                                        
            
            
              
                
独厮守ぢ, 2021-02-07 16:57

Since you don't care about the market column in alldata, you can first drop it by selecting only the Case and zip columns, then join alldata and zipcodes on the zip column using merge:

merge(alldata[, c("Case", "zip")], zipcodes, by="zip")


The by parameter specifies the key column(s). For a compound key, pass a vector of column names, such as by=c("zip", "otherfield").
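
One thing worth knowing about merge: by default it performs an inner join and sorts the result by the key, so rows of alldata whose zip is missing from zipcodes are dropped. The sketch below uses made-up toy data (the zip codes, market names, and the column layout of zipcodes are assumptions) to compare the default with all.x = TRUE, which keeps every row of alldata.

```r
# Toy data (hypothetical values, not from the question)
alldata <- data.frame(
  Case = 1:3,
  zip = c("44488", "99999", "44485"),
  market = NA,
  stringsAsFactors = FALSE
)
zipcodes <- data.frame(
  market = c("NE", "NE"),
  zip = c("44485", "44488"),
  stringsAsFactors = FALSE
)

# Default merge: inner join, the row with zip "99999" is dropped
inner <- merge(alldata[, c("Case", "zip")], zipcodes, by = "zip")

# all.x = TRUE: left join, unmatched rows are kept with market = NA
left <- merge(alldata[, c("Case", "zip")], zipcodes, by = "zip", all.x = TRUE)

# merge sorts by the key, so restore the original order if it matters
left <- left[order(left$Case), ]

print(inner)
print(left)
```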
                                                                        
                                                        
            
            
              
                
迷失自我, 2021-02-07 16:57

With such a large data set, you may want the speed of an environment lookup. You can use the lookup function from the qdapTools package as follows:

library(qdapTools)
alldata$market <- lookup(alldata$zip, zipcodes[, 2:1])


Or, equivalently, with the %l% operator:

alldata$market <- alldata$zip %l% zipcodes[, 2:1]
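
To illustrate what an "environment lookup" means, here is a rough base R sketch of the idea: store the zip-to-market pairs in a hashed environment, then fetch values by key. This is not a description of how qdapTools is implemented internally, only a sketch of the general technique. The toy data, the name lookup_env, and the column layout of zipcodes (market, zip) are assumptions.

```r
# Toy data (hypothetical values, not from the question)
zipcodes <- data.frame(
  market = c("NE", "NE", "CA"),
  zip = c("44485", "44488", "43210"),
  stringsAsFactors = FALSE
)
alldata <- data.frame(
  Case = 1:4,
  zip = c("44485", "43210", "44485", "99999"),
  stringsAsFactors = FALSE
)

# Build a hashed environment with one variable per zip, holding its market
lookup_env <- list2env(
  as.list(setNames(zipcodes$market, zipcodes$zip)),
  envir = new.env(hash = TRUE)
)

# Fetch the market for every zip; zips missing from the table become NA
alldata$market <- unlist(
  mget(alldata$zip, envir = lookup_env, ifnotfound = NA_character_),
  use.names = FALSE
)

print(alldata)
```

The zip column must be character for this to work, since environment keys are names.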

                                                                        
                                                        
            
            
              
                
醉梦人生, 2021-02-07 17:03

Another option that worked for me and is very simple:

alldata$market <- with(zipcodes, market[match(alldata$zip, zip)])
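
The match approach can be unpacked as follows, using made-up toy data (the zip codes, market names, and the column layout of zipcodes are assumptions). match returns, for each zip in alldata, the position of its first occurrence in zipcodes$zip, and that position is then used to index the market column.

```r
# Toy data (hypothetical values, not from the question)
alldata <- data.frame(
  Case = 1:4,
  zip = c("44485", "44488", "43210", "99999"),
  stringsAsFactors = FALSE
)
zipcodes <- data.frame(
  market = c("NE", "NE", "CA"),
  zip = c("44485", "44488", "43210"),
  stringsAsFactors = FALSE
)

# Row index in zipcodes for each zip in alldata (NA when there is no match)
idx <- match(alldata$zip, zipcodes$zip)

# Index the market column by those positions; NA indices yield NA markets
alldata$market <- zipcodes$market[idx]

print(alldata)
```

Unlike merge, this keeps alldata in its original row order and never adds or drops rows.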

                                                                        
                                                        
            
            
              
                