How to find common rows between two data frames in R?

前端未结

I would like to make a new data frame that includes only the rows common to two separate data frames. Example:

data.frame 1

1 id300
2 id2345
3 id5456
4 i


                      
5 answers

南旧
2020-12-30 02:37

To achieve this, intersect the ID columns of the two data frames with base R's intersect. It returns the values common to both columns as a vector (not as a data frame). Here V1 and V2 stand for the ID column of each data frame:

intersect(dataframe.1$V1, dataframe.2$V2)
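
Because intersect returns values rather than rows, one more step is needed to get a data frame back. The following is a minimal sketch, assuming both data frames keep their IDs in a column named V1. The sample values are borrowed from the data posted in another answer, and the names df1, df2, and common_rows are illustrative only.

```r
# Sample data (values taken from another answer; the column name V1 is an assumption).
df1 <- data.frame(V1 = c("id300", "id2345", "id5456", "id33", "id45", "id54"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(V1 = c("id832", "id300", "id1000", "id45", "id984",
                         "id5456", "id888"),
                  stringsAsFactors = FALSE)

# intersect() on two vectors returns the unique values present in both,
# in the order they appear in the first vector.
common <- intersect(df1$V1, df2$V1)
print(common)  # "id300" "id5456" "id45"

# Keep only the rows of df1 whose ID is one of the common values.
# drop = FALSE keeps the result a data frame even with a single column.
common_rows <- df1[df1$V1 %in% common, , drop = FALSE]
print(common_rows)
```

Filtering with %in% compares column values directly, so it does not depend on how the row names are set.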

                                                                        
                                                        
            
            
              
                
一生所求
2020-12-30 02:48

The appropriate dplyr function here is inner_join, which returns all rows from x that have a match in y:

library(dplyr)
inner_join(df1, df2)

      V1
1  id300
2 id5456
3   id45


Note: the rows are returned in the order in which they are in df1. If you did inner_join(df2, df1), id45 would come before id5456.
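
To make the ordering note concrete, here is a small sketch, assuming dplyr is installed and both data frames share a single column named V1 (sample values borrowed from the data posted in another answer). Passing by = "V1" explicitly is an addition made here; it avoids the message dplyr prints when it has to work out the join column itself.

```r
library(dplyr)

# Sample data; the column name V1 is an assumption.
df1 <- data.frame(V1 = c("id300", "id2345", "id5456", "id33", "id45", "id54"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(V1 = c("id832", "id300", "id1000", "id45", "id984",
                         "id5456", "id888"),
                  stringsAsFactors = FALSE)

# The result follows the row order of the first argument.
in_df1_order <- inner_join(df1, df2, by = "V1")
in_df2_order <- inner_join(df2, df1, by = "V1")

print(in_df1_order$V1)  # "id300" "id5456" "id45"
print(in_df2_order$V1)  # "id300" "id45" "id5456"
```

If the second data frame has extra columns you do not want, or keys that match more than once, semi_join(df1, df2, by = "V1") may be a closer fit: it filters df1 without adding columns or duplicating rows.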
                                                                        
                                                        
            
            
              
                
情歌与酒
2020-12-30 02:52

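common <- intersect(data.frame1$col, data.frame2$col)
# Subset by value; indexing with [common, ] would look up row names, not the values in col
data.frame1[data.frame1$col %in% common, ]  # common rows in data frame 1
data.frame2[data.frame2$col %in% common, ]  # common rows in data frame 2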
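
Indexing a data frame with a character vector, as in data.frame1[common, ], selects rows by row name rather than by the values in col. With default row names ("1", "2", ...) that lookup finds nothing and returns rows of NA. Below is a minimal sketch of the row-name variant, assuming the IDs are unique and using hypothetical two-column data (the value and score columns are made up for illustration).

```r
# Hypothetical data; "col" holds unique IDs in both frames.
data.frame1 <- data.frame(col = c("id300", "id2345", "id5456", "id33", "id45", "id54"),
                          value = 6:1, stringsAsFactors = FALSE)
data.frame2 <- data.frame(col = c("asd", "id33", "id45", "id54"),
                          score = c(12, 10, 8, 6), stringsAsFactors = FALSE)

# Use the IDs as row names so that character indexing can find them.
# Row names must be unique, so this only works when the IDs are.
rownames(data.frame1) <- data.frame1$col
rownames(data.frame2) <- data.frame2$col

common <- intersect(data.frame1$col, data.frame2$col)

rows1 <- data.frame1[common, ]  # common rows in data frame 1
rows2 <- data.frame2[common, ]  # common rows in data frame 2
print(rows1)
print(rows2)
```

Both results come back in the order of common, which makes the two subsets line up row for row.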

                                                                        
                                                        
            
            
              
                
时光说笑
2020-12-30 02:56

Use merge:

new_data_frame <- merge(data.frame1, data.frame2)


I'm assuming you have only one column in each data frame and that it has the same name in both. If not, name the columns to match on with by.x = "nameCol1" and by.y = "nameCol2", where nameCol1 and nameCol2 are the actual column names in the first and second data frame.


Added after first comment

If either data frame has more columns, the command is the same:

> a  # Data frame 1
      c1 c2
1  id300  6
2 id2345  5
3 id5456  4
4   id33  3
5   id45  2
6   id54  1

> b  # Data frame 2
     a  f
1  asd 12
2 id33 10
3 id45  8
4 id54  6


As you can see, they share no column names and have two columns each. So:

> merge(a,b, by.x = "c1", by.y = "a")

    c1 c2  f
1 id33  3 10
2 id45  2  8
3 id54  1  6


Only the rows whose values in the selected columns appear in both data frames are kept.
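
The example above can be reproduced with the following sketch. The data frames a and b are rebuilt from the printed output, so the column types (character IDs, integer c2, numeric f) are assumptions.

```r
# Rebuild the two data frames shown above (column types are assumed).
a <- data.frame(c1 = c("id300", "id2345", "id5456", "id33", "id45", "id54"),
                c2 = 6:1, stringsAsFactors = FALSE)
b <- data.frame(a = c("asd", "id33", "id45", "id54"),
                f = c(12, 10, 8, 6), stringsAsFactors = FALSE)

# Match column c1 of a against column a of b; the key keeps the name from by.x.
matched <- merge(a, b, by.x = "c1", by.y = "a")
print(matched)

# With no shared column names and no by arguments, merge() falls back to
# the Cartesian product: every row of a paired with every row of b.
crossed <- merge(a, b)
print(nrow(crossed))  # 24
```

The second call shows why by.x and by.y matter here: without them, merge has no key to match on and returns 6 x 4 = 24 rows instead of the 3 common ones.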
                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉
2020-12-30 02:59

We can also do this with fintersect from data.table, after converting each data.frame to a data.table with setDT (which converts in place, by reference):

library(data.table)
fintersect(setDT(df1), setDT(df2))
#       v1
#1:  id300
#2:   id45
#3: id5456


Data:

df1 <- structure(list(v1 = c("id300", "id2345", "id5456", "id33", "id45", 
"id54")), .Names = "v1", class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

df2 <- structure(list(v1 = c("id832", "id300", "id1000", "id45", "id984", 
"id5456", "id888")), .Names = "v1", class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))

                                                                        
                                                        
            
            
              
                