Replace values in data frame based on other data frame in R

说谎 2020-12-06 05:21

In the below example, userids is my reference data frame and userdata is the data frame where the replacements should take place.
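A minimal sketch of two data frames with this shape, assuming the column names the answers rely on (USER and ID in userids; INFO, ID, AGE and FRIENDID in userdata). The user names and their pairing with rows are hypothetical, chosen only for illustration:

```r
# Reference table: one row per user name (names are made up for illustration).
userids <- data.frame(
  USER = c("Ann", "Jim", "Lee", "Bob"),
  ID   = 1:4,
  stringsAsFactors = FALSE
)

# Data whose ID and FRIENDID columns hold user names
# that should be replaced by the numeric IDs from userids.
userdata <- data.frame(
  INFO     = c("foo", "bar", "foo", "bar"),
  ID       = c("Bob", "Jim", "Ann", "Lee"),
  AGE      = c(43, 33, 53, 26),
  FRIENDID = c("Ann", NA, "Lee", "Jim"),
  stringsAsFactors = FALSE
)

str(userdata)
```

Note the NA in FRIENDID: any solution has to decide what happens to rows with no match in the reference table.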

4 Answers
  • 2020-12-06 05:42

    Here is an attempt using sqldf: join userdata to userids twice, once for each column to be replaced.

      library(sqldf)
      # i1 resolves the ID column, i2 resolves the FRIENDID column
      sqldf('SELECT d.INFO, d.AGE, i1.ID, i2.ID FRIENDID
             FROM userdata d
             INNER JOIN userids i1 ON (i1.USER = d.ID)
             INNER JOIN userids i2 ON (i2.USER = d.FRIENDID)')

      INFO AGE ID FRIENDID
    1  foo  43  4        1
    2  foo  53  1        3
    3  bar  26  3        2

    But this removes the rows containing NA, because an INNER JOIN drops every row that has no match. Maybe someone can suggest how to deal with NA!

    EDIT

    Thanks to G. Grothendieck's comment: replacing INNER with LEFT gives the desired result.

      sqldf('SELECT d.INFO, d.AGE, i1.ID, i2.ID FRIENDID
             FROM userdata d
             LEFT JOIN userids i1 ON (i1.USER = d.ID)
             LEFT JOIN userids i2 ON (i2.USER = d.FRIENDID)')

      INFO AGE ID FRIENDID
    1  foo  43  4        1
    2  bar  33  2       NA
    3  foo  53  1        3
    4  bar  26  3        2
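The difference between the two join types can also be seen without sqldf. The following is a rough base R sketch using merge(), not the answer's own method; the data values are hypothetical and only the FRIENDID column is joined:

```r
# Hypothetical data; only the column names follow the thread.
userids <- data.frame(USER = c("Ann", "Jim", "Lee", "Bob"), ID = 1:4,
                      stringsAsFactors = FALSE)
userdata <- data.frame(INFO = c("foo", "bar", "foo", "bar"),
                       FRIENDID = c("Ann", NA, "Lee", "Jim"),
                       stringsAsFactors = FALSE)

# Inner join (merge's default): rows without a match in userids are dropped.
inner <- merge(userdata, userids, by.x = "FRIENDID", by.y = "USER")

# Left join: all.x = TRUE keeps every row of userdata and fills ID with NA.
left <- merge(userdata, userids, by.x = "FRIENDID", by.y = "USER", all.x = TRUE)

nrow(inner)  # 3
nrow(left)   # 4
```

Unlike the SQL query, merge() sorts the result by the key column, so the original row order is not preserved.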
    
  • 2020-12-06 05:44

    Use match:

    userdata$ID <- userids$ID[match(userdata$ID, userids$USER)]
    userdata$FRIENDID <- userids$ID[match(userdata$FRIENDID, userids$USER)]
    
  • 2020-12-06 05:50

    This is a possibility:

    library(qdap)
    userdata$FRIENDID <- lookup(userdata$FRIENDID, userids)
    userdata$ID <- lookup(userdata$ID, userids)
    

    or to win the one line prize:

    userdata[, c(2, 4)] <- lapply(userdata[, c(2, 4)], lookup, key.match=userids)
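If qdap is not available, the same column-wise idea can be sketched in base R. This is an assumption-laden equivalent, not qdap's implementation: it applies a match()-based lookup to both columns at once, selected by name rather than by position. The data values are hypothetical.

```r
# Hypothetical data; only the column names follow the thread.
userids <- data.frame(USER = c("Ann", "Jim", "Lee", "Bob"), ID = 1:4,
                      stringsAsFactors = FALSE)
userdata <- data.frame(INFO = c("foo", "bar", "foo", "bar"),
                       ID = c("Bob", "Jim", "Ann", "Lee"),
                       AGE = c(43, 33, 53, 26),
                       FRIENDID = c("Ann", NA, "Lee", "Jim"),
                       stringsAsFactors = FALSE)

# Replace every listed column by looking its values up in userids.
cols <- c("ID", "FRIENDID")
userdata[cols] <- lapply(userdata[cols], function(col) {
  userids$ID[match(col, userids$USER)]
})

userdata$ID        # 4 2 1 3
userdata$FRIENDID  # 1 NA 3 2
```

Selecting columns by name keeps the code working if the column order of userdata changes.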
    
  • 2020-12-06 05:50

    Here's a possible solution, which will also work on datasets with multiple records of each ID, though we will need to coerce the ID and FRIENDID variables to character first:

    # Coerce to character first so that factor columns are compared by label
    userdata$ID <- sapply(as.character(userdata$ID), function(x){gsub(x, userids[userids$USER==x, 2], x)})
    userdata$FRIENDID <- sapply(as.character(userdata$FRIENDID), function(x){gsub(x, userids[userids$USER==x, 2], x)})
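A related way to handle repeated IDs, sketched here as an alternative rather than as this answer's method, is a named vector used as a lookup table. The names and values below are hypothetical. Indexing by character works for repeated keys and returns NA for missing ones, and the result keeps the numeric type of userids$ID instead of becoming character.

```r
# Hypothetical reference table; only the column names follow the thread.
userids <- data.frame(USER = c("Ann", "Jim", "Lee", "Bob"), ID = 1:4,
                      stringsAsFactors = FALSE)

# A column with a repeated user and a missing value.
ids <- c("Bob", "Ann", "Bob", NA)

# Named vector: values are the IDs, names are the user names.
lookup_table <- setNames(userids$ID, userids$USER)

# Index by name; as.character() guards against factor input.
result <- unname(lookup_table[as.character(ids)])
result  # 4 1 4 NA
```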
    