R dplyr join on range of dates

抹茶落季 2021-01-15 15:25

I want to join two tables, xxx and yyy, on a composite unique key and a date range. In SQL I would simply specify this in the join condition, but I cannot get

2 answers
  • 2021-01-15 15:41

    First of all, thank you for trying to help me. I realize my question is incomplete. I moved away from fuzzyjoin because of all the Bioconductor dependencies.

    I used sqldf instead to accomplish the task:

    library(sqldf)
    sqldf("SELECT * FROM xxx
                LEFT JOIN yyy
                ON  xxx.ID  = yyy.ID
                AND xxx.NRA = yyy.NRA
                AND yyy.date BETWEEN xxx.date_low AND xxx.date_high")
    

    The result is almost identical to the one in this question, and I suspect it can also be solved as in that question, following Uwe's data.table solution.
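
    For reference, a data.table non-equi join along those lines might look roughly like the sketch below. This is not Uwe's actual code; the sample rows and the `value` column are made up for illustration, and only the column names ID, NRA, date, date_low and date_high come from the question.

    ```r
    library(data.table)

    # Hypothetical sample data; only the column names follow the question.
    xxx <- data.table(
      ID        = c(1L, 1L, 2L),
      NRA       = c("a", "a", "b"),
      date_low  = as.Date(c("2020-01-01", "2020-02-01", "2020-01-01")),
      date_high = as.Date(c("2020-01-31", "2020-02-29", "2020-01-31"))
    )
    yyy <- data.table(
      ID    = c(1L, 2L),
      NRA   = c("a", "b"),
      date  = as.Date(c("2020-01-15", "2020-03-01")),
      value = c(10, 20)  # assumed payload column
    )

    # Non-equi join: every row of xxx is kept (like a LEFT JOIN) and matched to
    # the rows of yyy with equal ID/NRA whose date falls in [date_low, date_high].
    # x.date pulls the original yyy$date; otherwise the join column would show
    # the bounds taken from xxx.
    res <- yyy[xxx,
               on = .(ID, NRA, date >= date_low, date <= date_high),
               .(ID, NRA,
                 date_low  = i.date_low,
                 date_high = i.date_high,
                 date      = x.date,
                 value)]
    print(res)
    ```

    Rows of xxx without a match in yyy come back with NA in date and value, which mirrors the LEFT JOIN in the sqldf call.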

    I am also linking this RStudio response.

  • 2021-01-15 15:44

    We could use fuzzy_inner_join from the fuzzyjoin package:

    library(fuzzyjoin)
    # Equality on the two key columns; yyy$date must lie within [date_low, date_high]
    fuzzy_inner_join(xxx, yyy,
                     by = c("ID" = "ID",
                            "NRA" = "NRA",
                            "date_low" = "date",
                            "date_high" = "date"),
                     match_fun = list(`==`, `==`, `<=`, `>=`))
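
    If a plain dplyr solution is preferred, newer releases (dplyr 1.1.0 and later, as far as I know) accept inequality conditions through join_by(). The following is only a sketch under that assumption; the sample rows and the `value` column are hypothetical, and only the column names come from the question.

    ```r
    library(dplyr)  # assumes dplyr >= 1.1.0, which provides join_by()

    # Hypothetical sample data; only the column names follow the question.
    xxx <- tibble(
      ID        = c(1L, 1L, 2L),
      NRA       = c("a", "a", "b"),
      date_low  = as.Date(c("2020-01-01", "2020-02-01", "2020-01-01")),
      date_high = as.Date(c("2020-01-31", "2020-02-29", "2020-01-31"))
    )
    yyy <- tibble(
      ID    = c(1L, 2L),
      NRA   = c("a", "b"),
      date  = as.Date(c("2020-01-15", "2020-03-01")),
      value = c(10, 20)  # assumed payload column
    )

    # Equality on ID and NRA, plus date_low <= date <= date_high.
    # In each condition the left-hand name refers to xxx, the right-hand to yyy.
    res <- left_join(
      xxx, yyy,
      by = join_by(ID, NRA, date_low <= date, date_high >= date)
    )
    print(res)
    ```

    Swapping left_join for inner_join would drop the unmatched rows of xxx, which is closer to what fuzzy_inner_join returns.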
    