Matching values between data frames based on overlapping dates

[愿得一人] 2021-01-23 11:43

I am currently dealing with the following data structures:

Attributes df:

  ID Begin_A      End_A        Interval                          Value
1  5 199         


        
1 answer
  • 2021-01-23 12:21

    Consider data.table, which supports "non-equi joins", i.e. joins on >=, >, <= and <. In the same call you can aggregate over the group of rows in the LHS data set (d1) that each row of the RHS data set (i, here d2) matches, by setting by = .EACHI.

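    # For each d2 row (i): take d1 rows with the same id and d1 end >= d2 begin, collapse val with toString().
    # Note: in the result, the join column `end` carries d2's begin values.
    d1[d2, on = .(id = id, end >= begin),
       .(i.begin, i.end, val_str = toString(val)), by = .EACHI]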
    
    #    id        end    i.begin      i.end    val_str
    # 1:  5 2017-03-03 2017-03-03 2017-03-05 Cat3, Cat1
    # 2:  6 2017-05-03 2017-05-03 2017-05-05         NA
    # 3:  8 2017-03-03 2017-03-03 2017-03-05         NA
    # 4: 10 2017-12-05 2017-12-05 2017-12-06       Cat4
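
    To see what by = .EACHI changes, here is a minimal sketch on hypothetical toy tables (lhs and rhs are illustrative names and values, not the question's data). Without by = .EACHI, the join returns one row per matching pair. With it, j is evaluated once per row of the table passed as i.

```r
library(data.table)

# Hypothetical toy tables (not the question's data)
lhs <- data.table(id  = c(1, 1, 2),
                  end = as.Date(c("2017-01-31", "2017-03-31", "2017-01-31")),
                  val = c("A", "B", "C"))
rhs <- data.table(id    = c(1, 3),
                  begin = as.Date(c("2017-01-15", "2017-01-15")))

# Plain non-equi join: one output row per matching pair,
# plus one NA-filled row for the unmatched id 3 (default nomatch = NA)
per_match <- lhs[rhs, on = .(id, end >= begin)]

# by = .EACHI: j is evaluated once per row of rhs,
# so both matches for id 1 are collapsed into a single string
per_i_row <- lhs[rhs, on = .(id, end >= begin),
                 .(val_str = toString(val)), by = .EACHI]

nrow(per_match)  # 3
nrow(per_i_row)  # 2
```

    Note that toString() turns a missing val into the string "NA", which is why unmatched rows show NA rather than <NA> in the val_str column.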
    

    Data preparation (run this before the join above):

    d1 <- data.frame(id = c(5, 10, 5, 10),
                     begin = as.Date(c('1990-3-1','1993-12-1','1991-3-1','1995-12-5')),
                     end = as.Date(c('2017-3-10','2017-12-2','2017-3-3','2017-12-10')),
                     val = c("Cat1", "Cat2", "Cat3", "Cat4"))
    
    d2 <- data.frame(id = c(5, 6, 8, 10),
                     begin = as.Date(c('2017-3-3','2017-5-3','2017-3-3','2017-12-5')),
                     end = as.Date(c('2017-3-5','2017-5-5','2017-3-5','2017-12-6')))
    
    library(data.table)
    setDT(d1)
    setDT(d2)
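
    The join above only tests d1's end against d2's begin. If a full interval overlap is what is wanted (an assumption, since the question's tables are cut off), a second condition can be added to on. The sketch below uses hypothetical toy tables (attrs and events are illustrative names) in which the extra condition changes the result.

```r
library(data.table)

# Hypothetical toy tables (not the question's data)
attrs <- data.table(id    = c(1, 1, 2),
                    begin = as.Date(c("2017-01-01", "2017-03-01", "2017-01-01")),
                    end   = as.Date(c("2017-01-31", "2017-03-31", "2017-01-31")),
                    val   = c("A", "B", "C"))
events <- data.table(id    = c(1, 2, 3),
                     begin = as.Date(c("2017-01-15", "2017-02-10", "2017-01-15")),
                     end   = as.Date(c("2017-01-20", "2017-02-12", "2017-01-20")))

# One-sided condition: attrs end >= events begin.
# For id 1 this also matches interval "B", which starts after the event ends.
one_sided <- attrs[events, on = .(id, end >= begin),
                   .(val_str = toString(val)), by = .EACHI]

# Two-sided condition: attrs end >= events begin AND attrs begin <= events end.
# In each on-expression the left name is the attrs column, the right name the events column.
two_sided <- attrs[events, on = .(id, end >= begin, begin <= end),
                   .(val_str = toString(val)), by = .EACHI]

two_sided$val_str[1]  # "A"
```

    With the question's sample d1 and d2, every d1 interval begins before every d2 interval ends, so both forms would give the same matches there.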
    