I am currently dealing with the following data structures:
Attributes df:
ID Begin_A End_A Interval Value
1 5 199
You may consider data.table
which allows for "non-equi joins", i.e. joins based on >=
, >
, <=
and <
. In the same call, aggregate operations may be performed on the groups in the LHS data set that each row in the RHS data set (i
) matches (by = .EACHI
).
d1[d2, on = .(id = id, end >= begin),
.(i.begin, i.end, val_str = toString(val)), by = .EACHI]
# id end i.begin i.end val_str
# 1: 5 2017-03-03 2017-03-03 2017-03-05 Cat3, Cat1
# 2: 6 2017-05-03 2017-05-03 2017-05-05 NA
# 3: 8 2017-03-03 2017-03-03 2017-03-05 NA
# 4: 10 2017-12-05 2017-12-05 2017-12-06 Cat4
Data preparation:
d1 <- data.frame(id = c(5, 10, 5, 10),
begin = as.Date(c('1990-3-1','1993-12-1','1991-3-1','1995-12-5')),
end = as.Date(c('2017-3-10','2017-12-2','2017-3-3','2017-12-10')),
val = c("Cat1", "Cat2", "Cat3", "Cat4"))
d2 <- data.frame(id = c(5, 6, 8, 10),
begin = as.Date(c('2017-3-3','2017-5-3','2017-3-3','2017-12-5')),
end = as.Date(c('2017-3-5','2017-5-5','2017-3-5','2017-12-6')))
library(data.table)
setDT(d1)
setDT(d2)