I have two tables. table1 looks like this:
date hour data
2010-05-01 3 5
2010-05-02 7 7
2010-05-02 10
Here's a way of achieving what you want. This assumes your table1's time precision is 1 hour. It can be modified to an arbitrary precision, but it will perform much better for larger time intervals, as it constructs the full sequence of possible times in the date_out to date_back range. Note, I used slightly different tables from OP to illustrate overlapping intervals and to correct some mistakes in OP.
library(data.table)
table1 = data.table(date = c("2010-05-01", "2010-05-02", "2010-05-02", "2010-07-03", "2011-12-09", "2012-05-01"), hour = c(3,7,10,18,22,3), data = c(5,7,8,3,1,0))
outages = data.table(resource = c("joey", "bob", "billy", "bob", "joey"), date_out = c("2010-04-30 4:00:00", "2010-04-30 4:00:00", "2009-04-20 7:00:00", "2011-11-15 12:20:00", "2012-04-28 1:00:00"), date_back=c("2010-05-02 8:30:00", "2010-05-02 8:30:00", "2009-06-02 5:30:00", "2011-12-09 23:00:00", "2012-05-02 17:00:00"))
# round up date_out and round down date_back
# and create a sequence in-between spaced by 1 hour
outages.expanded <- outages[, list(datetime = seq(as.POSIXct(round(as.POSIXct(date_out) + 30*60-1, "hours")),
                                                   as.POSIXct(round(as.POSIXct(date_back) - 30*60, "hours")),
                                                   60*60)),
                            by = list(resource, date_out)]
setkey(outages.expanded, datetime)
# merge with the original table, then run "table" to get the frequencies/occurrences
# and cbind back with the original table
# (by = .EACHI is needed in data.table >= 1.9.4 to keep one group per row of table1)
cbind(table1, unclass(table(
  outages.expanded[table1[, list(datetime = as.POSIXct(paste0(date, " ", hour, ":00:00")))],
                   resource, by = .EACHI])))
#         date hour data bob joey
#1: 2010-05-01    3    5   1    1
#2: 2010-05-02    7    7   1    1
#3: 2010-05-02   10    8   0    0
#4: 2010-07-03   18    3   0    0
#5: 2011-12-09   22    1   1    0
#6: 2012-05-01    3    0   0    1
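On the "arbitrary precision" remark: a rough sketch of how the round-up/round-down step could be generalized, using plain epoch arithmetic instead of round(..., "hours"). The helper name expand_interval, its step argument (in seconds) and the fixed UTC time zone are my own assumptions for illustration, not something from the question; with a local time zone and DST you would need to be more careful.

```r
# Hypothetical helper (not part of the answer's code above):
# expand one [start, end] interval into all time stamps that fall on a
# grid of `step` seconds. Assumes UTC so the grid lines up with the epoch.
expand_interval <- function(start, end, step = 3600, tz = "UTC") {
  s <- as.numeric(as.POSIXct(start, tz = tz))
  e <- as.numeric(as.POSIXct(end, tz = tz))
  first <- ceiling(s / step) * step  # round the start up to the grid
  last  <- floor(e / step) * step    # round the end down to the grid
  if (first > last) {
    # the interval contains no grid point at all
    return(as.POSIXct(numeric(0), origin = "1970-01-01", tz = tz))
  }
  as.POSIXct(seq(first, last, by = step), origin = "1970-01-01", tz = tz)
}

# hourly grid: gives 13:00, 14:00 and 15:00 on 2011-11-15
expand_interval("2011-11-15 12:20:00", "2011-11-15 15:10:00")

# 15-minute grid: 12:30, 12:45, ..., 15:00
expand_interval("2011-11-15 12:20:00", "2011-11-15 15:10:00", step = 900)
```

Something like this could replace the seq(...) call inside the by-group expression, at the cost of a much larger expanded table for small steps, which is the performance caveat mentioned at the top.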