Hi I have two tables (table1 and table2 below) and would like to join them based on the closest timestamp to form expected_output. Some kind of solution involving dplyr wou
This is liable to be slow, but...
d <- function(x,y) abs(x-y) # define the distance function
idx <- sapply( table1$date, function(x) which.min( d(x,table2$date) )) # find matches
cbind(table1,table2[idx,-1,drop=FALSE])
# date val1 val2
# 2 2015-07-16 08:55:00 94 90
# 3 2015-06-20 00:15:00 33 18
# 1 2015-05-12 15:00:00 53 67
Another way of constructing idx
is max.col(-outer(table1$date, table2$date, d))
.
Using rolling joins feature of data.table
with roll = "nearest"
:
require(data.table) # v1.9.6+
setDT(table1)[, val2 := setDT(table2)[table1, val2, on = "date", roll = "nearest"]]
Here, val2
column is created by performing a join on the column date
with roll = "nearest"
option. For each row of table1$date
, the closest matching row from table2$date
is computed, and val2
for corresponding row is extracted.