I have two data frames ev1 and ev2, describing timestamps of two types of events collected over many tests. So, each data frame has columns "test_id" and "timestamp". W
Maybe this helps:
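library(data.table)
setkey(setDT(ev1), test_id)
# Join each ev2 row to every ev1 row with the same test_id
DT <- ev1[ev2, allow.cartesian=TRUE][, distance := i.time - time]
# Keep the row(s) with the smallest absolute distance for each ev2 event
DT[DT[, abs(distance) == min(abs(distance)), by = list(test_id, i.time)]$V1]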
#    test_id time i.time distance
# 1:       0    3      6        3
# 2:       0    1      1        0
# 3:       0    3      8        5
# 4:       1    4      4        0
# 5:       1    4      5        1
# 6:       1    4     11        7
Or, as a single chain:
ev1[ev2, allow.cartesian=TRUE][, distance := i.time - time][,
    .SD[abs(distance) == min(abs(distance))], by = list(test_id, i.time)]
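For anyone who wants to run the join-then-filter approach end to end, here is a minimal sketch. The sample data is made up for illustration (the question's actual data isn't shown here), and the column is assumed to be called time, as in the code above. It was chosen so that the result lines up with the output printed earlier.

```r
library(data.table)

# Hypothetical sample data, not taken from the original question
ev1 <- data.table(test_id = c(0, 0, 1), time = c(1, 3, 4))
ev2 <- data.table(test_id = c(0, 0, 0, 1, 1, 1), time = c(6, 1, 8, 4, 5, 11))

setkey(ev1, test_id)

# Pair every ev2 row with all ev1 rows that share its test_id.
# ev2's time column shows up as i.time because the name clashes with ev1's.
DT <- ev1[ev2, allow.cartesian = TRUE]
DT[, distance := i.time - time]

# For each ev2 event, keep the ev1 row(s) with the smallest absolute distance
nearest <- DT[, .SD[abs(distance) == min(abs(distance))],
              by = .(test_id, i.time)]
print(nearest)
```

Note that the whole cartesian result DT is built before anything is filtered, which is where the memory cost of this approach comes from.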
Using the new grouping:
setkey(setDT(ev1), test_id, group_id)
setkey(setDT(ev2), test_id, group_id)
DT <- ev1[ev2, allow.cartesian=TRUE][, distance := i.time - time]
DT[DT[, abs(distance) == min(abs(distance)),
      by = list(test_id, group_id, i.time)]$V1]$distance
#[1] 2 3 4 -1 0 4
Based on the code you provided:
min_data$distance
#[1] 2 3 4 -1 0 4
Here's how I'd do it using data.table:
require(data.table)
setkey(setDT(ev1), test_id)
ev1[ev2, .(ev2.time = i.time, ev1.time = time[which.min(abs(i.time - time))]), by = .EACHI]
#    test_id ev2.time ev1.time
# 1:       0        6        3
# 2:       0        1        1
# 3:       0        8        3
# 4:       1        4        4
# 5:       1        5        4
# 6:       1       11        4
In joins of the form x[i] in data.table, the prefix i. is used to refer to the columns in i when both x and i share the same name for a particular column.
Please see this SO post for an explanation on how this works.
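As a small illustration of the i. prefix, here is a sketch with two made-up tables (x, i, and their values are hypothetical, chosen only to show the naming):

```r
library(data.table)

# Hypothetical tables that share the column name "time"
x <- data.table(id = c(1, 2), time = c(10, 20), key = "id")
i <- data.table(id = c(1, 2), time = c(11, 25))

# Inside j, plain `time` is x's column and `i.time` is i's column
res <- x[i, .(x_time = time, i_time = i.time)]
print(res)
```

Without the prefix there would be no way to tell the two time columns apart inside j.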
This is syntactically more straightforward to follow and is memory efficient (at the expense of a little speed [1]), as it doesn't materialise the entire join result at all. In fact, this does exactly what you say in your post: filter on the fly, while merging.
[1] For a large i, it might be a tad slower, as the j-expression has to be evaluated for each row in i. In contrast, @akrun's answer does a cartesian join followed by one filtering step. So while it's high on memory, it doesn't evaluate j for each row in i. But again, this shouldn't even matter unless you work with a really large i, which is not often the case. HTH
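To make the per-row evaluation concrete, here is a rough sketch on made-up data (the values are hypothetical, not from the question). The first call only counts how many ev1 rows each ev2 row matches, which shows that j runs once per row of i; the second is the same nearest-time expression as above.

```r
library(data.table)

# Hypothetical sample data, not taken from the original question
ev1 <- data.table(test_id = c(0, 0, 1), time = c(1, 3, 4), key = "test_id")
ev2 <- data.table(test_id = c(0, 0, 0, 1, 1, 1), time = c(6, 1, 8, 4, 5, 11))

# j is evaluated once per row of ev2; .N is the number of ev1 rows it matched
counts <- ev1[ev2, .(n_candidates = .N), by = .EACHI]

# Same per-row evaluation, keeping only the closest ev1 time for each ev2 row
nearest <- ev1[ev2,
               .(ev2.time = i.time,
                 ev1.time = time[which.min(abs(i.time - time))]),
               by = .EACHI]
print(counts)
print(nearest)
```

Each group only ever holds the ev1 rows for one ev2 row, so the full cartesian result never exists in memory at once. One thing to keep in mind: which.min() returns only the first minimum, so ties are resolved differently from the == min(...) filter, which keeps all tied rows.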