I would like to find the fastest way in R to identify the indexes of the elements in the Ytimes array that are closest to the given Xtimes values.
So far I have been using a simple for-lo
We can use findInterval to do this efficiently. (cut will also work, with a little more work.)
First, let's offset the Ytimes values to get break points, so that we find the nearest value and not the next-lesser one. I'll demonstrate on fake data first:
y <- c(1,3,5,10,20)
y2 <- c(-Inf, y + c(diff(y)/2, Inf))
cbind(y, y2[-1])
# y
# [1,] 1 2.0
# [2,] 3 4.0
# [3,] 5 7.5
# [4,] 10 15.0
# [5,] 20 Inf
findInterval(c(1, 1.9, 2.1, 8), y2)
# [1] 1 1 2 4
The second column (prepended with a -Inf) will give us the breaks. Notice that each is half-way between the corresponding value and its successor.
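For the cut route mentioned at the top, here is a minimal sketch on the same fake data. It assumes the breaks are built exactly as in y2 above; the names x, idx_cut and idx_fi are only for illustration. With right = FALSE the intervals are closed on the left, which should line up with what findInterval does by default.

```r
# Same fake data and breaks as in the demonstration above
y  <- c(1, 3, 5, 10, 20)
y2 <- c(-Inf, y + c(diff(y)/2, Inf))

# Illustrative query values
x <- c(1, 1.9, 2.1, 8)

# labels = FALSE returns integer interval codes instead of a factor;
# right = FALSE makes each interval [lower, upper)
idx_cut <- cut(x, breaks = y2, labels = FALSE, right = FALSE)
idx_fi  <- findInterval(x, y2)

identical(idx_cut, idx_fi)
# [1] TRUE
y[idx_cut]
# [1]  1  1  3 10
```

The "little more work" is mostly remembering labels = FALSE and right = FALSE; without them you get a factor with right-closed intervals.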
Okay, let's apply this to your vectors:
Y2 <- Ytimes + c(diff(Ytimes)/2, Inf)
head(cbind(Ytimes, Y2))
# Ytimes Y2
# [1,] 0.0000000 0.06006006
# [2,] 0.1201201 0.18018018
# [3,] 0.2402402 0.30030030
# [4,] 0.3603604 0.42042042
# [5,] 0.4804805 0.54054054
# [6,] 0.6006006 0.66066066
Y2 <- c(-Inf, Ytimes + c(diff(Ytimes)/2, Inf))
cbind(Xtimes, Y2[ findInterval(Xtimes, Y2) ])
# Xtimes
# [1,] 1 0.9009009
# [2,] 5 4.9849850
# [3,] 8 7.9879880
# [4,] 10 9.9099099
# [5,] 15 14.9549550
# [6,] 19 18.9189189
# [7,] 23 22.8828829
# [8,] 34 33.9339339
# [9,] 45 44.9849850
# [10,] 51 50.9909910
# [11,] 55 54.9549550
# [12,] 57 56.9969970
# [13,] 78 77.8978979
# [14,] 120 119.9399399
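(I'm using cbind just for a side-by-side demonstration; it isn't necessary.) Note that the second column above is Y2[ findInterval(Xtimes, Y2) ], i.e. the lower break of each matched interval, not the matched element itself. The indexes you asked for are findInterval(Xtimes, Y2), and the nearest values are Ytimes[ findInterval(Xtimes, Y2) ], which is what the benchmark below uses.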
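If you want this as a reusable piece, the steps above can be wrapped up roughly as follows. This is only a sketch: nearest_index is a hypothetical helper name (not something in base R), and it assumes y is sorted in increasing order and that the query values are finite. The brute-force comparison uses made-up random data, not your Ytimes.

```r
# Hypothetical helper: for each element of x, the index of the nearest
# element of y. Assumes y is sorted increasingly (findInterval needs that)
# and that x contains finite values.
nearest_index <- function(x, y) {
  stopifnot(length(y) >= 1, !is.unsorted(y))
  # Breaks half-way between neighbouring y values, padded with -Inf / Inf
  breaks <- c(-Inf, y + c(diff(y)/2, Inf))
  findInterval(x, breaks)
}

# Fake data from the demonstration above
nearest_index(c(1, 1.9, 2.1, 8), c(1, 3, 5, 10, 20))
# [1] 1 1 2 4

# Sanity check against the which.min approach on made-up random data
set.seed(1)
y_demo <- sort(runif(50, 0, 100))
x_demo <- runif(20, 0, 100)
brute  <- sapply(x_demo, function(xi) which.min(abs(y_demo - xi)))
all(nearest_index(x_demo, y_demo) == brute)
```

One difference to be aware of: when a query value sits exactly half-way between two neighbours, this picks the upper neighbour, whereas which.min picks the lower one (the first minimum).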
Benchmark:
mbm <- microbenchmark::microbenchmark(
  for_loop = {
    YmatchIndex <- array(0,length(Xtimes))
    for (i in 1:length(Xtimes)) {
      YmatchIndex[i] = which.min(abs(Ytimes - Xtimes[i]))
    }
  },
  apply = sapply(Xtimes, function(x){which.min(abs(Ytimes - x))}),
  fndIntvl = {
    Y2 <- c(-Inf, Ytimes + c(diff(Ytimes)/2, Inf))
    Ytimes[ findInterval(Xtimes, Y2) ]
  },
  times = 100
)
mbm
# Unit: microseconds
# expr min lq mean median uq max neval
# for_loop 2210.5 2346.8 2823.678 2444.80 3029.45 7800.7 100
# apply 48.8 58.7 100.455 65.55 91.50 2568.7 100
# fndIntvl 18.3 23.4 34.059 29.80 40.30 83.4 100
ggplot2::autoplot(mbm)
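One caveat: findInterval requires its second argument to be sorted in non-decreasing order. Your Ytimes looks sorted, so this should not matter here, but if it were not, one possible workaround is to sort first and map the indexes back through order. The unsorted vector below is made up purely for illustration.

```r
# Hypothetical unsorted input, for illustration only
y_unsorted <- c(10, 1, 20, 5, 3)
x <- c(1, 1.9, 2.1, 8)

ord <- order(y_unsorted)   # permutation that sorts y_unsorted
ys  <- y_unsorted[ord]     # sorted copy: 1 3 5 10 20
breaks <- c(-Inf, ys + c(diff(ys)/2, Inf))

idx_sorted   <- findInterval(x, breaks)  # positions within the sorted copy
idx_original <- ord[idx_sorted]          # positions within y_unsorted

idx_original
# [1] 2 2 5 1
y_unsorted[idx_original]
# [1]  1  1  3 10
```

The sort costs extra time, so it would be fair to include it in any benchmark if your real data needs it.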