Citing the E2LSH manual (it\'s not important that\'s about this specific library, this quote should be true for NN problem in general):
I will provide an example, which should clear things up. Assume that our dataset consists of only one point, p
and a query point arrives, q
. Let's assume* that the distance of p
and q
is 3,9.
Now, by using E2LSH$, I can create a data structure that solves the R-nearest neighbor problem, i.e. it will answer yes (and fetch me the point) that lies inside a radius R. If no such point exists, it will answer no.
Let's say that I choose to build 5 data structures of that kind, starting from R = 1 to 5. In our mind, this is what we have done so far:
So now remember, that d(p, q) = 3,9, thus we expect to ask the data structure that is built with R = 4 and find for us the query point q
.
Now let's pretend that we do not know d(p, q), so we start searching from the smallest Radius we have picked, that's 1. So, we ask, is there anything in Radius (equal to 1) from our dataset? No!
From R = 2? No!
From R = 3? No!
From R = 4? Yes, and that's q
! So now we are done. 4 is the Rt you mention in your question.
* That's a strong assumption and E2LSH suffers for having to put the user to input that parameter R, since usually we do not know what value should R have, too big and we will waste space and time, too small and we won't find our query!
$ I heard there is something more modern than E2LSH now, in Ilya Razenshteyn's homepage.