To get what you want from nn2
, you need to:
- Specify the
searchtype
to be "radius"
. By not specifying the searchtype
, the searchtype
is set to standard
and the radius
input is ignored.
- Specify
k
to be nrow(mydata)
because k
is the maximum number of nearest neighbors to compute.
The manner in which you are calling nn2
will return the 100
nearest neighbors in mydata
for each point in mydata
. Consequently, you will get the same result for any radius
. For example:
library(RANN)
set.seed(123)
## simulate some data
lon = runif(100, -99.1, -99)
lat = runif(100, 33.9, 34)
## data is a 100 x 2 matrix (can also be data.frame)
mydata <- cbind(lon, lat)
radius <- 0.02 ## your radius
res <- nn2(mydata, k=nrow(mydata), searchtype="radius", radius = radius)
## prints total number of nearest neighbors (for all points) found using "radius"
print(length(which(res$nn.idx>0)))
##[1] 1224
res1 <- nn2(mydata, k=100, radius = radius)
## prints total number of nearest neighbors (for all points) found using your call
print(length(which(res1$nn.idx>0)))
##[1] 10000
radius <- 0.03 ## increase radius
res <- nn2(mydata, k=nrow(mydata), searchtype="radius", radius = radius)
## prints total number of nearest neighbors (for all points) found using "radius"
print(length(which(res$nn.idx>0)))
##[1] 2366
res1 <- nn2(mydata, k=100, radius = radius)
## prints total number of nearest neighbors (for all points) found using your call
print(length(which(res1$nn.idx>0)))
##[1] 10000
Note that according to its documentation ?nn2
, nn2
returns a list with two elements:
nn.idx
: a nrow(query) x k
matrix where each row contains the row indices of the k
nearest neighbors in mydata
to the point at that row in the collection of query points in query
. In both of our calls, query=mydata
. When called with searchtype="radius"
, if there are m < k
neighbors within the given radius, then k - m
of those index values will be set to 0
. Since the set of query points is the same as mydata
, the index to the point itself will be included.
nn.dist
: a nrow(query) x k
matrix where each element contains the Euclidean distances for the corresponding nearest neighbor in nn.idx
. Here, if the corresponding element in nn.idx
is 0
, then the value in nn.dist
is set to 1.340781e+154.
With your call, you get 100
nearest neighbors for each point in mydata
, hence length(which(res1$nn.idx>0))==10000
no matter what the radius
is in the example.
Finally, you should note that because nn2
returns two nrow(mydata) x nrow(mydata)
in your case, it can very easily overwhelm your memory if you have a lot of points.
Updated to specifically produce the result of getting the count of neighbors within a given radius.
To compute the number of neighbors within a radius
of each point in the data, call nn2
as such
res <- nn2(mydata, k=nrow(mydata), searchtype="radius", radius = radius)
Then, do this:
count <- rowSums(res$nn.idx > 0) - 1
Notes:
- Since
query=mydata
and k=nrow(mydata)
the resulting res$nn.idx
will be nrow(mydata) x nrow(mydata)
as explained above.
- Each row
i
of res$nn.idx
corresponds to row i
in query
, which is the i
-th point in query=mydata
. Call this i
-th point p[i]
.
- Each row
i
of res$nn.idx
contains the row indices of the neighbors of p[i]
AND zeroes to fill that row in the matrix (because not all points in mydata
will be within the radius
of p[i]
).
- Therefore, the number of neighbors of
p[i]
can be found by finding those values in the row that are greater than 0
and then counting them. This is what rowSums(res$nn.idx > 0)
does for each row of res$nn.idx
.
- Lastly, because
query=mydata
, the point being queried is in the count itself. That is, a point is the nearest neighbor to itself. Therefore subtract 1
from these counts to exclude that.
The resulting count
will be a vector of the counts. The i
-th element is the number of neighbors within the radius
to the i
-th point in query=mydata
.
Hope this is clear.