nearest-neighbor

Strategy for isolating 3D data points

烂漫一生 submitted on 2019-12-23 17:43:50
Question: I have two sets of points: one from an analysis, and another onto which I will post-process the analysis results. The analysis data, in black, is scattered; the points used for results are red. Here are the two sets on the same plot (image not reproduced in this excerpt). The problem is this: I will be interpolating onto the red points, but as the plot shows, some red points fall inside voids in the black data set. Interpolation produces non-zero values at those points…
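A minimal sketch of one common fix, assuming SciPy is available and that a red point "in a void" can be detected as one with no black analysis point within some cutoff radius (the radius is a hypothetical tuning parameter, not from the original post):

    import numpy as np
    from scipy.spatial import cKDTree

    def mask_void_points(black_pts, red_pts, radius):
        # black_pts, red_pts: (n, 3) and (m, 3) coordinate arrays
        tree = cKDTree(black_pts)           # index the analysis points once
        dist, _ = tree.query(red_pts, k=1)  # distance to nearest black point
        return dist > radius                # True where a red point sits in a void

Values interpolated onto points flagged by this mask can then be zeroed or set to NaN instead of being trusted.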

Search for all nearest neighbors within a certain radius of a point in 3D?

天涯浪子 submitted on 2019-12-23 03:09:49
Question: I have about 80 million spatial points (3D), and I want to find all neighbors of a query point that lie within a sphere of a given radius (supplied as input) centered on the query point. I have read about data structures used for this kind of search, such as k-d trees, octrees, and range trees. For my application I only need to populate the data structure once and then search for multiple query points. My question is: is there any better way, or a better data…
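A minimal sketch of the build-once / query-many pattern with SciPy's cKDTree (the array size below is a stand-in; at 80 million points the coordinates alone need a few GB of RAM):

    import numpy as np
    from scipy.spatial import cKDTree

    points = np.random.rand(1_000_000, 3)  # stand-in for the real point cloud
    tree = cKDTree(points)                  # built once, queried many times

    def neighbors_within(center, radius):
        # indices of all points inside the sphere of the given radius
        idx = tree.query_ball_point(center, r=radius)
        return points[idx]

query_ball_point also accepts an array of query points, so many spheres can be answered in a single call.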

Count the number of adjacent rectangles

喜夏-厌秋 submitted on 2019-12-23 02:56:19
Question: My code prints sets of (X, Y) coordinates in 2D space in the range [0, 1].

    void Rect_Print() {
        cout << "In counter-clockwise fashion" << endl;
        cout << "#Rectangle ( x0, y0) ( x1, y1) " << endl;
        for (int b = 0; b < Rect_Count; b++) {
            double Area = (Rect[b].x0 - Rect[b].x1) * (Rect[b].y0 - Rect[b].y1);
            cout << fixed << setprecision(4) << (b + 1)
                 << " (" << Rect[b].x0 << "," << Rect[b].y0 << ") ("
                 << Rect[b].x1 << "," << Rect[b].y1 << ")" << endl;
        }
        cout << "Number of divisions (N = 3j-2) = " << Rect…

Clustering problem

蹲街弑〆低调 submitted on 2019-12-22 08:13:32
Question: I've been tasked with finding the N clusters containing the most points in a data set, given that each cluster is bounded by a certain size. Currently I am attempting this by loading my data into a k-d tree, iterating over the points to find each one's nearest neighbor, and then merging points when the resulting cluster would not exceed the limit. I'm not sure this approach will give me a globally optimal solution, so I'm looking for ways to tweak it. If you can tell me what type of problem this…
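A greedy sketch of the approach described above, assuming SciPy and a per-axis bounding-box limit as the cluster "size"; being greedy, it carries no guarantee of a globally optimal answer:

    import numpy as np
    from scipy.spatial import cKDTree

    def greedy_clusters(points, max_extent):
        tree = cKDTree(points)
        labels = -np.ones(len(points), dtype=int)
        bounds = []                      # per-cluster (min, max) corners
        k = min(8, len(points))          # a few candidate neighbors per point
        for i, p in enumerate(points):
            for j in tree.query(p, k=k)[1]:
                c = labels[j]
                if c < 0:
                    continue             # neighbor not yet in any cluster
                lo = np.minimum(bounds[c][0], p)
                hi = np.maximum(bounds[c][1], p)
                if np.all(hi - lo <= max_extent):
                    labels[i] = c        # join without breaking the bound
                    bounds[c] = (lo, hi)
                    break
            if labels[i] < 0:            # nothing fit: start a new cluster
                labels[i] = len(bounds)
                bounds.append((p.copy(), p.copy()))
        return labels

Because the result depends on visiting order, this is a heuristic; the question of whether a globally optimal assignment is tractable is exactly what the poster is asking.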

How does space partitioning algorithm for finding nearest neighbors work?

烈酒焚心 submitted on 2019-12-22 07:44:11
Question: Space partitioning is one of the algorithms for finding the nearest neighbor. How does it work? Suppose I have a 2D set of points (x and y coordinates) and I am given a point (a, b). How would this algorithm find the nearest neighbor?

Answer 1: Spatial partitioning is actually a family of closely related algorithms that partition space so that applications can process points or polygons more easily. I reckon there are many ways to solve your problem. I don't know how complex you are willing to…
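The answer is cut off, so what follows is not the answerer's code: just a compact illustrative k-d tree, the most common member of this family, showing how the 2D plane is split on alternating axes and how the search backtracks only when the splitting plane could hide a closer point:

    import math

    def build_kdtree(points, depth=0):
        # split on x at even depths, y at odd depths; the median becomes the node
        if not points:
            return None
        axis = depth % 2
        points = sorted(points, key=lambda p: p[axis])
        mid = len(points) // 2
        return {"point": points[mid],
                "left": build_kdtree(points[:mid], depth + 1),
                "right": build_kdtree(points[mid + 1:], depth + 1)}

    def nearest(node, target, depth=0, best=None):
        if node is None:
            return best
        if best is None or math.dist(node["point"], target) < math.dist(best, target):
            best = node["point"]
        axis = depth % 2
        diff = target[axis] - node["point"][axis]
        near, far = ("left", "right") if diff < 0 else ("right", "left")
        best = nearest(node[near], target, depth + 1, best)
        if abs(diff) < math.dist(best, target):  # could the far side be closer?
            best = nearest(node[far], target, depth + 1, best)
        return best

    tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
    print(nearest(tree, (9, 2)))  # the query point (a, b)

A query visits O(log n) nodes on average, versus O(n) for a linear scan over all points.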

PostgreSQL k-nearest neighbor (KNN) on multidimensional cube

瘦欲@ submitted on 2019-12-22 06:06:11
Question: I have a cube with 8 dimensions and I want to do nearest-neighbor matching. I'm totally new to PostgreSQL. I read that 9.1 supports nearest-neighbor matching on multiple dimensions. I'd really appreciate it if someone could give a complete example covering: how to create a table with the 8-D cube, a sample insert, an exact-match lookup, and a nearest-neighbor lookup. Sample data: for simplicity's sake, we can assume that all values range from 0 to 100. Point1: (1,1,1,1, 1,1,1,1); Point2: (2,2,2,2, 2,2,2,2).
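A minimal sketch using psycopg2 and PostgreSQL's contrib cube extension; the table name and connection details are placeholders. cube_distance gives the Euclidean distance between two cubes (a point is just a zero-volume cube); without an index the ORDER BY scans the whole table, which more recent PostgreSQL versions can avoid with a GiST index and the cube distance operator:

    import psycopg2

    conn = psycopg2.connect(dbname="testdb")  # placeholder connection
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS cube")
    cur.execute("CREATE TABLE pts (id serial PRIMARY KEY, c cube)")

    # sample inserts: each 8-D point is stored as a zero-volume cube
    cur.execute("INSERT INTO pts (c) VALUES ('(1,1,1,1,1,1,1,1)'::cube)")
    cur.execute("INSERT INTO pts (c) VALUES ('(2,2,2,2,2,2,2,2)'::cube)")

    # lookup, exact match: cubes compare with ordinary equality
    cur.execute("SELECT id FROM pts WHERE c = '(2,2,2,2,2,2,2,2)'::cube")
    print(cur.fetchall())

    # lookup, nearest neighbor: order by Euclidean distance to the query point
    cur.execute("""SELECT id, cube_distance(c, '(1,2,1,2,1,2,1,2)'::cube) AS d
                   FROM pts ORDER BY d LIMIT 1""")
    print(cur.fetchone())
    conn.commit()

The cube type's default build supports up to 100 dimensions, comfortably above the 8 needed here.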

Why doesn't scikit-learn's Nearest Neighbor seem to return proper cosine similarity distances?

匆匆过客 submitted on 2019-12-21 16:48:36
Question: I am trying to use scikit-learn's NearestNeighbors implementation to find the column vectors closest to a given column vector, out of a matrix of random values. This code is supposed to find the nearest neighbors of column 21 and then check the actual cosine similarity of those neighbors against column 21.

    from sklearn.neighbors import NearestNeighbors
    import sklearn.metrics.pairwise as smp
    import numpy as np

    test = np.random.randint(0, 5, (50, 50))
    nbrs = NearestNeighbors(n_neighbors=5, algorithm='auto', …
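The snippet is cut off before the metric is chosen, which is the usual culprit here: NearestNeighbors defaults to the Minkowski (Euclidean) metric, so the returned distances are not cosine distances unless the cosine metric is requested explicitly. A minimal corrected sketch, assuming columns are the vectors of interest:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors
    import sklearn.metrics.pairwise as smp

    test = np.random.randint(0, 5, (50, 50))
    cols = test.T  # treat each column as a sample vector

    # of the available backends, only brute force supports the cosine metric
    nbrs = NearestNeighbors(n_neighbors=5, algorithm='brute', metric='cosine')
    nbrs.fit(cols)
    dist, idx = nbrs.kneighbors(cols[21].reshape(1, -1))

    # cross-check: cosine distance should equal 1 - cosine similarity
    sim = smp.cosine_similarity(cols[idx[0]], cols[21].reshape(1, -1))
    print(dist[0])
    print(1 - sim.ravel())

Both printed arrays should agree, with the first neighbor being column 21 itself at distance 0.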

How to bucket locality-sensitive hashes?

两盒软妹~` submitted on 2019-12-21 15:08:09
Question: I already have an algorithm that produces locality-sensitive hashes, but how should I bucket them to exploit their defining property, namely that similar elements have nearby hashes (in Hamming distance)? In the MATLAB code I found, they simply build a distance matrix between the hashes of the query points and the hashes of the database points to keep the code simple, while referencing a so-called Charikar method for an actually good implementation of the search. I tried to…
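A minimal banding sketch, assuming fixed-length bit-string signatures (the standard LSH hash-table trick, in the spirit of the Charikar-style search the post mentions, rather than the O(n) distance-matrix scan): split each signature into bands and use each band as a bucket key, so two items collide whenever they agree on any whole band:

    from collections import defaultdict

    def build_tables(signatures, n_bands):
        # signatures: dict id -> bit string, length divisible by n_bands
        tables = [defaultdict(list) for _ in range(n_bands)]
        for key, sig in signatures.items():
            w = len(sig) // n_bands
            for b in range(n_bands):
                tables[b][sig[b * w:(b + 1) * w]].append(key)
        return tables

    def candidates(tables, sig, n_bands):
        # union of every bucket the query signature falls into
        w = len(sig) // n_bands
        out = set()
        for b in range(n_bands):
            out.update(tables[b].get(sig[b * w:(b + 1) * w], []))
        return out  # re-rank this small set by exact Hamming distance

More bands raise recall (more chances to collide) at the cost of more false candidates; the exact Hamming check on the candidate set restores precision.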

kNN with big sparse matrices in Python

给你一囗甜甜゛ submitted on 2019-12-21 13:42:09
Question: I have two large sparse matrices:

    In [3]: trainX
    Out[3]: <6034195x755258 sparse matrix of type '<type 'numpy.float64'>'
            with 286674296 stored elements in Compressed Sparse Row format>
    In [4]: testX
    Out[4]: <2013337x755258 sparse matrix of type '<type 'numpy.float64'>'
            with 95423596 stored elements in Compressed Sparse Row format>

They take about 5 GB of RAM in total to load. Note that these matrices are HIGHLY sparse (0.0062% occupied). For each row in testX, I want to find the nearest neighbor in trainX and…
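A minimal batched sketch, assuming scikit-learn and that cosine distance suits the data; the brute-force backend accepts CSR matrices directly, and processing testX in slices keeps the intermediate results small (each batch still scans all of trainX, so this trades time for memory):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def nearest_in_train(trainX, testX, batch=1000):
        nbrs = NearestNeighbors(n_neighbors=1, algorithm='brute',
                                metric='cosine').fit(trainX)
        out = np.empty(testX.shape[0], dtype=np.int64)
        for start in range(0, testX.shape[0], batch):
            _, idx = nbrs.kneighbors(testX[start:start + batch])
            out[start:start + batch] = idx.ravel()
        return out

At this scale an approximate method (e.g. locality-sensitive hashing, as in the previous question) is often the only way to reach acceptable running times.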
