nearest-neighbor

Strategy for isolating 3D data points

烂漫一生 submitted on 2019-12-23 17:43:50
Question: I have two sets of points: one from an analysis, and another onto which I will post-process the analysis results. The analysis data, in black, is scattered; the points used for results are red. Here are the two sets on the same plot (image not reproduced in this excerpt). The problem is this: I will be interpolating onto the red points, but as the plot shows, some red points fall inside voids in the black data set. Interpolation produces non-zero values at those points…
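A minimal sketch of one common fix, assuming SciPy is available and that a red point "in a void" can be detected as one with no black analysis point within some cutoff radius (the radius is a hypothetical tuning parameter, not from the original post):

    import numpy as np
    from scipy.spatial import cKDTree

    def mask_void_points(black_pts, red_pts, radius):
        # black_pts, red_pts: (n, 3) and (m, 3) coordinate arrays
        tree = cKDTree(black_pts)           # index the analysis points once
        dist, _ = tree.query(red_pts, k=1)  # distance to nearest black point
        return dist > radius                # True where a red point sits in a void

Values interpolated onto points flagged by this mask can then be zeroed or set to NaN instead of being trusted.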

Search for all nearest neighbors within a certain radius of a point in 3D?

天涯浪子 submitted on 2019-12-23 03:09:49
Question: I have about 80 million spatial points (3D), and I want to find all neighbors of a query point that lie within a sphere of a given radius (supplied as input) centered on the query point. I have read about data structures used for this kind of search, such as k-d trees, octrees, and range trees. For my application I only need to populate the data structure once and then search for multiple query points. My question is: is there any better way, or a better data…
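A minimal sketch of the build-once / query-many pattern with SciPy's cKDTree (the array size below is a stand-in; at 80 million points the coordinates alone need a few GB of RAM):

    import numpy as np
    from scipy.spatial import cKDTree

    points = np.random.rand(1_000_000, 3)  # stand-in for the real point cloud
    tree = cKDTree(points)                  # built once, queried many times

    def neighbors_within(center, radius):
        # indices of all points inside the sphere of the given radius
        idx = tree.query_ball_point(center, r=radius)
        return points[idx]

query_ball_point also accepts an array of query points, so many spheres can be answered in a single call.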

Count the number of adjacent rectangles

喜夏-厌秋 submitted on 2019-12-23 02:56:19
Question: My code prints sets of (X, Y) coordinates in 2D space in the range [0, 1].

    void Rect_Print() {
        cout << "In counter-clockwise fashion" << endl;
        cout << "#Rectangle ( x0, y0) ( x1, y1) " << endl;
        for (int b = 0; b < Rect_Count; b++) {
            double Area = (Rect[b].x0 - Rect[b].x1) * (Rect[b].y0 - Rect[b].y1);
            cout << fixed << setprecision(4) << (b + 1)
                 << " (" << Rect[b].x0 << "," << Rect[b].y0 << ") ("
                 << Rect[b].x1 << "," << Rect[b].y1 << ")" << endl;
        }
        cout << "Number of divisions (N = 3j-2) = " << Rect…

Clustering problem

蹲街弑〆低调 submitted on 2019-12-22 08:13:32
Question: I've been tasked with finding the N clusters containing the most points in a data set, given that each cluster is bounded by a certain size. Currently I am attempting this by loading my data into a k-d tree, iterating over the points to find each one's nearest neighbor, and then merging points when the resulting cluster would not exceed the limit. I'm not sure this approach will give me a globally optimal solution, so I'm looking for ways to tweak it. If you can tell me what type of problem this…
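A greedy sketch of the approach described above, assuming SciPy and a per-axis bounding-box limit as the cluster "size"; being greedy, it carries no guarantee of a globally optimal answer:

    import numpy as np
    from scipy.spatial import cKDTree

    def greedy_clusters(points, max_extent):
        tree = cKDTree(points)
        labels = -np.ones(len(points), dtype=int)
        bounds = []                      # per-cluster (min, max) corners
        k = min(8, len(points))          # a few candidate neighbors per point
        for i, p in enumerate(points):
            for j in tree.query(p, k=k)[1]:
                c = labels[j]
                if c < 0:
                    continue             # neighbor not yet in any cluster
                lo = np.minimum(bounds[c][0], p)
                hi = np.maximum(bounds[c][1], p)
                if np.all(hi - lo <= max_extent):
                    labels[i] = c        # join without breaking the bound
                    bounds[c] = (lo, hi)
                    break
            if labels[i] < 0:            # nothing fit: start a new cluster
                labels[i] = len(bounds)
                bounds.append((p.copy(), p.copy()))
        return labels

Because the result depends on visiting order, this is a heuristic; the question of whether a globally optimal assignment is tractable is exactly what the poster is asking.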

How does space partitioning algorithm for finding nearest neighbors work?

烈酒焚心 submitted on 2019-12-22 07:44:11
Question: Space partitioning is one of the algorithms for finding the nearest neighbor. How does it work? Suppose I have a 2D set of points (x and y coordinates) and I am given a point (a, b). How would this algorithm find the nearest neighbor?

Answer 1: Spatial partitioning is actually a family of closely related algorithms that partition space so that applications can process points or polygons more easily. I reckon there are many ways to solve your problem. I don't know how complex you are willing to…
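The answer is cut off, so what follows is not the answerer's code: just a compact illustrative k-d tree, the most common member of this family, showing how the 2D plane is split on alternating axes and how the search backtracks only when the splitting plane could hide a closer point:

    import math

    def build_kdtree(points, depth=0):
        # split on x at even depths, y at odd depths; the median becomes the node
        if not points:
            return None
        axis = depth % 2
        points = sorted(points, key=lambda p: p[axis])
        mid = len(points) // 2
        return {"point": points[mid],
                "left": build_kdtree(points[:mid], depth + 1),
                "right": build_kdtree(points[mid + 1:], depth + 1)}

    def nearest(node, target, depth=0, best=None):
        if node is None:
            return best
        if best is None or math.dist(node["point"], target) < math.dist(best, target):
            best = node["point"]
        axis = depth % 2
        diff = target[axis] - node["point"][axis]
        near, far = ("left", "right") if diff < 0 else ("right", "left")
        best = nearest(node[near], target, depth + 1, best)
        if abs(diff) < math.dist(best, target):  # could the far side be closer?
            best = nearest(node[far], target, depth + 1, best)
        return best

    tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
    print(nearest(tree, (9, 2)))  # the query point (a, b)

A query visits O(log n) nodes on average, versus O(n) for a linear scan over all points.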

PostgreSQL k-nearest neighbor (KNN) on multidimensional cube

瘦欲@ submitted on 2019-12-22 06:06:11
Question: I have a cube with 8 dimensions and I want to do nearest-neighbor matching. I'm totally new to PostgreSQL. I read that 9.1 supports nearest-neighbor matching on multiple dimensions. I'd really appreciate it if someone could give a complete example covering: how to create a table with the 8-D cube, a sample insert, an exact-match lookup, and a nearest-neighbor lookup. Sample data: for simplicity's sake, we can assume that all values range from 0 to 100. Point1: (1,1,1,1, 1,1,1,1); Point2: (2,2,2,2, 2,2,2,2).
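A minimal sketch using psycopg2 and PostgreSQL's contrib cube extension; the table name and connection details are placeholders. cube_distance gives the Euclidean distance between two cubes (a point is just a zero-volume cube); without an index the ORDER BY scans the whole table, which more recent PostgreSQL versions can avoid with a GiST index and the cube distance operator:

    import psycopg2

    conn = psycopg2.connect(dbname="testdb")  # placeholder connection
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS cube")
    cur.execute("CREATE TABLE pts (id serial PRIMARY KEY, c cube)")

    # sample inserts: each 8-D point is stored as a zero-volume cube
    cur.execute("INSERT INTO pts (c) VALUES ('(1,1,1,1,1,1,1,1)'::cube)")
    cur.execute("INSERT INTO pts (c) VALUES ('(2,2,2,2,2,2,2,2)'::cube)")

    # lookup, exact match: cubes compare with ordinary equality
    cur.execute("SELECT id FROM pts WHERE c = '(2,2,2,2,2,2,2,2)'::cube")
    print(cur.fetchall())

    # lookup, nearest neighbor: order by Euclidean distance to the query point
    cur.execute("""SELECT id, cube_distance(c, '(1,2,1,2,1,2,1,2)'::cube) AS d
                   FROM pts ORDER BY d LIMIT 1""")
    print(cur.fetchone())
    conn.commit()

The cube type's default build supports up to 100 dimensions, comfortably above the 8 needed here.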

Why doesn't scikit-learn's Nearest Neighbor seem to return proper cosine similarity distances?

匆匆过客 submitted on 2019-12-21 16:48:36
Question: I am trying to use scikit-learn's NearestNeighbors implementation to find the column vectors closest to a given column vector, out of a matrix of random values. This code is supposed to find the nearest neighbors of column 21 and then check the actual cosine similarity of those neighbors against column 21.

    from sklearn.neighbors import NearestNeighbors
    import sklearn.metrics.pairwise as smp
    import numpy as np

    test = np.random.randint(0, 5, (50, 50))
    nbrs = NearestNeighbors(n_neighbors=5, algorithm='auto', …
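The snippet is cut off before the metric is chosen, which is the usual culprit here: NearestNeighbors defaults to the Minkowski (Euclidean) metric, so the returned distances are not cosine distances unless the cosine metric is requested explicitly. A minimal corrected sketch, assuming columns are the vectors of interest:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors
    import sklearn.metrics.pairwise as smp

    test = np.random.randint(0, 5, (50, 50))
    cols = test.T  # treat each column as a sample vector

    # of the available backends, only brute force supports the cosine metric
    nbrs = NearestNeighbors(n_neighbors=5, algorithm='brute', metric='cosine')
    nbrs.fit(cols)
    dist, idx = nbrs.kneighbors(cols[21].reshape(1, -1))

    # cross-check: cosine distance should equal 1 - cosine similarity
    sim = smp.cosine_similarity(cols[idx[0]], cols[21].reshape(1, -1))
    print(dist[0])
    print(1 - sim.ravel())

Both printed arrays should agree, with the first neighbor being column 21 itself at distance 0.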

How to bucket locality-sensitive hashes?

两盒软妹~` submitted on 2019-12-21 15:08:09
Question: I already have an algorithm that produces locality-sensitive hashes, but how should I bucket them to exploit their defining property, namely that similar elements have nearby hashes (in Hamming distance)? In the MATLAB code I found, they simply build a distance matrix between the hashes of the query points and the hashes of the database points to keep the code simple, while referencing a so-called Charikar method for an actually good implementation of the search. I tried to…
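A minimal banding sketch, assuming fixed-length bit-string signatures (the standard LSH hash-table trick, in the spirit of the Charikar-style search the post mentions, rather than the O(n) distance-matrix scan): split each signature into bands and use each band as a bucket key, so two items collide whenever they agree on any whole band:

    from collections import defaultdict

    def build_tables(signatures, n_bands):
        # signatures: dict id -> bit string, length divisible by n_bands
        tables = [defaultdict(list) for _ in range(n_bands)]
        for key, sig in signatures.items():
            w = len(sig) // n_bands
            for b in range(n_bands):
                tables[b][sig[b * w:(b + 1) * w]].append(key)
        return tables

    def candidates(tables, sig, n_bands):
        # union of every bucket the query signature falls into
        w = len(sig) // n_bands
        out = set()
        for b in range(n_bands):
            out.update(tables[b].get(sig[b * w:(b + 1) * w], []))
        return out  # re-rank this small set by exact Hamming distance

More bands raise recall (more chances to collide) at the cost of more false candidates; the exact Hamming check on the candidate set restores precision.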

kNN with big sparse matrices in Python

给你一囗甜甜゛ submitted on 2019-12-21 13:42:09
Question: I have two large sparse matrices:

    In [3]: trainX
    Out[3]: <6034195x755258 sparse matrix of type '<type 'numpy.float64'>'
            with 286674296 stored elements in Compressed Sparse Row format>
    In [4]: testX
    Out[4]: <2013337x755258 sparse matrix of type '<type 'numpy.float64'>'
            with 95423596 stored elements in Compressed Sparse Row format>

They take about 5 GB of RAM in total to load. Note that these matrices are HIGHLY sparse (0.0062% occupied). For each row in testX, I want to find the nearest neighbor in trainX and…
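A minimal batched sketch, assuming scikit-learn and that cosine distance suits the data; the brute-force backend accepts CSR matrices directly, and processing testX in slices keeps the intermediate results small (each batch still scans all of trainX, so this trades time for memory):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def nearest_in_train(trainX, testX, batch=1000):
        nbrs = NearestNeighbors(n_neighbors=1, algorithm='brute',
                                metric='cosine').fit(trainX)
        out = np.empty(testX.shape[0], dtype=np.int64)
        for start in range(0, testX.shape[0], batch):
            _, idx = nbrs.kneighbors(testX[start:start + batch])
            out[start:start + batch] = idx.ravel()
        return out

At this scale an approximate method (e.g. locality-sensitive hashing, as in the previous question) is often the only way to reach acceptable running times.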
