This problem is a kind of closest pair between two disjoint set. Upperside picture
I worked on a similar problem where I had to find a nearest member to identify if a member belong to a cluster within a cluster. I was trying to identify clusters within clusters. Here is the code, This might help you get start.
/**
* Find the nearest neighbor based on the distance threshold.
* TODO:
* @param currentPoint current point in the memory.
* @param threshold dynamic distance threshold.
* @return return the neighbor.
*/
private double nearestNeighbor(double currentPoint) {
HashMap<Double, Double> unsorted = new HashMap<Double, Double>();
TreeMap<Double, Double> sorted = null;
double foundNeighbor = 0.0;
for (int i = 0; i < bigCluster.length; i++) {
if (bigCluster[i] != 0.0 && bigCluster[i] != currentPoint) {
double shortestDistance = Math.abs(currentPoint - bigCluster[i]);
if (shortestDistance <= this.getDistanceThreshold())
unsorted.put(shortestDistance, bigCluster[i]);
}
}
if (!unsorted.isEmpty()) {
sorted = new TreeMap<Double, Double>(unsorted);
this.setDistanceThreshold(avgDistanceInCluster());
foundNeighbor = sorted.firstEntry().getValue();
return foundNeighbor;
} else {
return 0.0;
}
}
/**
* Method will check if a point belongs to a cluster based on the dynamic
* threshold.
*/
public void isBelongToCluster() {
for (int i=0; i < tempList.size(); i++) {
double aPointInCluster = tempList.get(i);
cluster.add(aPointInCluster);
double newNeighbor = nearestNeighbor(aPointInCluster);
if ( newNeighbor != 0.0) {
cluster.add(newNeighbor);
if (i + 1 > tempList.size() && (visited[i] != true)) {
isBelongToCluster();
}
}
}
for (int i=0; i < cluster.size(); i++) {
if (cluster.get(i) != 0.0)
System.out.println("whats in the cluster -> " + cluster.get(i));
}
}
here is one more solution. What it basically does is loads both of the set of points into two kd tree instances (which have built mechanism for slicing the tree on x and y axis) and then navigates through both of the trees by checking each node if the distances between the two less than the distance between the prior nodes, if yes then continue until no two nodes can be found with the mutual distance less than any other.
the below code prints the distances found while navigating between the nodes and prints them. The both set of points are also be visualized to see the correctness of the algorithem.
This code can correctly find the nearest nodes regardless whether one set is nested, on right, left, up or downward of the other,
#include <iostream>
using namespace std;
int const k=2; // the number of dimensions
double min_distance = 10000; // set a large default value, in this example all distance will be shorter than this.
double distance(int arr[], int arr2[])
{
return sqrt(pow(arr2[0] - arr[0], 2) + pow(arr2[1] - arr[1], 2));
}
struct Node {
int point[k];
Node *left, *right;
Node()
{
left = right = NULL;
}
};
// A method to create a node of K D tree
struct Node* newNode(int arr[])
{
struct Node* temp = new Node;
for (int i = 0; i<k; i++) temp->point[i] = arr[i];
return temp;
}
Node * insertNode(Node * node, int arr[], int d)
{
if (node == NULL)
return newNode(arr);
int dim = d%k;
if (node->point[dim] > arr[dim])
{
node->left = insertNode(node->left, arr, dim + 1);
}
else
{
node->right = insertNode(node->right, arr, dim + 1);
}
return node;
}
Node * Nearest=NULL;
Node * FindnearestNode(Node * head1, int arr[], int d)
{
// if empty tree, return
if (head1 == NULL)
return NULL;
// check for each tree.
if (min_distance > distance(head1->point, arr))
{
min_distance = distance(head1->point, arr);
Nearest = head1;
}
if (head1->left == NULL && head1->right == NULL)
return head1;
// findout current dimension, in this case it either x or y i.e. 0 or 1
int dim = d%k;
// navigate through the tree as if inserting to a new member (to remain to the nearest member in closeness). in the path for insert it will find the nearest member.
if (head1->right && head1->point[dim] < arr[dim]) return FindnearestNode(head1->right, arr, d+1);
else if(head1->left && head1->point[dim] > arr[dim] )
return FindnearestNode(head1->left, arr, d+1);
return Nearest;
}
int main()
{
int const an = 10;
int const bn = 10;
int ax[an] = { 34,55,11,79,77,65,3,9,5,66 };
int ay[an] = { 5, 6, 7, 9, 32,3,15,7,10,35 };
int bx[bn] = { 5,35,4,41,32,64,41,54,87,3 };
int by[bn] = { 23,33,17,15,32,22,33,23,21,32 };
Node * head1=NULL;
Node * head2 = NULL;
double Final_Min_Distance = min_distance;
// fill the K-D trees with the two dimensional data in two trees.
for (int i = 0; i < an; i++)
{
int temp[k];
temp[0] = ax[i];
temp[1] = ay[i];
head1=insertNode(head1, temp, 0);
temp[0] = bx[i];
temp[1] = by[i];
head2=insertNode(head2, temp, 0);
}
Node * AnearB=NULL;
Node * BnearA = NULL;
min_distance = 1000;
Final_Min_Distance = min_distance;
for (int i = 0; i < an; i++) { int temp[k]; temp[0] = bx[i]; temp[1] = by[i]; Node * Nearer2 = FindnearestNode(head1, temp, 0); if (Final_Min_Distance > min_distance)
{
BnearA = Nearer2;
Final_Min_Distance = min_distance;
}
cout << " distance of B (" << temp[0] << "," << temp[1] << ") to nearest A (" << BnearA->point[0] << "," << BnearA->point[1] << ") distance:" << Final_Min_Distance << endl;
min_distance = 1000;
}
cout << "Minimum Distance is " << Final_Min_Distance<<endl<<endl;
min_distance = 1000;
Final_Min_Distance = min_distance;
for (int i = 0; i < an; i++) { int temp[k]; temp[0] = ax[i]; temp[1] = ay[i]; Node * Nearer2 = FindnearestNode(head2, temp, 0); if (Final_Min_Distance > min_distance)
{
AnearB = Nearer2;
Final_Min_Distance = min_distance;
}
cout << " distance of A (" << temp[0] << "," << temp[1] << ") to nearest B (" << AnearB->point[0] << "," << AnearB->point[1] << ") distance:" << Final_Min_Distance << endl;
min_distance = 1000;
}
cout << "Minimum Distance is " << Final_Min_Distance;
system("pause");
}
https://tahirnaeem.net/finding-closest-pair-of-points-in-two-disjoint-sets-of-points
This problem is usually called the closest bichromatic pair problem. Here are a couple approaches.
Delaunay triangulation. (This certainly works with L2 (= Euclidean) distances; I think the steps generalize to L1.) For every Delaunay triangulation (there can be more than one in degenerate cases), there exists a minimum spanning tree whose edges all belong to the triangulation. In turn, this minimum spanning tree contains a shortest edge crossing the cut between the color classes.
Nearest neighbor data structures.
If it is given that the red points are separated in x from the blue points, then you may be able to adapt the O(n) merge step of the Shamos–Hoey divide-and-conquer algorithm for the closest (monochromatic) pair problem, described here.
If you want to do binary search on spatial data, you could use a spatial data structure, such as a quadtree or a k-d tree.