minimize euclidean distance from sets of points in n-dimensions

问题

Let's look at m points in n-d space- (A solution for 4 points in 3-d space is here: minimize distance from sets of points)

a= (x1, y1, z1, ..)

b= (x2, y2 ,z2, ..)

c= (x3, y3, z3, ..)
.
.

p= (x , y , z, ..)

Find point q = c1* a + c2* b + c3* c + ..

where c1 + c2 + c3 + .. = 1
and  c1, c2, c3, .. >= 0
s.t.
euclidean distance pq is minimized.

What algorithms can be used ? Idea or pseudocode is enough. (Optimizing performance is a big issue here. Monte Carlo method with all vertices and changing coefficients would also give a solution.)

回答1:

We can assume p = 0 by subtracting p from all the other points. Then the question is one of minimizing the norm over a convex hull of a finite set of points, i.e., a polytope.

There are a few papers on this problem. It looks like "A recursive algorithm for finding the minimum norm point in a polytope and a pair of closest points in two polytopes" by Kazuyuki Sekitani and Yoshitsugu Yamamoto is a good one, with a short survey of prior solutions to the problem. It is behind a paywall but if you have access to a university library you may be able to download a copy.

The algorithm they give is fairly simple, once you get past the notation. P is the finite set of points. C(P) is its convex hull. Nr(C(P)) is the unique point of minimum norm, which is what you want to find.

Step 0: Choose a point x_0 from the convex hull C(P) of your finite set of points P. They recommend choosing x_0 to be the point in P with minimum norm. Let k=1.

Now loop:

Step 1: Let a_k = min {x^t_{k-1} p | p is in P}. Here x^t_{k-1} is the transpose of x_{k-1} (so the function being minimized is just a dot product as p ranges over your finite set P). If |x_{k-1}|^2 <= a_k, then the answer is x_{k-1}, stop.

Step 2: P_k = {p | p in P and x^t_{k-1} = a_k}. P_k is the subset of P that minimizes the expression in Step 1. Call the algorithm recursively on this set P_k, and let the result be y_k = Nr(C(P_k)).

Step 3: b_k = min{y^t_k p | p in P\P_k}, the minimum of the dot product of y_k with points in the complement set P\P_k. If |y_k|^2 <= b_k then y_k is the answer, stop.

Step 4: s_k = max{s| [(1-s)x_{k-1} + sy_k]^t y_k <= [(1-s)x_{k-1} + sy_k]^t p for every p in P\P_k}. Let x_k = (1-s_k) x_{k-1} + s_k y_k, let k=k+1, and go back to Step 1.

There is an explicit formula for s_k in Step 4:

s_k = min{ [x^t_{k-1} (p-y_k)]/[(y_k-x_{k-1})^t (y_k-p)] | p in P\P_k and (y_k - x_{k-1})^t (y_k-p) > 0 }

There is a proof in the paper that s_k has the necessary properties, that the algorithm terminates after a finite number of operations, and that the result is indeed optimal.

Note that you should add some tolerance into your comparisons, otherwise rounding errors may cause the algorithm to fail. There is a lot of discussion about numerical stability, see the paper for details.

They do not give a complete analysis of the computational complexity of the algorithm, but they do prove it is at most O(m^2) in the two-dimensional case (m is the number of points in P), and they have done numerical experiments which give the impression that it is sublinear in time as a function of m, with dimension fixed. I'm skeptical of that claim. In the absence of a detailed analysis, I suggest you try some experiments with typical data to see how well the algorithm performs for you.

回答2:

Stated a simpler way, you have a set of points {a}_i, and you are considering all points which are some weighted average thereof. This set of points is exactly the convex hull of those points; it's a polytope (polygon, polyhedron, etc.) that just happens to be convex, where the corners are a subset of the {a}_i points.

You are just asking which point on a polytope(~hedron) is closest to a point. (your query point p)

The closest point must be on the exterior of the polytope. One algorithm would be to brute-force searching all N-1 dimensional surfaces. Do this in the usual way you would find the closest point on a line or surface or N-dimensional surface to a query point.

(If the points are not all linearly independent, you will have multiple ways (multiple weight vectors) which can give you the same weighted-average point q. You can worry about reconstructing the answer q from the basis vectors after you find it geometrically.)

来源：https://stackoverflow.com/questions/52134596/minimize-euclidean-distance-from-sets-of-points-in-n-dimensions

标签

python

geometry

projection

algebra