问题
The Problem
Imagine I am stood in an airport. Given a geographic coordinate pair, how can one efficiently determine which airport I am stood in?
Inputs
- A coordinate pair
(x,y)
representing the location I am stood at. - A set of coordinate pairs
[(a1,b1), (a2,b2)...]
where each coordinate pair represents one airport.
Desired Output
A coordinate pair (a,b)
from the set of airport coordinate pairs representing the closest airport to the point (x,y)
.
Inefficient Solution
Here is my inefficient attempt at solving this problem. It is clearly linear in the length of the set of airports.
shortest_distance = None
shortest_distance_coordinates = None
point = (50.776435, -0.146834)
for airport in airports:
distance = compute_distance(point, airport)
if distance < shortest_distance or shortest_distance is None:
shortest_distance = distance
shortest_distance_coordinates = airport
The Question
How can this solution be improved? This might involve some way of pre-filtering the list of airports based on the coordinates of the location we are currently stood at, or sorting them in a certain order beforehand.
回答1:
If your coordinates are unsorted, your search can only be improved slightly assuming it is (latitude,longitude)
by filtering on latitude first as for earth
1 degree of latitude on the sphere is 111.2 km or 69 miles
but that would not give a huge speedup.
If you sort the airports by latitude first then you can use a binary search for finding the first airport that could match (airport_lat >= point_lat-tolerance
) and then only compare up to the last one that could match (airport_lat <= point_lat+tolerance
) - but take care of 0 degrees equaling 360. While you cannot use that library directly, the sources of bisect are a good start for implementing a binary search.
While technically this way the search is still O(n), you have much fewer actual distance calculations (depending on tolerance) and few latitude comparisons. So you will have a huge speedup.
回答2:
Using a k-dimensional tree:
>>> from scipy import spatial
>>> airports = [(10,10),(20,20),(30,30),(40,40)]
>>> tree = spatial.KDTree(airports)
>>> tree.query([(21,21)])
(array([ 1.41421356]), array([1]))
Where 1.41421356 is the distance between the queried point and the nearest neighbour and 1 is the index of the neighbour.
See: http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.query.html#scipy.spatial.KDTree.query
回答3:
From this SO question:
import numpy as np
def closest_node(node, nodes):
nodes = np.asarray(nodes)
deltas = nodes - node
dist_2 = np.einsum('ij,ij->i', deltas, deltas)
return np.argmin(dist_2)
where node
is a tuple with two values (x, y) and nodes
is an array of tuples with two values ([(x_1, y_1), (x_2, y_2),]
)
来源:https://stackoverflow.com/questions/39107896/efficiently-finding-the-closest-coordinate-pair-from-a-set-in-python