OK - I've been wrestling with this for about 3 months on and off and since I've exhausted every geo proximity formula out there that I've come across and I'm no closer to ge
Here's a solution I used successfully for a while in my own geo proximity calculations:
/**
* This portion of the routine calculates the minimum and maximum lat and
* long within a given range. This portion of the code was written
* by Jeff Bearer (http://www.jeffbearer.com).
*/
$lat = somevalue; // The latitude of our search origin, in degrees
$lon = someothervalue; // The longitude of our search origin, in degrees
$range = 50; // The search radius, in miles, around your zip
// Find Max - Min Lat / Long for Radius and zero point and query only zips in that range.
$lat_range = $range / 69.172;
// A degree of longitude shrinks with latitude, and cos() expects radians.
$lon_range = abs($range / (cos(deg2rad($lat)) * 69.172));
$min_lat = number_format($lat - $lat_range, 4, '.', '');
$max_lat = number_format($lat + $lat_range, 4, '.', '');
$min_lon = number_format($lon - $lon_range, 4, '.', '');
$max_lon = number_format($lon + $lon_range, 4, '.', '');
/* Query for matching zips:
SELECT post_id, lat, lng
FROM wp_geodatastore
WHERE
lat BETWEEN $min_lat AND $max_lat
AND lng BETWEEN $min_lon AND $max_lon
*/
Thinking a little laterally I've come up with a 'sort of' solution to the problem of the missing markers. The two equations I posted originally gave the correct results, but each missed either the markers close to the target or those on the edges of the search radius.
It's not very elegant but I figured that running both equations and producing 2 arrays which I then combined (removing any duplicates) would give me all the markers I'm looking for. This does work (obviously a performance hit but it's not a high traffic application) so I'll work with this for the time being but I'm still after a more practical solution if anyone has one!
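For anyone wanting to try the same workaround, here is a minimal sketch of the combine-and-deduplicate step. It assumes each query returns rows as associative arrays with a post_id key (as in the wp_geodatastore queries in this thread); the helper name merge_markers and the sample rows are made up for illustration.

```php
<?php
// Hypothetical helper: merge two result sets and keep one row per post_id.
function merge_markers(array $first, array $second): array
{
    $merged = [];
    foreach (array_merge($first, $second) as $row) {
        // Keying by post_id means a duplicate simply overwrites the earlier copy.
        $merged[$row['post_id']] = $row;
    }
    return array_values($merged);
}

// Made-up sample rows standing in for the results of the two equations.
$near = [
    ['post_id' => 1, 'lat' => 51.50, 'lng' => -0.12],
    ['post_id' => 2, 'lat' => 51.60, 'lng' => -0.20],
];
$edge = [
    ['post_id' => 2, 'lat' => 51.60, 'lng' => -0.20],
    ['post_id' => 3, 'lat' => 52.10, 'lng' => -0.90],
];

$all = merge_markers($near, $edge);
echo count($all), " unique markers\n";
```

Keying on post_id avoids comparing whole rows, which is what array_unique would otherwise have to do on nested arrays.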
EDIT This location-finder comes up often enough that I've written an article on it.
http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/
Original Post
Let's start by dealing with the haversine formula once and for all, by putting it into a stored function so we can forget about its gnarly details. NOTE: This whole solution is in statute miles.
DELIMITER $$
CREATE
FUNCTION distance(lat1 FLOAT, long1 FLOAT, lat2 FLOAT, long2 FLOAT)
RETURNS FLOAT
DETERMINISTIC NO SQL
BEGIN
RETURN (3959 * ACOS(COS(RADIANS(lat1))
* COS(RADIANS(lat2))
* COS(RADIANS(long1) - RADIANS(long2))
+ SIN(RADIANS(lat1))
* SIN(RADIANS(lat2))
));
END$$
DELIMITER ;
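If you want to spot-check the stored function's results outside the database, the same expression can be mirrored in PHP. This is only a sketch: the name distance_miles is hypothetical, and the clamp to the range -1..1 is my addition, because floating-point rounding can push the cosine slightly past 1 for near-identical points, where PHP's acos() returns NAN.

```php
<?php
// Hypothetical PHP mirror of the SQL distance() function above (statute miles).
function distance_miles(float $lat1, float $lng1, float $lat2, float $lng2): float
{
    $cosine = cos(deg2rad($lat1)) * cos(deg2rad($lat2))
            * cos(deg2rad($lng1) - deg2rad($lng2))
            + sin(deg2rad($lat1)) * sin(deg2rad($lat2));

    // Assumption/addition: clamp so rounding error cannot take acos() out of its domain.
    $cosine = max(-1.0, min(1.0, $cosine));

    return 3959 * acos($cosine);
}

// One degree of latitude along a meridian should come out at roughly 69 miles.
printf("%.2f miles\n", distance_miles(0.0, 0.0, 1.0, 0.0));
```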
Now let's put together a query that searches on the bounding box, and then refines the search with our distance function and orders by distance.
Based on the PHP code in your question:
Assume $radius is your radius and $center_lat, $center_lng is your reference point.
$sqlsquareradius = "
SELECT post_id, lat, lng
FROM
(
SELECT post_id, lat, lng,
distance(lat, lng, " . $center_lat . "," . $center_lng . ") AS distance
FROM wp_geodatastore
WHERE lat >= " . $center_lat . " -(" . $radius . "/69)
AND lat <= " . $center_lat . " +(" . $radius . "/69)
AND lng >= " . $center_lng . " -(" . $radius . "/(69 * COS(RADIANS(" . $center_lat . "))))
AND lng <= " . $center_lng . " +(" . $radius . "/(69 * COS(RADIANS(" . $center_lat . "))))
)a
WHERE distance <= " . $radius . "
ORDER BY distance
";
Notice a few things about this.
First, it does the bounding box computation in SQL rather than in PHP. There's no good reason for that, except keeping all the computation in one environment. (radius / 69) is the number of degrees of latitude in radius statute miles.
Second, it widens the longitudinal bounding box based on latitude. A degree of longitude spans only about 69 * cos(latitude) miles, so using (radius / 69) for longitude as well would make the box too narrow away from the equator and drop valid results near its east and west edges. The box still catches a few extra records in its corners, but the distance measurement gets rid of them. For your typical postcode / store finder app the cost of those extra records is negligible. If you were searching many more records (e.g. a database of all utility poles) it might not be so trivial.
Third, it uses a nested query to do the distance elimination, to avoid having to run the distance function more than once for each item.
Fourth, it orders by distance ASCENDING. This means your zero-distance results should show up first in the result set. It usually makes sense to list nearest things first.
Fifth, it uses FLOAT rather than DOUBLE throughout. There's a good reason for that. The haversine distance formula is not perfect, because it makes the approximation that the earth is a perfect sphere. That approximation happens to break down at roughly the same level of accuracy as the epsilon for FLOAT numbers. So DOUBLE is deceptive numerical overkill for this problem. (Don't use this haversine formula to do civil engineering work like parking lot drainage, or you will get big puddles a couple of epsilon, a few inches, deep, I promise.) It's fine for store-finder applications.
Sixth, you are definitely going to want to create an index for your lat column. If your table of locations doesn't change very often, it will help to create an index for your lng column as well. But your lat index will give you most of your query performance gain.
Lastly, I tested the stored function and the SQL, but not the PHP.
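To make the bounding-box arithmetic from the first two points concrete, here is a small PHP sketch. The function name bounding_box and the sample coordinates are hypothetical. It uses 69 miles per degree of latitude, and it applies the latitude-dependent correction for longitude, on the assumption that a degree of longitude spans roughly 69 * cos(latitude) miles.

```php
<?php
// Hypothetical helper: degree bounds of a box that contains a circle of
// $radius statute miles around ($lat, $lng).
function bounding_box(float $lat, float $lng, float $radius): array
{
    $miles_per_degree = 69.0;

    // Latitude: a degree is roughly the same length everywhere.
    $lat_delta = $radius / $miles_per_degree;

    // Longitude: a degree shrinks by cos(latitude) away from the equator.
    $lng_delta = $radius / ($miles_per_degree * cos(deg2rad($lat)));

    return [
        'min_lat' => $lat - $lat_delta,
        'max_lat' => $lat + $lat_delta,
        'min_lng' => $lng - $lng_delta,
        'max_lng' => $lng + $lng_delta,
    ];
}

// Made-up search origin: 50 miles around 40 N, 75 W.
$box = bounding_box(40.0, -75.0, 50.0);
printf("lat %.4f .. %.4f, lng %.4f .. %.4f\n",
    $box['min_lat'], $box['max_lat'], $box['min_lng'], $box['max_lng']);
```

At 40 degrees north the longitude half-width comes out noticeably larger than the latitude half-width, which is the effect the cosine term accounts for.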
Reference: http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL Also my experience with a bunch of proximity finders for health care facilities.
--------------- EDIT --------------------
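If you don't have a user interface that lets you define a stored function, that's a nuisance. At any rate, PHP lets you use numbered parameters in the sprintf call, so you can generate the whole nested statement like this. NOTE: The placeholder form is %1$s (argument number, then $, then the type), and the format string needs single quotes so PHP doesn't try to interpolate $s as a variable. You could use %1$F instead to force each argument to be formatted as a float. The longitude bounds are widened by 1 / COS(RADIANS(latitude)), since a degree of longitude covers less than 69 miles away from the equator.
$sql_stmt = sprintf ('
SELECT post_id, lat, lng
FROM
(
SELECT post_id, lat, lng,
(3959 * ACOS(COS(RADIANS(lat))
* COS(RADIANS(%1$s))
* COS(RADIANS(lng) - RADIANS(%2$s))
+ SIN(RADIANS(lat))
* SIN(RADIANS(%1$s))
))
AS distance
FROM wp_geodatastore
WHERE lat >= %1$s -(%3$s/69)
AND lat <= %1$s +(%3$s/69)
AND lng >= %2$s -(%3$s/(69 * COS(RADIANS(%1$s))))
AND lng <= %2$s +(%3$s/(69 * COS(RADIANS(%1$s))))
)a
WHERE distance <= %3$s
ORDER BY distance
',$center_lat,$center_lng, $radius);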
This is code from a working production system:
6371.04 * acos(cos(pi()/2-radians(90-wgs84_lat)) * cos(pi()/2-radians(90-$lat)) * cos(radians(wgs84_long)-radians($lon)) + sin(pi()/2-radians(90-wgs84_lat)) * sin(pi()/2-radians(90-$lat))) as distance
It uses a different distance formula, but for a store locator the difference is minimal.
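As far as I can tell, the pi()/2 - radians(90 - x) terms in that expression simplify to radians(x), and 6371.04 is the earth's radius in kilometres, so the result is in kilometres rather than miles. A quick PHP check of the simplification, using an arbitrary sample latitude:

```php
<?php
// Arbitrary sample latitude in degrees, chosen only for illustration.
$lat = 37.5;

// The form used in the production snippet above...
$long_form = M_PI / 2 - deg2rad(90 - $lat);

// ...and the plain conversion from degrees to radians.
$short_form = deg2rad($lat);

// The two should agree to within floating-point rounding.
printf("difference: %.2e\n", abs($long_form - $short_form));
```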
You can try my class at http://www.phpclasses.org/package/6202-PHP-Generate-points-of-an-Hilbert-curve.html. It uses the haversine formula and a Hilbert curve to compute a quadkey. You can then search the quadkey from left to right. Every position of the key is a point on the monster curve. A better explanation of the curve can be found at Nick's spatial index quadtree hilbert curve blog. It's like using the spatial index extension from MySQL but you have more control. You can use a z curve or moore curve or you can change the look.
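To give a rough idea of what a Hilbert curve index looks like, here is a sketch of the standard grid-cell-to-curve-position conversion. This is not the API of the linked class; the function name hilbert_index is hypothetical, and it assumes the map has already been divided into an n-by-n grid where n is a power of two.

```php
<?php
// Hypothetical helper: position of grid cell ($x, $y) along a Hilbert curve
// that fills an $n-by-$n grid ($n must be a power of two).
function hilbert_index(int $n, int $x, int $y): int
{
    $d = 0;
    for ($s = intdiv($n, 2); $s > 0; $s = intdiv($s, 2)) {
        // Which quadrant of the current square does the cell fall in?
        $rx = ($x & $s) > 0 ? 1 : 0;
        $ry = ($y & $s) > 0 ? 1 : 0;
        $d += $s * $s * ((3 * $rx) ^ $ry);

        // Rotate or flip the quadrant so the sub-curve has the right orientation.
        if ($ry === 0) {
            if ($rx === 1) {
                $x = $n - 1 - $x;
                $y = $n - 1 - $y;
            }
            $tmp = $x;
            $x = $y;
            $y = $tmp;
        }
    }
    return $d;
}

// Cells that sit next to each other on the curve get consecutive indexes.
echo hilbert_index(4, 1, 2), ' ', hilbert_index(4, 2, 2), "\n";
```

Cells that are close along the curve tend to be close on the map, which is why a range scan over such a key can stand in for a spatial index.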