SQL Query For Total Points Within Radius of a Location

一整个雨季 2021-02-11 07:43

I have a database table of all zipcodes in the US that includes city, state, latitude, and longitude for each zipcode. I also have a database table of points that each have a lat

3 Answers
  • 2021-02-11 08:04

    When I do this type of search, my needs allow some approximation. So I use the formula you have in your second query to first calculate the "bounds" -- the four lat/long values at the extremes of the allowed radius -- and then run a simple query to find the matches within them (less than the max lat and long, more than the min lat and long). What I end up with is everything within the square that encloses the circle defined by the radius, so points near the corners can be up to about 1.4 times the radius away.
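
    The bounds calculation described above might look like the following minimal sketch. The function name, the 69 miles per degree of latitude, and the cosine scaling for longitude are assumptions made for illustration, not taken from the question's second query, and the sketch ignores the poles and the antimeridian.

```python
import math

MILES_PER_DEG_LAT = 69.0  # assumed approximation of one degree of latitude


def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) of a box enclosing the circle."""
    dlat = radius_miles / MILES_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = radius_miles / (MILES_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def in_box(lat, lon, box):
    """Cheap range test; this is the part a SQL WHERE clause would perform."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat < lat < max_lat and min_lon < lon < max_lon


box = bounding_box(40.0, -90.0, 100.0)
print(in_box(40.5, -90.5, box))  # a point well inside the box
```

    The four returned values would then be passed as parameters to a plain range query on the latitude and longitude columns.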

  • 2021-02-11 08:15

    MySQL Guru or not, the problem is that unless you find a way of filtering out various rows, the distance needs to be calculated between each point and each city...

    There are two general approaches that may help the situation:

    • make the distance formula simpler
    • filter out unlikely candidates for the 100-mile radius around a given city

    Before going into these two avenues of improvement, you should decide on the level of precision desired with regard to this 100-mile distance. You should also indicate which geographic area is covered by the database (is this just the continental USA, etc.).

    The reason for this is that the Great Circle formula, while more precise numerically, is very computationally expensive. Another avenue of performance improvement would be to store "grid coordinates" of sorts in addition to (or instead of) the lat/long coordinates.

    Edit:
    A few ideas about a simpler (but less precise) formula:
    Since we're dealing with relatively small distances (and, I'm guessing, latitudes between 30 and 48 degrees north), we can use the Euclidean distance (or better yet, the square of the Euclidean distance) rather than the more complicated spherical trigonometry formulas.
    Depending on the level of precision expected, it may even be acceptable to use one single parameter for the linear distance of a full degree of longitude, taking an average over the area considered (say circa 46 statute miles). The formula would then become:

      LatDegInMi = 69.0
      LongDegInMi = 46.0
      DistSquared = ((Lat1 - Lat2) * LatDegInMi) ^2 + ((Long1 - Long2) * LongDegInMi) ^2
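
    To get a feel for how much precision this formula gives up, here is a minimal sketch that compares it with a great-circle (haversine) distance. The function names, the Earth radius of 3958.8 miles, and the sample coordinates are assumptions for illustration only. With a fixed 46 miles per degree of longitude, the two agree closely near 48 degrees north and drift apart further south.

```python
import math

LAT_DEG_IN_MI = 69.0
LONG_DEG_IN_MI = 46.0      # single averaged value, as in the formula above
EARTH_RADIUS_MI = 3958.8   # assumed mean Earth radius in statute miles


def flat_dist_squared(lat1, lon1, lat2, lon2):
    """Squared distance in miles, treating lat/long as a flat grid."""
    dy = (lat1 - lat2) * LAT_DEG_IN_MI
    dx = (lon1 - lon2) * LONG_DEG_IN_MI
    return dx * dx + dy * dy


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, used here only as a reference."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


# One degree of longitude apart, at two different latitudes.
for lat in (48.0, 30.0):
    flat = math.sqrt(flat_dist_squared(lat, -100.0, lat, -99.0))
    exact = haversine_miles(lat, -100.0, lat, -99.0)
    print(lat, round(flat, 1), round(exact, 1))
```

    Comparing the squared value against the squared radius avoids the square root entirely, which is why the query only needs DistSquared.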
    

    On the idea of columns with grid info to limit the number of rows considered for the distance calculation:
    Each "point" in the system, be it a city or another point (delivery locations, store locations... whatever), is assigned two integer coordinates which define the square of, say, 25 miles * 25 miles where the point lies. The coordinates of any point within 100 miles of the reference point (a given city) will be at most +/- 4 in the x direction and +/- 4 in the y direction. We can then write a query similar to the following:

    -- 69.0 is LatDegInMi and 46.0 is LongDegInMi (miles per degree).
    -- POW() is used because ^ is bitwise XOR in MySQL, not exponentiation.
    SELECT Z.city, Z.state, Z.latitude, Z.longitude, COUNT(*) AS PointCount
    FROM zipcodes Z
    JOIN points P
      ON  P.GridX BETWEEN Z.GridX - 4 AND Z.GridX + 4
      AND P.GridY BETWEEN Z.GridY - 4 AND Z.GridY + 4
    WHERE P.Status = 'A'
       AND POW((Z.latitude - P.latitude) * 69.0, 2)
         + POW((Z.longitude - P.longitude) * 46.0, 2) < POW(100, 2)
    GROUP BY Z.city, Z.state, Z.latitude, Z.longitude;
    

    Note that LongDegInMi could either be hardcoded (the same for all locations within the continental USA) or come from the corresponding record in the zipcodes table. Similarly, LatDegInMi could be hardcoded (there is little need to make it vary since, unlike the other, it is relatively constant).
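
    If the per-record option is chosen, the value stored with each zipcode could be derived from its latitude. The sketch below assumes the usual cosine scaling and a hypothetical helper name; it is an illustration, not something taken from the original tables.

```python
import math

LAT_DEG_IN_MI = 69.0  # roughly constant everywhere


def long_deg_in_mi(latitude):
    """Approximate miles per degree of longitude at the given latitude."""
    return LAT_DEG_IN_MI * math.cos(math.radians(latitude))


# Southern edge, middle, and northern edge of the assumed 30-48 degree band.
for lat in (30, 39, 48):
    print(lat, round(long_deg_in_mi(lat), 1))
```

    Under this approximation the hardcoded 46 corresponds to the northern edge of the band, so a per-record value may be worth the extra column if the data reaches far south.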

    The reason why this is faster is that for most records in the Cartesian product of the zipcodes table and the points table, we do not calculate the distance at all. We eliminate them on the basis of an index value (GridX and GridY).
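
    How GridX and GridY get assigned is left open above. One possible scheme, sketched here under the assumption that the same flat approximation (69 and 46 miles per degree) is used and that cells are 25 miles on a side, is to floor the scaled coordinates. The function names are hypothetical.

```python
import math

LAT_DEG_IN_MI = 69.0
LONG_DEG_IN_MI = 46.0
CELL_MILES = 25.0  # side of one grid square


def grid_cell(lat, lon):
    """Return integer (GridX, GridY) of the 25-mile square containing the point."""
    grid_x = math.floor(lon * LONG_DEG_IN_MI / CELL_MILES)
    grid_y = math.floor(lat * LAT_DEG_IN_MI / CELL_MILES)
    return grid_x, grid_y


def is_candidate(city_cell, point_cell, reach=4):
    """True if the point's cell is within +/- reach cells of the city's cell."""
    return (abs(city_cell[0] - point_cell[0]) <= reach
            and abs(city_cell[1] - point_cell[1]) <= reach)


city = grid_cell(40.0, -90.0)
near = grid_cell(41.0, -90.0)   # about 69 miles north
far = grid_cell(45.0, -90.0)    # about 345 miles north
print(is_candidate(city, near), is_candidate(city, far))
```

    Because both values are plain integers computed once per row, they can be stored and indexed, and the expensive distance test only runs on rows that pass this filter.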

    This brings us to the question of which SQL indexes to produce. For sure, we may want:

    • GridX + GridY + Status (on the points table)
    • GridY + GridX + Status (possibly)
    • City + State + latitude + longitude + GridX + GridY (on the zipcodes table)

    An alternative to the grids is to "bound" the limits of latitude and longitude which we'll consider, based on the latitude and longitude of a given city, i.e. the JOIN condition becomes a range rather than an IN:

    JOIN points P 
      ON    P.latitude > (Z.Latitude - (100 / LatDegInMi)) 
        AND P.latitude < (Z.Latitude + (100 / LatDegInMi)) 
        AND P.longitude > (Z.longitude - (100 / LongDegInMi)) 
        AND P.longitude < (Z.longitude + (100 / LongDegInMi)) 
    
  • 2021-02-11 08:26
    SELECT * FROM tblLocation 
        WHERE 2 > POWER(POWER(Latitude - 40, 2) + POWER(Longitude - -90, 2), .5)
    

    where 2 is the search radius in degrees (the number of parallels away) and 40 and -90 are the lat/lon of the test point.

    Sorry I didn't use your table names or structures; I just copied this out of one of the stored procedures in one of my databases.

    If I wanted to see the number of points in a zip code, I suppose I would do something like this:

    SELECT 
        ParcelZip, COUNT(LocationID) AS LocCount 
    FROM 
        tblLocation 
    WHERE 
        2 > POWER(POWER(Latitude - 40, 2) + POWER(Longitude - -90, 2), .5)
    GROUP BY 
        ParcelZip
    

    Getting the total count of all locations in the range would look like this:

    SELECT 
        COUNT(LocationID) AS LocCount 
    FROM 
        tblLocation 
    WHERE 
        2 > POWER(POWER(Latitude - 40, 2) + POWER(Longitude - -90, 2), .5)
    

    A cross join may be inefficient here since we are talking about a large quantity of records, but this should do the job in a single query:

    SELECT 
        ZipCodes.ZipCode, COUNT(PointID) AS LocCount 
    FROM
        Points
    CROSS JOIN 
        ZipCodes
    WHERE 
        2 > POWER(POWER(Points.Latitude - ZipCodes.Latitude, 2) + POWER(Points.Longitude - ZipCodes.Longitude, 2), .5)
    GROUP BY 
        ZipCodes.ZipCode
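
    To try the cross join on a small scale, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The zip labels and point coordinates are made-up sample data, and the square root is swapped for a comparison of squared values so the query does not depend on POWER being available in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ZipCodes (ZipCode TEXT PRIMARY KEY, Latitude REAL, Longitude REAL);
CREATE TABLE Points (PointID INTEGER PRIMARY KEY, Latitude REAL, Longitude REAL);
-- Hypothetical sample rows, not real zipcodes.
INSERT INTO ZipCodes VALUES ('ZIP_A', 40.0, -90.0), ('ZIP_B', 40.75, -74.0);
INSERT INTO Points VALUES
    (1, 40.5, -90.5), (2, 41.0, -89.0), (3, 45.0, -90.0), (4, 40.7, -74.1);
""")

RADIUS_DEG = 2.0  # same "2 degrees" radius as in the queries above

rows = conn.execute(
    """
    SELECT z.ZipCode, COUNT(p.PointID) AS LocCount
    FROM Points p
    CROSS JOIN ZipCodes z
    WHERE (p.Latitude - z.Latitude) * (p.Latitude - z.Latitude)
        + (p.Longitude - z.Longitude) * (p.Longitude - z.Longitude) < ? * ?
    GROUP BY z.ZipCode
    ORDER BY z.ZipCode
    """,
    (RADIUS_DEG, RADIUS_DEG),
).fetchall()

counts = dict(rows)  # maps each zip label to its number of nearby points
print(counts)
```

    Note that a zip with no points in range does not appear in the result at all; a LEFT JOIN from ZipCodes would be one way to get zero counts if those are needed.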
    