Join operation in Haversine formula

悲&欢浪女 2021-02-10 18:29

I am implementing the Haversine formula in PHP as follows:

$result=mysqli_query($mysqli,\"SELECT *,( 6371 * acos( cos( radians({$lat}) ) * cos( radians( `latitude         


        
2 answers
  •  栀梦 (original poster)
    2021-02-10 19:05

    Any operation you do here which operates on all rows will be slow with that many records.

    What you need to do is take advantage of indexes. For an index to be used, the condition must compare the column itself against a value, NOT the result of a function applied to the column (as it currently does).

    A radius search draws a circle around a point. With a little trigonometry done before the query runs, we can bound that circle with two squares: S1, the largest square that fits inside the circle, and S2, the smallest square that contains it.

    Now we can work out the dimensions of these two squares. Anything OUTSIDE of S2 is excluded using an index, and anything INSIDE of S1 is accepted using an index, leaving only the small area between the two squares to be checked with the slow method.
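
    The three cases can be sketched in PHP as follows. This is only an illustration: `classifyPoint()` is a hypothetical helper, and it assumes a bounds object with the same field names that `getS1S2()` further down returns. In practice the database performs these range checks through the index.

```php
<?php
// Hypothetical helper, not part of the original answer.
// $bounds is assumed to carry minLat/maxLat/minLong/maxLong (square S2)
// and innerMinLat/innerMaxLat/innerMinLong/innerMaxLong (square S1).
function classifyPoint(float $lat, float $long, object $bounds): string
{
    $inS2 = $lat  >= $bounds->minLat  && $lat  <= $bounds->maxLat
         && $long >= $bounds->minLong && $long <= $bounds->maxLong;
    if (!$inS2) {
        return 'outside'; // rejected by plain range checks
    }

    $inS1 = $lat  >= $bounds->innerMinLat  && $lat  <= $bounds->innerMaxLat
         && $long >= $bounds->innerMinLong && $long <= $bounds->innerMaxLong;
    if ($inS1) {
        return 'inside';  // accepted by plain range checks
    }

    return 'check';       // between S1 and S2: needs the distance formula
}
```

    Only rows that come back as 'check' would need the expensive distance calculation.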

    If you need the distance from the point, ignore the S1 square, because every row inside the circle then needs the Haversine calculation anyway. Note that although every row inside the circle needs it, not every row inside S2 is within the distance, so both the bounding-box condition and the distance condition are still needed.

    So let's calculate these points using the unit circle

    function getS1S2($latitude, $longitude, $kilometer)
    {
        $radiusOfEarthKM  = 6371;
        $latitudeRadians  = deg2rad($latitude);
        $longitudeRadians = deg2rad($longitude);
        $distance         = $kilometer / $radiusOfEarthKM;
    
        $deltaLongitude = asin(sin($distance) / cos($latitudeRadians));
    
        $bounds = new \stdClass();
    
        // these are the outer bounds of the circle (S2)
        $bounds->minLat  = rad2deg($latitudeRadians  - $distance);
        $bounds->maxLat  = rad2deg($latitudeRadians  + $distance);
        $bounds->minLong = rad2deg($longitudeRadians - $deltaLongitude);
        $bounds->maxLong = rad2deg($longitudeRadians + $deltaLongitude);
    
        // and these are the inner bounds (S1)
        $bounds->innerMinLat  = rad2deg($latitudeRadians  + $distance       * cos(5 * M_PI_4));
        $bounds->innerMaxLat  = rad2deg($latitudeRadians  + $distance       * sin(M_PI_4));
        $bounds->innerMinLong = rad2deg($longitudeRadians + $deltaLongitude * sin(5 * M_PI_4));
        $bounds->innerMaxLong = rad2deg($longitudeRadians + $deltaLongitude * cos(M_PI_4));
    
        return $bounds;
    }
    

    Now your query becomes

    SELECT 
      *
    FROM
      `places` p
    WHERE p.latitude BETWEEN {$bounds->minLat} 
      AND {$bounds->maxLat} 
      AND p.longitude BETWEEN {$bounds->minLong} 
      AND {$bounds->maxLong} 
      AND (
        (
          p.latitude BETWEEN {$bounds->innerMinLat} 
          AND {$bounds->innerMaxLat} 
          AND p.longitude BETWEEN {$bounds->innerMinLong} 
          AND {$bounds->innerMaxLong}
        ) 
        OR (
          6371 * ACOS(
            COS(RADIANS({$lat})) * COS(RADIANS(p.`latitude`)) * COS(
              RADIANS(p.`longitude`) - RADIANS({$lon})
            ) + SIN(RADIANS({$lat})) * SIN(RADIANS(p.`latitude`))
          ) <= {$radius}
        )
      )
    

    IMPORTANT

    The values above are interpolated as plain text for readability. Make sure they are escaped correctly or, preferably, passed as parameters.

    This query can take advantage of the index, which lets the join run faster.

    Adding the join, this becomes

    SELECT 
      *
    FROM
      `places` p
      INNER JOIN my_friends f ON f.id = p.id
    WHERE p.latitude BETWEEN {$bounds->minLat} 
      AND {$bounds->maxLat} 
      AND p.longitude BETWEEN {$bounds->minLong} 
      AND {$bounds->maxLong} 
      AND (
        (
          p.latitude BETWEEN {$bounds->innerMinLat} 
          AND {$bounds->innerMaxLat} 
          AND p.longitude BETWEEN {$bounds->innerMinLong} 
          AND {$bounds->innerMaxLong}
        ) 
        OR (
          6371 * ACOS(
            COS(RADIANS({$lat})) * COS(RADIANS(p.`latitude`)) * COS(
              RADIANS(p.`longitude`) - RADIANS({$lon})
            ) + SIN(RADIANS({$lat})) * SIN(RADIANS(p.`latitude`))
          ) <= {$radius}
        )
      )
      AND f.personal_id = {$personal_id}
    

    IMPORTANT

    The values above are interpolated as plain text for readability. Make sure they are escaped correctly or, preferably, passed as parameters.
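
    As a rough sketch of the parameterized form, the join query can be built with `?` placeholders and bound through a mysqli prepared statement. `buildNearbyFriendsQuery()` is a hypothetical helper, the bounds object is assumed to have the fields returned by `getS1S2()`, and `personal_id` is assumed to be an integer column.

```php
<?php
// Sketch only: returns the SQL, the mysqli type string and the values
// to bind, in placeholder order. Table and column names follow the
// join query in the answer.
function buildNearbyFriendsQuery(object $bounds, float $lat, float $lon, float $radius, int $personalId): array
{
    $sql = 'SELECT p.*
            FROM places p
            INNER JOIN my_friends f ON f.id = p.id
            WHERE p.latitude  BETWEEN ? AND ?
              AND p.longitude BETWEEN ? AND ?
              AND (
                    (p.latitude  BETWEEN ? AND ?
                 AND p.longitude BETWEEN ? AND ?)
                 OR 6371 * ACOS(
                        COS(RADIANS(?)) * COS(RADIANS(p.latitude))
                      * COS(RADIANS(p.longitude) - RADIANS(?))
                      + SIN(RADIANS(?)) * SIN(RADIANS(p.latitude))
                    ) <= ?
              )
              AND f.personal_id = ?';

    $params = [
        $bounds->minLat, $bounds->maxLat,             // S2 latitude range
        $bounds->minLong, $bounds->maxLong,           // S2 longitude range
        $bounds->innerMinLat, $bounds->innerMaxLat,   // S1 latitude range
        $bounds->innerMinLong, $bounds->innerMaxLong, // S1 longitude range
        $lat, $lon, $lat, $radius,                    // distance expression
        $personalId,
    ];

    // Twelve doubles followed by one integer (assumed type of personal_id).
    $types = str_repeat('d', 12) . 'i';

    return [$sql, $types, $params];
}

// Usage, assuming an open mysqli connection in $mysqli:
// [$sql, $types, $params] = buildNearbyFriendsQuery($bounds, $lat, $lon, $radius, $personalId);
// $stmt = $mysqli->prepare($sql);
// $stmt->bind_param($types, ...$params);
// $stmt->execute();
// $result = $stmt->get_result();
```

    Keeping the SQL and the bound values in one function makes it easier to verify that the placeholder order and the parameter order match.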

    Assuming you have the correct indexes this query should remain fast and allow you to do the join.

    Looking at the code above, I'm not sure where personal_id comes from, so I have left it as it is.

    If you need the distance from the query, you can remove the S1 square

        (
          p.latitude BETWEEN {$bounds->innerMinLat} 
          AND {$bounds->innerMaxLat} 
          AND p.longitude BETWEEN {$bounds->innerMinLong} 
          AND {$bounds->innerMaxLong}
        ) 
    

    and move the second part of that OR

      6371 * ACOS(
        COS(RADIANS({$lat})) * COS(RADIANS(p.`latitude`)) * COS(
          RADIANS(p.`longitude`) - RADIANS({$lon})
        ) + SIN(RADIANS({$lat})) * SIN(RADIANS(p.`latitude`))
      )
    

    back to the select, which still makes use of S2.
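
    One way that rearrangement might look, again as a sketch with placeholders: the distance expression moves into the SELECT list under the alias `distance`, S2 stays in the WHERE clause so the index can still narrow the rows, and the radius is applied in a HAVING clause (MySQL allows HAVING to reference a select alias). `buildDistanceQuery()` is a hypothetical helper and the join is left out for brevity.

```php
<?php
// Sketch only: returns the SQL, the mysqli type string and the values
// to bind, in placeholder order.
function buildDistanceQuery(object $bounds, float $lat, float $lon, float $radius): array
{
    $sql = 'SELECT p.*,
                   6371 * ACOS(
                       COS(RADIANS(?)) * COS(RADIANS(p.latitude))
                     * COS(RADIANS(p.longitude) - RADIANS(?))
                     + SIN(RADIANS(?)) * SIN(RADIANS(p.latitude))
                   ) AS distance
            FROM places p
            WHERE p.latitude  BETWEEN ? AND ?
              AND p.longitude BETWEEN ? AND ?
            HAVING distance <= ?
            ORDER BY distance ASC';

    $params = [
        $lat, $lon, $lat,                   // distance expression
        $bounds->minLat, $bounds->maxLat,   // S2 latitude range
        $bounds->minLong, $bounds->maxLong, // S2 longitude range
        $radius,
    ];

    return [$sql, str_repeat('d', 8), $params];
}
```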

    I would also make sure to remove the "magic number" from the query: 6371 is the radius of the Earth in kilometers.
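
    On the PHP side, a named constant could replace the literal. The sketch below uses a hypothetical constant `EARTH_RADIUS_KM` and a hypothetical function `distanceKm()` that evaluates the same expression as the query, which is handy for checking individual results in application code.

```php
<?php
// Mean radius of the Earth in kilometers (the 6371 used in the query).
const EARTH_RADIUS_KM = 6371;

// Distance in kilometers between two points given in degrees,
// using the same ACOS-based expression as the SQL query.
function distanceKm(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    $cosAngle = cos(deg2rad($lat1)) * cos(deg2rad($lat2))
              * cos(deg2rad($lon2) - deg2rad($lon1))
              + sin(deg2rad($lat1)) * sin(deg2rad($lat2));

    // Clamp to [-1, 1] so rounding error cannot push acos() out of its domain.
    $cosAngle = max(-1.0, min(1.0, $cosAngle));

    return EARTH_RADIUS_KM * acos($cosAngle);
}
```

    The same constant can then be bound as a query parameter instead of writing 6371 into the SQL text.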
