How can I extend this SQL query to find the k nearest neighbors?


I have a database full of two-dimensional data - points on a map. Each record has a field of the geometry type. What I need to be able to do is pass a point to a stored procedur

2 Answers
  • 2021-02-08 21:32

    What happens if you remove TOP (1) WITH TIES from the inner query, and set the outer query to return the top k rows?

    I'd also be interested to know whether this amendment helps at all. It ought to be more efficient than using TOP:

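    -- @x (the query point) is not declared here; it is assumed to be a
    -- parameter of the enclosing stored procedure.
    DECLARE @start FLOAT = 1000   -- base distance for the search bands
            ,@k INT = 20          -- number of neighbors to return
            ,@p FLOAT = 2;        -- growth factor between successive bands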
    
    WITH NearestPoints AS
    (
         SELECT *
                ,T.g.STDistance(@x) AS dist
                ,ROW_NUMBER() OVER (ORDER BY T.g.STDistance(@x)) AS rn
         FROM Numbers 
         JOIN T WITH(INDEX(spatial_index)) 
         ON   T.g.STDistance(@x) <  @start*POWER(@p,Numbers.n)
         AND (Numbers.n - 1 = 0 
              OR T.g.STDistance(@x) >= @start*POWER(@p,Numbers.n - 1)
             )
    )
    SELECT * 
    FROM NearestPoints
    WHERE rn <= @k;
    

    NB - untested - I don't have access to SQL 2008 here.
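
    The query above uses @x without declaring it. As a minimal sketch for trying it out, a test point could be declared as shown here. The coordinates and the SRID of 0 are illustrative assumptions, not values from the question. The final statement shows what STDistance returns for two planar geometry points.

    ```sql
    -- Hypothetical query point; the coordinates and SRID 0 are assumptions.
    DECLARE @x GEOMETRY = geometry::STGeomFromText('POINT (0 0)', 0);
    DECLARE @other GEOMETRY = geometry::STGeomFromText('POINT (3 4)', 0);

    -- For the planar geometry type, STDistance is the Euclidean distance
    -- expressed in the units of the coordinate system.
    SELECT @x.STDistance(@other) AS dist;  -- 5
    ```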

  • 2021-02-08 21:36

    Quoted from Inside Microsoft® SQL Server® 2008: T-SQL Programming. Section 14.8.4.

    The following query will return the 10 points of interest nearest to @input:

    DECLARE @input GEOGRAPHY = 'POINT (-147 61)';
    DECLARE @start FLOAT = 1000;
    WITH NearestNeighbor AS(
      SELECT TOP 10 WITH TIES
        *, b.GEOG.STDistance(@input) AS dist
      FROM Nums n JOIN GeoNames b WITH(INDEX(geog_hhhh_16_sidx)) -- index hint
      ON b.GEOG.STDistance(@input) < @start*POWER(CAST(2 AS FLOAT),n.n)
      AND b.GEOG.STDistance(@input) >=
        CASE WHEN n = 1 THEN 0 ELSE @start*POWER(CAST(2 AS FLOAT),n.n-1) END
      WHERE n <= 20
      ORDER BY n
    )
    SELECT TOP 10 geonameid, name, feature_code, admin1_code, dist
    FROM NearestNeighbor
    ORDER BY n, dist;
    

    Note: Only part of this query’s WHERE clause is supported by the spatial index. However, the query optimizer correctly evaluates the supported part (the "<" comparison) using the index. This restricts the number of rows for which the ">=" part must be tested, and the query performs well. Changing the value of @start can sometimes speed up the query if it is slower than desired.
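
    To see which distance band each value of n covers, the bounds used in the join predicate can be listed on their own. This is only a sketch: the five values of n are generated inline so that it does not depend on dbo.Nums, and the CTE name Bands is made up for the illustration.

    ```sql
    DECLARE @start FLOAT = 1000;

    -- Inline n = 1..5; the real query reads n from the Nums table.
    WITH Bands AS (
      SELECT n
      FROM (VALUES (1), (2), (3), (4), (5)) AS v(n)
    )
    SELECT n,
           -- Inclusive lower bound: the ">=" side of the join predicate.
           CASE WHEN n = 1 THEN 0
                ELSE @start * POWER(CAST(2 AS FLOAT), n - 1) END AS lower_bound,
           -- Exclusive upper bound: the "<" side, which the spatial index supports.
           @start * POWER(CAST(2 AS FLOAT), n) AS upper_bound
    FROM Bands
    ORDER BY n;
    ```

    With @start = 1000 the first band covers distances from 0 up to 2000, the second from 2000 up to 4000, and each later band is twice as wide as the one before it. Because the inner query orders by n, the nearest bands are examined first.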

    Listing 2-1. Creating and Populating Auxiliary Table of Numbers

    SET NOCOUNT ON;
    USE InsideTSQL2008;
    
    IF OBJECT_ID('dbo.Nums', 'U') IS NOT NULL DROP TABLE dbo.Nums;
    
    CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);
    DECLARE @max AS INT, @rc AS INT;
    SET @max = 1000000;
    SET @rc = 1;
    
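    -- Seed the table with one row, then double the row count on each pass.
    INSERT INTO dbo.Nums VALUES(1);
    WHILE @rc * 2 <= @max
    BEGIN
      INSERT INTO dbo.Nums SELECT n + @rc FROM dbo.Nums;
      SET @rc = @rc * 2;
    END
    
    -- Top up with the remaining rows so the table ends at exactly @max.
    INSERT INTO dbo.Nums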
      SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max;
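
    A quick sanity check after running the listing, as a sketch that assumes @max was left at 1000000:

    ```sql
    -- dbo.Nums should hold the consecutive integers 1 through 1000000.
    SELECT COUNT(*) AS row_count,  -- expected 1000000
           MIN(n)   AS min_n,      -- expected 1
           MAX(n)   AS max_n       -- expected 1000000
    FROM dbo.Nums;
    ```

    The nearest-neighbor query only reads rows with n <= 20, so a much smaller numbers table would also be enough for it.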
    