Increasing performance on a SELECT query with a large 3D point data set

予麋鹿 2021-02-09 16:28

I have a large dataset (around 1.9 million rows) of 3D points that I'm selecting from. The statement I use most often is similar to:

SELECT * FROM points 
WHERE         


        
2 Answers
  • 2021-02-09 17:07

    B-Tree indexes won't help much for such a query.

    What you need is an R-Tree index and a minimal bounding parallelepiped query over it.

    Unfortunately, MySQL does not support R-Tree indexes over 3D points, only 2D. However, you may create an index over, say, X and Y together, which will be more selective than any of the B-Tree indexes on X and Y alone:

    ALTER TABLE points ADD xy POINT;
    
    UPDATE  points
    SET     xy = Point(x, y);
    
    ALTER TABLE points MODIFY xy POINT NOT NULL;
    
    
    CREATE SPATIAL INDEX sx_points_xy ON points (xy);
    
    SELECT  *
    FROM    points
    WHERE   MBRContains(LineString(Point(100, 100), Point(200, 200)), xy)
            AND z BETWEEN 100 AND 200
            AND otherParameter > 10;
    

    In MySQL versions before 5.7, this is only possible if your table is MyISAM; InnoDB supports SPATIAL indexes from 5.7 onward.
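
    To check whether the optimizer actually picks the spatial index, the same query can be prefixed with EXPLAIN. This is a minimal sketch that reuses the table, column, and index names from the statements above; the bounds are the same illustrative values.

```sql
-- Ask MySQL for the execution plan instead of running the query.
-- If the "key" column of the plan shows sx_points_xy, the spatial
-- index is being used to narrow the rows by (x, y) before z and
-- otherParameter are checked.
EXPLAIN
SELECT  *
FROM    points
WHERE   MBRContains(LineString(Point(100, 100), Point(200, 200)), xy)
        AND z BETWEEN 100 AND 200
        AND otherParameter > 10;
```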

  • 2021-02-09 17:21

    I don't have MySQL to test, but I'm curious how efficient its INTERSECT is:

         select points.*
         from points 
         join 
         ( 
         select id from points where   x > 100 AND x < 200 
         intersect
         select id from points where   y > 100 AND y < 200 
         intersect
         select id from points where   z > 100 AND z < 200 
         ) as keyset
         on points.id = keyset.id
    

    Not necessarily recommending this -- but it's something to try, especially if you have separate indexes on x, y, and z.

    EDIT: Since MySQL doesn't support INTERSECT, the query above could be rewritten using JOINs of inline views. Each view would contain a keyset, and each view would have the advantage of the separate indexes you have placed on x, y, and z. The performance would depend on the number of keys returned and on the intersect/join algorithm.
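
    A minimal sketch of that rewrite, assuming id is a unique key on points and that x, y, and z each have their own index. The aliases p, kx, ky, and kz are only illustrative:

```sql
-- Emulate the three-way INTERSECT with joins of inline views.
-- Each inline view yields the ids that pass one range filter, so a row
-- survives all three joins only if it passes all three filters.
-- Because id is assumed unique, the joins cannot multiply rows.
SELECT p.*
FROM   points AS p
JOIN   (SELECT id FROM points WHERE x > 100 AND x < 200) AS kx
       ON kx.id = p.id
JOIN   (SELECT id FROM points WHERE y > 100 AND y < 200) AS ky
       ON ky.id = p.id
JOIN   (SELECT id FROM points WHERE z > 100 AND z < 200) AS kz
       ON kz.id = p.id;
```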

    I first tested the intersect approach (in SQLite) to see if there were ways to improve performance in spatial queries short of using their R-Tree module. INTERSECT was actually slower than using a single non-composite index on one of the spatial values and then scanning the subset of the base table to get the other spatial values. But the results can vary depending on the size of the database. After the table has reached gargantuan size and disk I/O becomes more important as a performance factor, it may be more efficient to intersect discrete keysets, each of which has been instantiated from an index, than to do a scan of the base table subsequent to an initial fetch-from-index.
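
    The single-index alternative can be sketched as follows. The index name ix_points_x is hypothetical, and the bounds are the same illustrative values as in the query above:

```sql
-- Hypothetical index name; a plain (non-composite) index on one coordinate.
CREATE INDEX ix_points_x ON points (x);

-- The engine can use ix_points_x to narrow the rows by x, then check
-- y and z against the rows it fetches from the base table.
SELECT *
FROM   points
WHERE  x > 100 AND x < 200
  AND  y > 100 AND y < 200
  AND  z > 100 AND z < 200;
```

    Which coordinate to index depends on the data; the one whose range filter leaves the fewest rows is usually the better candidate.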
