increasing performance on a SELECT query with a large 3D point data set

予麋鹿 2021-02-09 16:28

I have a large dataset (around 1.9 million rows) of 3D points that I'm selecting from. The statement I use most often is similar to:

SELECT * FROM points
WHERE ...

2 Answers
  •  忘了有多久 2021-02-09 17:21

    I don't have MySQL to test with, but I'm curious how efficient its INTERSECT is:

         select points.*
         from points
         join
         (
             select id from points where x > 100 AND x < 200
             intersect
             select id from points where y > 100 AND y < 200
             intersect
             select id from points where z > 100 AND z < 200
         ) as keyset
         on points.id = keyset.id
    

    Not necessarily recommending this -- but it's something to try, especially if you have separate indexes on x, y, and z.
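
    In case it helps, a minimal sketch of the separate single-column indexes this assumes (table and column names taken from the query above; the index names here are just placeholders):

         create index idx_points_x on points (x);
         create index idx_points_y on points (y);
         create index idx_points_z on points (z);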

    EDIT: Since MySQL doesn't support INTERSECT, the query above could be rewritten using JOINs of inline views. Each view would contain a keyset, and each view would have the advantage of the separate indexes you have placed on x, y, and z. The performance would depend on the number of keys returned and on the intersect/join algorithm.
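
    For example, a rough, untested sketch of that rewrite (same hypothetical ranges as above), with one inline view per coordinate so each can use its own index:

         select p.*
         from points p
         join (select id from points where x > 100 AND x < 200) kx on p.id = kx.id
         join (select id from points where y > 100 AND y < 200) ky on p.id = ky.id
         join (select id from points where z > 100 AND z < 200) kz on p.id = kz.id

    Whether MySQL actually uses each index inside its derived table depends on the version and on how the derived tables are materialized, so it's worth checking the plan with EXPLAIN.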

    I first tested the intersect approach (in SQLite) to see if there were ways to improve performance in spatial queries short of using its R-Tree module. INTERSECT was actually slower than using a single non-composite index on one of the spatial values and then scanning the subset of the base table to get the other spatial values. But the results can vary depending on the size of the database. After the table has reached gargantuan size and disk I/O becomes more important as a performance factor, it may be more efficient to intersect discrete keysets, each of which has been instantiated from an index, than to do a scan of the base table subsequent to an initial fetch-from-index.
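
    For reference, the single-index-plus-scan baseline described above is just the plain bounding-box query, e.g.:

         -- the optimizer can pick one index (say, on x) to narrow the rows,
         -- then scan that subset to check y and z
         select *
         from points
         where x > 100 AND x < 200
           AND y > 100 AND y < 200
           AND z > 100 AND z < 200;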
