I have a large dataset (around 1.9 million rows) of 3D points that I'm selecting from. The statement I use most often is similar to:
SELECT * FROM points
WHERE
B-Tree indexes won't help much for such a query.
What you need is an R-Tree index and a minimal bounding parallelepiped query over it.
Unfortunately, MySQL does not support R-Tree indexes over 3D points, only 2D. However, you may create an index over, say, X and Y together, which will be more selective than either of the B-Tree indexes on X and Y alone:
ALTER TABLE points ADD xy POINT;
UPDATE points
SET xy = Point(x, y);
ALTER TABLE points MODIFY xy POINT NOT NULL;
CREATE SPATIAL INDEX sx_points_xy ON points (xy);
SELECT *
FROM points
WHERE MBRContains(LineString(Point(100, 100), Point(200, 200)), xy)
AND z BETWEEN 100 AND 200
AND otherParameter > 10;
In MySQL versions before 5.7, this is only possible if your table is MyISAM; later versions also support spatial indexes on InnoDB tables.
I don't have MySQL to test, but I'm curious how efficient its INTERSECT is:
select points.*
from points
join
(
select id from points where x > 100 AND x < 200
intersect
select id from points where y > 100 AND y < 200
intersect
select id from points where z > 100 AND z < 200
) as keyset
on points.id = keyset.id
Not necessarily recommending this -- but it's something to try, especially if you have separate indexes on x, y, and z.
EDIT: Since MySQL doesn't support INTERSECT (it was only added in version 8.0.31), the query above could be rewritten using JOINs of inline views. Each view would contain a keyset, and each view would take advantage of the separate indexes you have placed on x, y, and z. The performance would depend on the number of keys returned and on the intersect/join algorithm.
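A minimal sketch of that rewrite, assuming `id` is the primary key of `points` (so each keyset holds every id at most once) and that separate indexes exist on `x`, `y`, and `z`. The aliases `p`, `kx`, `ky`, and `kz` are made up for illustration, and this has not been benchmarked:

```sql
-- Each inline view is a keyset that can be built from a single-column index.
-- Inner-joining the keysets on id keeps only ids present in all three,
-- which is what the INTERSECT version expresses.
SELECT p.*
FROM points AS p
JOIN (SELECT id FROM points WHERE x > 100 AND x < 200) AS kx ON kx.id = p.id
JOIN (SELECT id FROM points WHERE y > 100 AND y < 200) AS ky ON ky.id = p.id
JOIN (SELECT id FROM points WHERE z > 100 AND z < 200) AS kz ON kz.id = p.id;
```

Whether the optimizer materializes each inline view or merges it into the outer query depends on the MySQL version, so it is worth checking the plan with EXPLAIN before relying on this.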
I first tested the INTERSECT approach (in SQLite) to see if there were ways to improve performance in spatial queries short of using its R-Tree module. INTERSECT was actually slower than using a single non-composite index on one of the spatial values and then scanning the resulting subset of the base table to check the other spatial values. But the results can vary depending on the size of the database. Once the table has reached gargantuan size and disk I/O becomes a more important performance factor, it may be more efficient to intersect discrete keysets, each of which has been instantiated from an index, than to scan the base table after an initial fetch from an index.
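For comparison, the single-index alternative might look roughly like this in SQLite, assuming the same `points` table with `x`, `y`, and `z` columns. The index name `idx_points_x` is hypothetical, and the choice of `x` as the indexed column is arbitrary:

```sql
-- Hypothetical index on just one of the three coordinates.
CREATE INDEX IF NOT EXISTS idx_points_x ON points (x);

-- The planner can narrow the candidates by x through the index, then
-- checks y and z against each candidate row fetched from the base table.
EXPLAIN QUERY PLAN
SELECT *
FROM points
WHERE x > 100 AND x < 200
  AND y > 100 AND y < 200
  AND z > 100 AND z < 200;
```

Running the statement with and without `EXPLAIN QUERY PLAN` shows which index is chosen and gives timings to compare against the keyset-intersection query on your own data.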