I am using a MySQL DB, and have the following table:
CREATE TABLE SomeTable (
PrimaryKeyCol BIGINT(20) NOT NULL,
A BIGINT(20) NOT NULL,
FirstX INT(11) N
A condition of the form WHERE col1 < ... AND ... < col2 is virtually impossible to optimize.
Any useful query will involve a "range" on either col1 or col2. Two ranges (on two different columns) cannot both be used in a single INDEX.
Therefore, any index you try risks scanning a large part of the table: INDEX(col1, ...) will scan from the start of the index to where col1 hits the ... value. Similarly, INDEX(col2, ...) will start at the ... value and scan until the end.
To add to your woes, the ranges are overlapping. So, you can't pull a fast one and add ORDER BY ... LIMIT 1 to stop quickly. And if you say LIMIT 10, but there are only 9 matching rows, it won't stop until it reaches the start/end of the table.
One simple thing you can do (but it won't speed things up by much) is to swap the PRIMARY KEY and the UNIQUE key. This could help because InnoDB "clusters" the PK with the data.
If the ranges did not overlap, I would point you at http://mysql.rjweb.org/doc.php/ipranges .
So, what can be done? How "even" and "small" are the ranges? If they are reasonably 'nice', then the following would take some code, but should be a lot faster. (In your example, 100000 500000 is pretty ugly, as you will see in a minute.)
Define buckets to be, say, floor(number/100). Then build a table that correlates buckets and ranges. Samples:
FirstX  LastX   Bucket
123411  123488  1234
222222  222444  2222
222222  222444  2223
222222  222444  2224
222411  222477  2224
Notice how some ranges 'belong' to multiple buckets.
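The sample rows above can be generated mechanically. Here is a rough Python sketch of that expansion, assuming a bucket width of 100 as in the example; the name expand_to_buckets is made up for illustration and is not part of any library.

```python
BUCKET_WIDTH = 100  # assumed width, matching floor(number/100) above


def expand_to_buckets(first_x, last_x, width=BUCKET_WIDTH):
    """Return one (first_x, last_x, bucket) row per bucket the range touches."""
    first_bucket = first_x // width
    last_bucket = last_x // width
    return [(first_x, last_x, b) for b in range(first_bucket, last_bucket + 1)]


# The three ranges used in the sample table.
ranges = [(123411, 123488), (222222, 222444), (222411, 222477)]
rows = [row
        for first_x, last_x in ranges
        for row in expand_to_buckets(first_x, last_x)]

for row in rows:
    print(*row)
```

The rows would then be loaded into the bucket table, and they need regenerating whenever a range is added or changed.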
Then, the search is first on the bucket(s) in the query, then on the details. Looking for X=222433 would find two rows with bucket=2224, then decide that both are OK. But for X=222466, two rows have the bucket, but only one matches with firstX and lastX.
WHERE bucket = FLOOR(X/100)
AND firstX <= X
AND X <= lastX
with
INDEX(bucket, firstX)
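To try the lookup end to end, here is a small sketch that runs the same WHERE clause against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, and the table name RangeBuckets, the index name, and the helper find_ranges are all hypothetical. The bucket is computed in Python and passed as a parameter, so the sketch does not depend on FLOOR() being available.

```python
import sqlite3

BUCKET_WIDTH = 100  # assumed width, matching floor(number/100)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RangeBuckets (firstX INTEGER, lastX INTEGER, bucket INTEGER)"
)
# Same column order as the suggested INDEX(bucket, firstX).
conn.execute("CREATE INDEX idx_bucket_first ON RangeBuckets (bucket, firstX)")
conn.executemany(
    "INSERT INTO RangeBuckets VALUES (?, ?, ?)",
    [
        (123411, 123488, 1234),
        (222222, 222444, 2222),
        (222222, 222444, 2223),
        (222222, 222444, 2224),
        (222411, 222477, 2224),
    ],
)


def find_ranges(x):
    """Return the (firstX, lastX) ranges containing x, narrowed by bucket first."""
    bucket = x // BUCKET_WIDTH
    cur = conn.execute(
        "SELECT firstX, lastX FROM RangeBuckets "
        "WHERE bucket = ? AND firstX <= ? AND ? <= lastX "
        "ORDER BY firstX",
        (bucket, x, x),
    )
    return cur.fetchall()


print(find_ranges(222433))  # both ranges in bucket 2224 contain this value
print(find_ranges(222466))  # only the second range contains this value
```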
But... with 100000 500000, there would be 4001 rows because this range is in that many 'buckets'.
Plan B (to tackle the wide ranges)
Segregate the ranges into wide and narrow. Do the wide ranges by a simple table scan, do the narrow ranges via my bucket method. UNION ALL the results together. Hopefully the "wide" table would be much smaller than the "narrow" table.
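Plan B can be sketched in plain Python to show how the pieces fit. The cutoff MAX_BUCKETS is an arbitrary assumption and would need tuning against real data, and the function names are made up for illustration. In MySQL the two lists would be two tables and the final concatenation would be the UNION ALL.

```python
from collections import defaultdict

BUCKET_WIDTH = 100  # assumed width, as in the example
MAX_BUCKETS = 10    # hypothetical cutoff: ranges touching more buckets are "wide"


def bucket_count(first_x, last_x, width=BUCKET_WIDTH):
    """Number of buckets a range touches."""
    return last_x // width - first_x // width + 1


def build(ranges):
    """Split ranges into a 'wide' list and a bucket-keyed 'narrow' lookup."""
    wide = []
    narrow = defaultdict(list)
    for first_x, last_x in ranges:
        if bucket_count(first_x, last_x) > MAX_BUCKETS:
            wide.append((first_x, last_x))
        else:
            first_bucket = first_x // BUCKET_WIDTH
            last_bucket = last_x // BUCKET_WIDTH
            for b in range(first_bucket, last_bucket + 1):
                narrow[b].append((first_x, last_x))
    return wide, narrow


def search(x, wide, narrow):
    """Bucket lookup for narrow ranges plus a full scan of the wide ones."""
    candidates = narrow.get(x // BUCKET_WIDTH, [])
    from_narrow = [r for r in candidates if r[0] <= x <= r[1]]
    from_wide = [r for r in wide if r[0] <= x <= r[1]]
    return from_narrow + from_wide  # plays the role of UNION ALL


ranges = [(123411, 123488), (222222, 222444), (222411, 222477), (100000, 500000)]
wide, narrow = build(ranges)
print(bucket_count(100000, 500000))  # 4001, so this range lands in "wide"
print(search(222433, wide, narrow))
```

The wide range costs one row in the scanned list instead of 4001 rows in the bucket table, which is the whole point of the split.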