Consider the following 2 tables:
Table A:
id
event_time
Table B
id
start_time
end_time
Every record in table A is mapped to exactly 1 reco
There are two caveats to my solution:
1) You said that you can add indexes but not change the schema so I'm not sure if this would work for you or not as you can't have function based indexes in MySQL and you would need to create an extra column on Table B. 2) The other caveat to this solution is that you must be using the MyISAM engine for Table B. If you cannot use MyISAM then this solution wont work because only MyISAM is supported for Spatial Indexes.
So, assuming that the above two aren't an issue for you, the following should work and give you good performance:
This solution makes use of MySQL's support for Spatial Data (see documentation here). While spatial data types can be added to a variety of storage engines, only MyISAM is supported for Spatial R-Tree Indexes (see documentation here) which are needed in order to get the performance needed. One other limitation is that spatial data types only work with numerical data so you cannot use this technique with string based range queries.
I wont go into the details of the theory behind how spatial types work and how the spatial index is useful but you should look at Jeremy Cole's explanation here in regards to how to use spatial data types and indexes for GeoIP lookups. Also look at the comments as they raise some useful points and alternative if you need raw performance and can give up some accuracy.
The basic premise is that we can take the start/end and use the two of them to create four distinct points, one for each corner of a rectangle centered around 0,0 on a xy grid, and then do a quick lookup into the spatial index to determine if the particular point in time we care about is within the rectangle or not. As mentioned previously, see Jeremy Cole's explanation for a more thorough overview of how this works.
In your particular case we will need to do the following:
1) Alter the table to be a MyISAM table (note you shouldn't do this unless you are fully aware of the consequences of such a change like the lack of transactions and the table locking behavior that are associated with MyISAM).
alter table B engine = MyISAM;
2) Next we add the new column that will hold the spatial data. We will use the polygon data type as we need to be able to hold a full rectangle.
alter table B add column time_poly polygon NOT NULL;
3) Next we populate the new column with the data (please keep in mind that any processes that update or insert into table B will need to get modified to make sure they are populating the new column also). Since the start and end ranges are times, we will need to convert them to numbers with the unix_timestamp function (see documentation here for how it works).
update B set time_poly := LINESTRINGFROMWKB(LINESTRING(
POINT(unix_timestamp(start_time), -1),
POINT(unix_timestamp(end_time), -1),
POINT(unix_timestamp(end_time), 1),
POINT(unix_timestamp(start_time), 1),
POINT(unix_timestamp(start_time), -1)
));
4) Next we add the spatial index to the table (as mentioned previously, this will only work for a MyISAM table and will produce the error "ERROR 1464 (HY000): The used table type doesn't support SPATIAL indexes").
alter table B add SPATIAL KEY `IXs_time_poly` (`time_poly`);
5) Next you will need to use the following select in order to make use of the spatial index when querying the data.
SELECT A.id, B.id
FROM A inner join B force index (IXs_time_poly)
ON MBRCONTAINS(B.time_poly, POINTFROMWKB(POINT(unix_timestamp(A.event_time), 0)));
The force index is there to make 100% sure that MySQL will use the index for the lookup. If everything went well running an explain on the above select should show something similar to the following:
mysql> explain SELECT A.id, B.id
-> FROM A inner join B force index (IXs_time_poly)
-> on MBRCONTAINS(B.time_poly, POINTFROMWKB(POINT(unix_timestamp(A.event_time), 0)));
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------------------------------------------+
| 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 1065 | |
| 1 | SIMPLE | B | ALL | IXs_time_poly | NULL | NULL | NULL | 7969897 | Range checked for each record (index map: 0x10) |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------------------------------------------+
2 rows in set (0.00 sec)
Please refer to Jeremy Cole's analysis for details about the performance benefits of this method as compared with a between clause.
Let me know if you have any questions.
Thanks,
-Dipin