MySQL composite indexes and operator BETWEEN

两盒软妹~` submitted on 2019-12-04 06:09:02

Your style is very uncommon.

Most people would probably write WHERE began_at < NOW() AND finished_at > NOW()
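For comparison, here are the two spellings side by side. The first form is only a guess at the "uncommon" style being referred to, since the original query is not quoted here; the table name runs is borrowed from the query shown further down. One detail worth keeping in mind: BETWEEN includes both endpoints, so its exact equivalent uses <= and >= rather than < and >.

```sql
-- Assumed original style (not quoted in this thread): a constant BETWEEN two columns.
SELECT *
  FROM runs
 WHERE NOW() BETWEEN began_at AND finished_at;

-- Exact equivalent of the above: BETWEEN is inclusive at both ends.
SELECT *
  FROM runs
 WHERE began_at <= NOW()
   AND finished_at >= NOW();
```

Whether the strict or the inclusive comparison is wanted depends on how a run that begins or finishes at exactly the current instant should be treated.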

However, I would recommend putting a separate index on each of the two fields.

A combined key won't be of much use to you, because it would only speed up searches for specific date combinations.

Well, this is not entirely true: if you use a B-tree, a combined key will help you, but not as much as indexing the fields separately. Combined keys are very good if you search on combinations of fields with the equality (=) operator. Single-field indexes perform better for range requests.
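As a minimal sketch, the two indexing options being compared might be declared as follows. The index names are made up for illustration, and the statements assume the runs table already exists with began_at and finished_at columns.

```sql
-- Option A: one single-column B-tree index per column (hypothetical names).
CREATE INDEX idx_runs_began_at    ON runs (began_at);
CREATE INDEX idx_runs_finished_at ON runs (finished_at);

-- Option B: one composite index. Entries are ordered by began_at first and by
-- finished_at only within equal began_at values, so a range on began_at leaves
-- the finished_at values of the matching entries in no useful order.
CREATE INDEX idx_runs_began_finished ON runs (began_at, finished_at);
```

With option B the optimizer can generally use only the began_at part to bound the scan once that column carries a range condition; finished_at is then checked row by row within that range.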

You can google a bit for "multidimensional range search".

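The reason is that the start of the matching range in one field can be found in O(log(n)) time in a B-tree. With k indexed fields, your overall lookup time will be O(k*log(n)), which is O(log(n)) for a fixed k (not counting the time needed to read the matching rows themselves).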

Multidimensional range queries have a runtime of O(sqrt(n)), which is higher. There are better implementations that also achieve logarithmic runtime, but they are not fully implemented in MySQL, so it will be worse or awful depending on the version.

So let me sum up:

  • Equality comparisons on single fields: hash index (runtime O(1))

  • Range search on single fields: btree index on single fields ( O(log(n)) )

  • Equality search on multiple fields: combined hash key (runtime O(1))

Those cases are clear-cut.

  • Range search on multiple fields: separate btree indexes ( O(log(n)) )

This is where it's not so clear. With current versions it's clearly better to index separately, for the reasons given above. With a perfect implementation for that use case you could achieve better performance with combined keys, but there is no system I know of which supports it. MySQL has supported loose index scans (which you need for that) since version 5.0, but only in a very limited way, and the query optimizer only utilizes them in rare cases, AFAIK. I don't know about newer versions like 5.3 or something.

However, as MySQL implements loose index scans, combined keys on fields where you do range requests or sort in different directions become more and more relevant.
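To make the term concrete: the loose index scan described in the MySQL manual applies to GROUP BY, DISTINCT and MIN()/MAX() queries over a leftmost prefix of a composite index, not to the two-sided range query discussed here. The sketch below shows a query shape that is a candidate for it; the index name is hypothetical and the optimizer may still pick a different plan.

```sql
-- Assumes a composite index on (began_at, finished_at); create one if it does
-- not exist yet (the name is made up for this example).
CREATE INDEX idx_runs_began_finished ON runs (began_at, finished_at);

-- GROUP BY on the leading column plus MIN() on the next column of the index:
-- the server can jump from group to group instead of reading every entry.
EXPLAIN
SELECT began_at, MIN(finished_at)
  FROM runs
 GROUP BY began_at;
-- If a loose index scan is chosen, the Extra column of the EXPLAIN output
-- reads "Using index for group-by".
```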

Because the query uses inequalities rather than equalities, a composite index isn't going to do much better (and may do worse) than two individual indexes.

I'd lean towards two individual indexes, one on began_at and one on finished_at.

References for Loose index scan:

http://www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/

http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html

The "Index Merge" strategy could come into play from MySQL 5 onwards: http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html - which also suggests that separate indexes might be better.

However, I have never been able to get it to work for me :)
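One way to check whether Index Merge is being considered is to look at the EXPLAIN output. The sketch below assumes single-column indexes already exist on began_at and on finished_at, and that the server is recent enough to expose optimizer_switch; the date literals are placeholder values. As far as the manual describes it, the intersection variant covers equality conditions on all parts of an index (or a range on an InnoDB primary key), so two open-ended ranges joined by AND are not one of the documented cases, which may explain the experience above.

```sql
-- Index Merge is controlled by flags in optimizer_switch (index_merge=on, ...).
SELECT @@optimizer_switch;

-- The query from this thread. If Index Merge were used, EXPLAIN would show
-- type = index_merge and Extra = "Using intersect(...)".
EXPLAIN
SELECT *
  FROM runs
 WHERE began_at < NOW()
   AND finished_at > NOW();

-- A different query shape, shown only for contrast: range conditions on two
-- indexed columns joined by OR are a documented case for the sort-union
-- variant, reported in Extra as "Using sort_union(...)".
EXPLAIN
SELECT *
  FROM runs
 WHERE began_at < '2019-01-01'
    OR finished_at > '2019-12-01';
```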

The correct answer to this sort of question is always: "it depends". Try it both ways and benchmark and compare execution plans. Note that the answer also changes as the amount of data in your tables changes and the query workload changes. Indexes on an evolving system are almost never fire and forget.
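A rough way to run that comparison is to build a scratch copy of the data carrying all candidate indexes and to ask for the plan under each one. Everything named below (the runs_bench table, the index names, the column types) is an assumption made for the sketch; only the runs table and its two columns come from the question.

```sql
-- Hypothetical scratch table with every candidate index defined up front.
CREATE TABLE runs_bench (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  began_at    DATETIME NOT NULL,
  finished_at DATETIME NOT NULL,
  KEY idx_began          (began_at),
  KEY idx_finished       (finished_at),
  KEY idx_began_finished (began_at, finished_at)
) ENGINE=InnoDB;

-- Load representative data, then refresh the index statistics.
INSERT INTO runs_bench (began_at, finished_at)
SELECT began_at, finished_at FROM runs;
ANALYZE TABLE runs_bench;

-- Plan the optimizer picks on its own.
EXPLAIN SELECT * FROM runs_bench
 WHERE began_at < NOW() AND finished_at > NOW();

-- Plans when each candidate index is forced in turn.
EXPLAIN SELECT * FROM runs_bench FORCE INDEX (idx_began)
 WHERE began_at < NOW() AND finished_at > NOW();

EXPLAIN SELECT * FROM runs_bench FORCE INDEX (idx_finished)
 WHERE began_at < NOW() AND finished_at > NOW();

EXPLAIN SELECT * FROM runs_bench FORCE INDEX (idx_began_finished)
 WHERE began_at < NOW() AND finished_at > NOW();
```

Comparing the key and rows columns across the four plans, and then timing the same SELECT statements without EXPLAIN, gives a first impression; the numbers are only meaningful if the scratch data resembles production in size and distribution.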

Good question, but I'd actually start with indexes on both began_at and finished_at, because one reasonable way you might rewrite this query is like so

SELECT *
  FROM runs
 WHERE began_at < NOW()
   AND finished_at > NOW()

Which makes it a bit more clear (to me) that each column needs its own index.

I've never gotten MySQL to use two individual indexes on began_at and finished_at when trying to optimize this kind of query. Other people are apparently saying that a composite index won't work either, so it may just not be possible to get MySQL to optimize this query using an index.
