I have two tables:
CREATE TABLE `articles` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(1000) DEFAULT NULL,
`last_updated` datetime DEFAULT NULL
If you have lots of categories, this query cannot be made efficient. No single index can cover two tables at once in MySQL.
You have to denormalize: add last_updated, has_comments and deleted to article_categories:
CREATE TABLE `article_categories` (
`article_id` int(11) NOT NULL DEFAULT '0',
`category_id` int(11) NOT NULL DEFAULT '0',
`last_updated` timestamp NOT NULL,
`has_comments` boolean NOT NULL,
`deleted` boolean NOT NULL,
PRIMARY KEY (`article_id`,`category_id`),
KEY `category_id` (`category_id`),
KEY `ix_articlecategories_category_comments_deleted_updated` (category_id, has_comments, deleted, last_updated)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
and run this query:
SELECT *
FROM (
SELECT article_id
FROM article_categories
WHERE (category_id, has_comments, deleted) = (78, 1, 0)
ORDER BY
last_updated DESC
LIMIT 100, 20
) q
JOIN articles a
ON a.id = q.article_id
Of course, you should update article_categories as well whenever you update the relevant columns in articles. This can be done in a trigger.
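A minimal sketch of such a trigger. It assumes that articles also carries has_comments and deleted columns (only id, title and last_updated are visible in the definition above), and the trigger name is hypothetical:

```sql
DELIMITER //

-- Hypothetical trigger name. Assumes articles has NOT NULL
-- has_comments and deleted columns alongside last_updated.
CREATE TRIGGER trg_articles_sync_categories
AFTER UPDATE ON articles
FOR EACH ROW
BEGIN
    -- <=> is MySQL's NULL-safe equality, so NULL-to-NULL counts as unchanged.
    IF NOT (NEW.last_updated <=> OLD.last_updated
            AND NEW.has_comments <=> OLD.has_comments
            AND NEW.deleted <=> OLD.deleted) THEN
        UPDATE article_categories
        -- articles.last_updated is nullable but the copy is NOT NULL;
        -- falling back to the current time is an assumption of this sketch.
        SET last_updated = COALESCE(NEW.last_updated, CURRENT_TIMESTAMP),
            has_comments = NEW.has_comments,
            deleted      = NEW.deleted
        WHERE article_id = NEW.id;
    END IF;
END//

DELIMITER ;
```

DELIMITER is a mysql client command, not server SQL, so other clients may need the body submitted differently. Rows newly inserted into article_categories would also need these values copied at insert time, either by the application or by a second trigger.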
Note that the column has_comments is boolean: this allows using an equality predicate, so the query can be served by a single range scan over the index.
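One way to check this, as a sketch: run EXPLAIN on the inner query, written here with the row comparison expanded into the equivalent AND conditions. If the index is used as intended, the key column should name ix_articlecategories_category_comments_deleted_updated and Extra should not mention Using filesort. The exact output depends on the MySQL version and table statistics.

```sql
-- Inspect the plan for the index-only part of the query.
EXPLAIN
SELECT article_id
FROM article_categories
WHERE category_id = 78
  AND has_comments = 1
  AND deleted = 0
ORDER BY last_updated DESC
LIMIT 100, 20;
```

A range condition in place of the equality (for example, on a hypothetical comment counter, num_comments > 0) would end the usable index prefix at that column, so the rows would no longer come out of the index ordered by last_updated and a separate sort would be needed.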
Also note that the LIMIT goes into the subquery. This makes MySQL use late row lookups, which it does not use by default. See this article in my blog on why they increase performance.
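For comparison, here is a sketch of the same request written without the subquery. With this form MySQL would typically look up the row in articles for each of the first 120 index entries and only then discard the first 100, whereas the subquery version skips those 100 entries using the index alone and joins just the remaining 20.

```sql
-- Early row lookups: the join to articles happens before LIMIT is applied.
SELECT a.*
FROM article_categories ac
JOIN articles a
  ON a.id = ac.article_id
WHERE ac.category_id = 78
  AND ac.has_comments = 1
  AND ac.deleted = 0
ORDER BY ac.last_updated DESC
LIMIT 100, 20;
```

The gap between the two forms tends to grow with the offset, since every skipped row costs a lookup in articles here and only an index step in the subquery version.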
If you were on SQL Server, you could create an indexed view over your query, which would essentially make a denormalized indexed copy of article_categories with the additional fields, automatically maintained by the server.
Unfortunately, MySQL does not support this, so you will have to create such a table manually and write additional code to keep it in sync with the base tables.