I have a query like this:
SELECT  `table_1`.*
FROM    `table_1`
INNER JOIN
        `table_2` [...]
INNER JOIN
        `table_3` [...]
WHERE   `table_1`.`id` IN
        (
        SELECT  `id`
        FROM    [...]
        )
        AND [more conditions]
If the inner table is properly indexed, the subquery here is not being "performed" at all in the strict sense of the word.
Since the subquery is part of an IN expression, the condition is pushed into the subquery and the whole expression is transformed into an EXISTS.
In fact, this subquery is evaluated on each step:
EXISTS
        (
        SELECT  NULL
        FROM    [...]
        WHERE   id = table_1.id
        )
You can actually see it in the detailed description provided by EXPLAIN EXTENDED.
That's why it's called a DEPENDENT SUBQUERY: the result of each evaluation depends on the value of table_1.id. The subquery as written is not correlated; it is the optimized version that is correlated.
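To look at the rewritten form yourself, something along these lines should work. This is only a sketch: filter_ids is a hypothetical stand-in for the elided [...] table, and the column and index names are made up for illustration. EXPLAIN EXTENDED was deprecated in MySQL 5.7 and removed in 8.0, where plain EXPLAIN followed by SHOW WARNINGS gives the same information. The exact text of the rewritten query differs between versions, and newer versions may pick a semi-join or materialization strategy instead of a dependent subquery.

```sql
-- Hypothetical schema; run in a scratch database.
-- filter_ids stands in for the elided [...] table of the original query.
DROP TABLE IF EXISTS table_1;
DROP TABLE IF EXISTS filter_ids;
CREATE TABLE table_1 (id INT NOT NULL PRIMARY KEY);
CREATE TABLE filter_ids (id INT NOT NULL, KEY ix_filter_ids_id (id));

-- Ask the optimizer for the plan of the IN query.
-- On MySQL 8.0 and later, drop the EXTENDED keyword.
EXPLAIN EXTENDED
SELECT  table_1.*
FROM    table_1
WHERE   table_1.id IN
        (
        SELECT  id
        FROM    filter_ids
        );

-- Prints the query text as the optimizer rewrote it.
SHOW WARNINGS;
```

The select_type column of the EXPLAIN output tells you how the subquery is treated, and the note returned by SHOW WARNINGS contains the transformed query.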
MySQL always evaluates the EXISTS clause after the simpler filters, since those are much cheaper to evaluate and may reject a row before the subquery has to be evaluated at all.
If you want the subquery to be evaluated all at once, rewrite the query like this:
SELECT  table_1.*
FROM    (
        SELECT  DISTINCT id
        FROM    [...]
        ) q
JOIN    table_1
ON      table_1.id = q.id
JOIN    table_2
ON      [...]
JOIN    table_3
ON      [...]
WHERE   [more conditions]
This forces the subquery to be leading in the join, which is more efficient if the subquery returns few rows compared to table_1, and less efficient if it returns many rows compared to table_1.
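The DISTINCT in the derived table matters: IN returns each table_1 row at most once, while a plain join returns it once per matching row. A small sketch of the difference, using a hypothetical filter_ids table in place of the elided [...] and made-up sample data:

```sql
-- Hypothetical schema and data; run in a scratch database.
DROP TABLE IF EXISTS table_1;
DROP TABLE IF EXISTS filter_ids;
CREATE TABLE table_1 (id INT NOT NULL PRIMARY KEY);
CREATE TABLE filter_ids (id INT NOT NULL, KEY ix_filter_ids_id (id));

INSERT INTO table_1 VALUES (1), (2), (3);
INSERT INTO filter_ids VALUES (1), (1), (3);   -- id 1 appears twice

-- IN form: two rows (ids 1 and 3).
SELECT  table_1.*
FROM    table_1
WHERE   table_1.id IN (SELECT id FROM filter_ids);

-- Join against a DISTINCT derived table: the same two rows.
SELECT  table_1.*
FROM    (
        SELECT  DISTINCT id
        FROM    filter_ids
        ) q
JOIN    table_1
ON      table_1.id = q.id;

-- Join without DISTINCT: three rows, because id 1 matches twice.
SELECT  table_1.*
FROM    filter_ids q
JOIN    table_1
ON      table_1.id = q.id;
```

If the column used in the subquery is already unique, the DISTINCT does not change the result.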
If there is an index on [...].id used in the subquery, the DISTINCT in the subquery can be performed using INDEX FOR GROUP-BY (reported as "Using index for group-by" in the Extra column of EXPLAIN).
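One way to check this is to run EXPLAIN on the subquery alone. The sketch below again assumes a hypothetical filter_ids table with an index on id. Whether the optimizer actually picks this access method depends on the MySQL version and on the table's data and statistics, so treat the plan as something to inspect rather than a guaranteed result.

```sql
-- Hypothetical table standing in for the elided [...]; run in a scratch database.
DROP TABLE IF EXISTS filter_ids;
CREATE TABLE filter_ids (id INT NOT NULL, KEY ix_filter_ids_id (id));

-- Inspect the Extra column of the plan for "Using index for group-by".
EXPLAIN
SELECT  DISTINCT id
FROM    filter_ids;
```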