In SQL (MSSQL, Oracle, etc., whatever), when joining tables, what is the gain from adding a filter to the JOIN statement instead of having it in the WHERE clause?
i.
For INNER JOIN
queries, the performance characteristics of these filters will depend on many factors - the size of the tables, indexing, the selectivity of the query, and other factors specific to the RDBMS on which the query is executed.
In LEFT
and RIGHT OUTER JOIN
, the position of the filter matters much more than INNER JOIN
, since affects whether it will be applied before (JOIN
clause) or after (WHERE
clause) the join is carried out.
As a human (rather than an optimizer) myself, when maintaining a query I would look for a join condition in the JOIN
clause and a search condition in the WHERE
clause.
Of course, you need to strike a balance between performance issues and code maintenance issues. However, my first priority is good logical code in the first instance then optimize as necessary.
Nothing, except clarity and meaning. Unless you have outer joins.
I sometimes do this in queries that have a lot of joins because it localises all the information about the join in one part of the query rather than having some in the join condition and some in the where clause.
These syntaxes are synonymous and are optimized to the same thing by most RDBMS
.
I usually prefer this syntax:
SELECT *
FROM X
INNER JOIN
Y
ON X.A = Y.A
WHERE X.B = 'SOMETHING'
when B
is not a part of the logical link between A
and B
, and this one:
SELECT *
FROM X
INNER JOIN
Y
ON X.A = Y.A
AND X.B = 'SOMETHING'
when it is.
For an INNER JOIN, I would not expect a performance difference, but rather that the same plan would be used whether the filter was in the JOIN...ON clause or the WHERE clause. I personally prefer to use write the join criteria in the JOIN clause and the filtering in the WHERE clause- a sort of way to stick all the "parameters" to the SQL statement in the same place- this isn't necessarily sensible or well-thought-out. Conversely, some people like to have everything in the JOIN clause to keep everything together.
The situation with outer joins is different- there is a significant difference between "a LEFT OUTER JOIN b ON a.a_id=b.a_id AND b.type = 1" and "a LEFT OUTER JOIN b ON a.a_id=b.a_id WHERE b.type=1"- in fact the latter is implicitly forcing an inner join. This would be another reason to put all such conditions in the JOIN clause, for consistency.