I'm curious which of the following would be more efficient?
I've always been a bit cautious about using IN because I believe SQL Server turns th
There are many misleading answers here, including the highly upvoted one (although I don't believe their authors meant any harm). The short answer is: these are the same.
There are many keywords in the (T-)SQL language, but in the end, the only things that actually run on the hardware are the operations shown in the query execution plan.
The relational (maths theory) operation we perform when we invoke [NOT] IN and [NOT] EXISTS is the semi join (anti-join when using NOT). It is no coincidence that the corresponding SQL Server operations have the same name. No operation mentions IN or EXISTS anywhere, only (anti-)semi joins. Thus, a choice between logically equivalent IN and EXISTS forms cannot affect performance, because there is one and only one way to get their results: the (anti-)semi join execution operation.
An example:
Query 1 ( plan )
select * from dt where dt.customer in (select c.code from customer c where c.active=0)
Query 2 ( plan )
select * from dt where exists (select 1 from customer c where c.code=dt.customer and c.active=0)
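If you want to check this yourself, a minimal sketch along the following lines should do. It assumes a scratch database you can create tables in. Only the table and column names come from the two queries above; the column types are my own guesses for illustration. SET SHOWPLAN_TEXT makes SQL Server return the estimated plan as text instead of executing the statements, and it has to be the only statement in its batch, hence the GO separators.

```sql
-- Hypothetical schema: the names come from the queries above,
-- the types are assumptions made only so the script is self-contained.
CREATE TABLE customer (code INT NOT NULL, active BIT NOT NULL);
CREATE TABLE dt (id INT NOT NULL, customer INT NOT NULL);
GO

-- Return estimated plans as text instead of running the queries.
SET SHOWPLAN_TEXT ON;
GO

-- Query 1: IN
SELECT * FROM dt
WHERE dt.customer IN (SELECT c.code FROM customer c WHERE c.active = 0);
GO

-- Query 2: EXISTS
SELECT * FROM dt
WHERE EXISTS (SELECT 1 FROM customer c
              WHERE c.code = dt.customer AND c.active = 0);
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Compare the two plan texts side by side. The exact operators depend on your indexes, statistics and SQL Server version, so treat whatever this toy schema produces as an illustration rather than a benchmark. The point is only that neither plan contains an operator named IN or EXISTS.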