Is the following the most efficient way in SQL to achieve its result?
SELECT *
FROM Customers
WHERE Customer_ID NOT IN (SELECT Cust_ID FROM SUBSCRIBERS)
One reason you might prefer a JOIN to NOT IN is that if the values in the NOT IN clause contain any NULLs, you will always get back no results. If you do use NOT IN, remember to always consider whether the subquery might bring back a NULL value!
RE: Question in Comments
'x' NOT IN (NULL,'a','b')
≡ 'x' <> NULL and 'x' <> 'a' and 'x' <> 'b'
≡ Unknown and True and True
≡ Unknown
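The three-valued logic above can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the sample rows and the Name column are made up for illustration, and other engines should behave the same way for this case but are not tested here.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Customer_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SUBSCRIBERS (Cust_ID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO SUBSCRIBERS VALUES (1), (NULL);
""")

# The subquery returns a NULL, so every NOT IN comparison is False or Unknown.
not_in = conn.execute(
    "SELECT Customer_ID FROM Customers "
    "WHERE Customer_ID NOT IN (SELECT Cust_ID FROM SUBSCRIBERS) "
    "ORDER BY Customer_ID"
).fetchall()

# Filtering the NULLs out of the subquery restores the expected rows.
not_in_filtered = conn.execute(
    "SELECT Customer_ID FROM Customers "
    "WHERE Customer_ID NOT IN "
    "(SELECT Cust_ID FROM SUBSCRIBERS WHERE Cust_ID IS NOT NULL) "
    "ORDER BY Customer_ID"
).fetchall()

print(not_in)           # []
print(not_in_filtered)  # [(2,), (3,)]
```

If the subquery column is declared NOT NULL, the problem cannot arise and the extra filter is unnecessary.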
Maybe try this
Select cust.*
From dbo.Customers cust
Left Join dbo.Subscribers subs on cust.Customer_ID = subs.Cust_ID
Where subs.Cust_ID Is Null
Any mature enough SQL database should be able to execute that just as efficiently as the equivalent JOIN. Use whichever is more readable to you.
If you want to know which is more efficient, look at the estimated query plans, or at the actual query plans after execution. They will tell you the costs of the queries (I find the CPU and IO costs the most interesting). I wouldn't be surprised if there's little to no difference, but you never know. I've seen certain queries use multiple cores on our database server, while a rewritten version of the same query would only use one core (needless to say, the query that used all 4 cores was a good 3 times faster). I never quite put my finger on why that is, but if you're working with large result sets, such differences can occur without your knowing about it.
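As a rough illustration of comparing plans, here is a sketch using SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. The index name idx_subscribers_cust and the Name column are assumptions made for the example. SQLite reports only plan steps, not the CPU and IO costs a server such as SQL Server shows, and the plan text varies by version, so treat the output as something to read rather than rely on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Customer_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SUBSCRIBERS (Cust_ID INTEGER);
    -- Hypothetical index, added so the planner has something to use.
    CREATE INDEX idx_subscribers_cust ON SUBSCRIBERS (Cust_ID);
""")

def query_plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the description of the step.
    return [row[-1] for row in rows]

plan_not_in = query_plan(
    "SELECT * FROM Customers "
    "WHERE Customer_ID NOT IN (SELECT Cust_ID FROM SUBSCRIBERS)"
)
plan_left_join = query_plan(
    "SELECT cust.* FROM Customers cust "
    "LEFT JOIN SUBSCRIBERS subs ON cust.Customer_ID = subs.Cust_ID "
    "WHERE subs.Cust_ID IS NULL"
)

for label, plan in (("NOT IN", plan_not_in), ("LEFT JOIN", plan_left_join)):
    print(label)
    for step in plan:
        print("   ", step)
```

On a real server, the equivalent step is to view the estimated or actual execution plan in that engine's own tooling and compare the two queries side by side.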
SELECT Customers.*
FROM Customers
WHERE NOT EXISTS (
SELECT *
FROM SUBSCRIBERS AS s
WHERE s.Cust_ID = Customers.Customer_ID)
With “NOT IN”, some query optimizers fall back to nested full table scans, whereas with “NOT EXISTS” the subquery can use an index on the correlated column. How much this matters depends on the database engine and version.
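Whatever the performance difference, the LEFT JOIN and NOT EXISTS forms also avoid the NULL pitfall of NOT IN. A minimal sketch with Python's sqlite3 module, using made-up sample rows (the Name column and the data are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Customer_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SUBSCRIBERS (Cust_ID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO SUBSCRIBERS VALUES (1), (NULL);
""")

# Anti-join via LEFT JOIN: keep customers that found no matching subscriber.
left_join_rows = conn.execute(
    "SELECT cust.Customer_ID FROM Customers cust "
    "LEFT JOIN SUBSCRIBERS subs ON cust.Customer_ID = subs.Cust_ID "
    "WHERE subs.Cust_ID IS NULL "
    "ORDER BY cust.Customer_ID"
).fetchall()

# Anti-join via NOT EXISTS: the correlated subquery never matches on a NULL.
not_exists_rows = conn.execute(
    "SELECT Customers.Customer_ID FROM Customers "
    "WHERE NOT EXISTS ("
    "  SELECT * FROM SUBSCRIBERS AS s "
    "  WHERE s.Cust_ID = Customers.Customer_ID) "
    "ORDER BY Customers.Customer_ID"
).fetchall()

print(left_join_rows)   # [(2,), (3,)]
print(not_exists_rows)  # [(2,), (3,)]
```

Both forms return the two non-subscribers even though SUBSCRIBERS contains a NULL, because an equality comparison against NULL never produces a match.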