I sometimes interchanged the use of NOT IN
and NOT EXIST
in my sql queries and both yield the same result. Is the logic behind the NOT EXIST
Not Exist and Not in works same way.
For example:
CREATE TABLE Test ( Id INT NOT NULL );
INSERT INTO Test VALUES ( 1 );
INSERT INTO Test VALUES ( 2 );
INSERT INTO Test VALUES( 3 );
CREATE TABLE Bar ( Foo INT NULL );
INSERT INTO Bar VALUES( NULL );
INSERT INTO Bar VALUES ( 1 );
INSERT INTO Bar VALUES( 2 );
Executing:
SELECT Id
FROM Test
WHERE Id NOT IN( SELECT Foo FROM Bar WHERE foo=id );
---> It will give the output as 3
And executing:
SELECT Id
FROM Test
WHERE NOT EXISTS( SELECT foo FROM Bar WHERE Foo = Id );
---> It will also give the output as 3
This article may be of interest to you:
In a nutshell, NOT IN
is slightly different from NOT EXISTS
in the way the two handle NULL
values returned by the subquery.
If there are no NULL
values, they both perform a kind on NESTED LOOP ANTI JOIN
, but NOT IN
is a little bit more efficient.
No, they are not the same. The IN function translates to a series of OR statements. Typically this trips people up when they use a subquery in the IN function and that query returns at least one null value. E.g. Col Not In( Select Foo From Bar )
. This compares Col <> Foo
for each row. If one of the values of Foo is null, you get Col <> NULL
and the entire statement returns false resulting in no rows.
Exists simply determines if any rows in the subquery are returned (Not Exists being that no rows can be returned). The Select clause is entirely ignored.
Example:
Create Table Test ( Id int not null )
Insert Test ( 1 )
Insert Test ( 2 )
Insert Test ( 3 )
Create Table Bar ( Foo int null )
Insert Bar ( Null )
Insert Bar ( 1 )
Insert Bar ( 2 )
The following will result in no rows since we are effectively doing the comparison of Not( Id = Null Or Id = 1 Or Id = 2 )
Select Id
From Test
Where Id Not In( Select Foo From Bar )
The following however will return one row
Select Id
From Test
Where Not Exists( Select 1 From Bar Where Foo = Id )
In the above query, I will get Id = 3.
Both will give you the same results. The NOT EXISTS is a correlated sub-query, it is joined to the main query. In the example below, the NOT EXISTS should perform faster on a large dataset.
Find products without orders.
select *
from Products
where ProductId NOT IN (select ProductId FROM Orders)
select *
from Products
where NOT EXISTS (select 1 FROM Orders WHERE orders.ProductId = Products.Id)