What is the difference between the EXISTS
and IN
clause in SQL?
When should we use EXISTS
, and when should we use IN
If you are using the IN operator, the SQL engine will scan all records fetched from the inner query. On the other hand if we are using EXISTS, the SQL engine will stop the scanning process as soon as it found a match.
Based on rule optimizer:
EXISTS
is much faster than IN
, when the sub-query results is very large.IN
is faster than EXISTS
, when the sub-query results is very small.Based on cost optimizer:
In certain circumstances, it is better to use IN rather than EXISTS. In general, if the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS.
https://docs.oracle.com/cd/B19306_01/server.102/b14211/sql_1016.htm#i28403
IN:
EXISTS:
EXISTS
is much faster than IN
when the subquery results is very large.
IN
is faster than EXISTS
when the subquery results is very small.
CREATE TABLE t1 (id INT, title VARCHAR(20), someIntCol INT)
GO
CREATE TABLE t2 (id INT, t1Id INT, someData VARCHAR(20))
GO
INSERT INTO t1
SELECT 1, 'title 1', 5 UNION ALL
SELECT 2, 'title 2', 5 UNION ALL
SELECT 3, 'title 3', 5 UNION ALL
SELECT 4, 'title 4', 5 UNION ALL
SELECT null, 'title 5', 5 UNION ALL
SELECT null, 'title 6', 5
INSERT INTO t2
SELECT 1, 1, 'data 1' UNION ALL
SELECT 2, 1, 'data 2' UNION ALL
SELECT 3, 2, 'data 3' UNION ALL
SELECT 4, 3, 'data 4' UNION ALL
SELECT 5, 3, 'data 5' UNION ALL
SELECT 6, 3, 'data 6' UNION ALL
SELECT 7, 4, 'data 7' UNION ALL
SELECT 8, null, 'data 8' UNION ALL
SELECT 9, 6, 'data 9' UNION ALL
SELECT 10, 6, 'data 10' UNION ALL
SELECT 11, 8, 'data 11'
Query 1
SELECT
FROM t1
WHERE not EXISTS (SELECT * FROM t2 WHERE t1.id = t2.t1id)
Query 2
SELECT t1.*
FROM t1
WHERE t1.id not in (SELECT t2.t1id FROM t2 )
If in t1
your id has null value then Query 1 will find them, but Query 2 cant find null parameters.
I mean IN
can't compare anything with null, so it has no result for null, but EXISTS
can compare everything with null.
IN supports only equality relations (or inequality when preceded by NOT).
It is a synonym to =any / =some, e.g
select *
from t1
where x in (select x from t2)
;
EXISTS supports variant types of relations, that cannot be expressed using IN, e.g. -
select *
from t1
where exists (select null
from t2
where t2.x=t1.x
and t2.y>t1.y
and t2.z like '℅' || t1.z || '℅'
)
;
The allegedly performance and technical differences between EXISTS and IN may result from specific vendor's implementations/limitations/bugs, but many times they are nothing but myths created due to lack of understanding of the databases internals.
The tables' definition, statistics' accuracy, database configuration and optimizer's version have all impact on the execution plan and therefore on the performance metrics.