SQL “select where not in subquery” returns no results

Disclaimer: I have figured out the problem (I think), but I wanted to add this issue to Stack Overflow since I couldn't (easily) find it anywhere. Also, someone might

11 answers

时光取名叫无心

2020-11-29 15:39

    SELECT T.common_id
    FROM Common T
    LEFT JOIN Table1 T1 ON T.common_id = T1.common_id
    LEFT JOIN Table2 T2 ON T.common_id = T2.common_id
    WHERE T1.common_id IS NULL
      AND T2.common_id IS NULL

暗喜

2020-11-29 15:40

Update:

These articles in my blog describe the differences between the methods in more detail:

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: SQL Server

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: PostgreSQL

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: Oracle

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL

There are three ways to do such a query:

LEFT JOIN / IS NULL:

    SELECT *
    FROM common
    LEFT JOIN table1 t1 ON t1.common_id = common.common_id
    WHERE t1.common_id IS NULL

NOT EXISTS:

    SELECT *
    FROM common
    WHERE NOT EXISTS (
        SELECT NULL
        FROM table1 t1
        WHERE t1.common_id = common.common_id
    )

NOT IN:

    SELECT *
    FROM common
    WHERE common_id NOT IN (
        SELECT common_id
        FROM table1 t1
    )

When table1.common_id is not nullable, all these queries are semantically the same.

When it is nullable, NOT IN is different, since IN (and, therefore, NOT IN) returns NULL when a value does not match anything in a list containing a NULL.

This may be confusing, but it becomes more obvious if we recall the alternate syntax for IN:

    common_id = ANY ( SELECT common_id FROM table1 t1 )

The result of this condition is the boolean sum (OR) of all comparisons within the list, so NOT IN is the boolean product (AND) of the negated comparisons. A comparison with NULL yields NULL, so unless some other value in the list matches, the whole result is NULL too.

We can never say definitely that common_id is not equal to anything from this list, since at least one of the values is NULL.
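The reasoning above can be sketched outside of any database. The following is a minimal Python model of SQL's three-valued logic, with None standing in for NULL/UNKNOWN. The helper names (sql_eq, sql_or, sql_in, sql_not_in) are made up for this illustration and are not part of any library.

```python
from typing import Iterable, Optional

# SQL's three truth values modelled as True, False and None (NULL/UNKNOWN).
Tri = Optional[bool]


def sql_eq(a, b) -> Tri:
    """SQL '=': UNKNOWN (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_not(x: Tri) -> Tri:
    """SQL NOT: UNKNOWN stays UNKNOWN."""
    return None if x is None else not x


def sql_or(values: Iterable[Tri]) -> Tri:
    """SQL OR over a list: TRUE wins, then UNKNOWN, otherwise FALSE."""
    values = list(values)
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


def sql_in(value, items) -> Tri:
    """value IN (items), i.e. value = ANY (items)."""
    return sql_or(sql_eq(value, item) for item in items)


def sql_not_in(value, items) -> Tri:
    """value NOT IN (items) is NOT (value IN (items))."""
    return sql_not(sql_in(value, items))


table1 = [None, 1, 2]
print(sql_not_in(1, table1))  # False: 1 matches a value in the list
print(sql_not_in(3, table1))  # None: no match, but the NULL leaves it unknown
print(sql_not_in(3, [1, 2]))  # True: no NULL and no match
```

Only the last call yields TRUE, which is the only result a WHERE clause lets through.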

Suppose we have these data:

    common: 1, 3
    table1: NULL, 1, 2

LEFT JOIN / IS NULL and NOT EXISTS will return 3, NOT IN will return nothing (since it will always evaluate to either FALSE or NULL).
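To check this behaviour by hand, here is a small sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module and runs all three queries. SQLite is only a convenient stand-in here (the answer itself discusses SQL Server, PostgreSQL, Oracle and MySQL), and the INTEGER column type is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE common (common_id INTEGER);
    CREATE TABLE table1 (common_id INTEGER);  -- nullable on purpose
    INSERT INTO common VALUES (1), (3);
    INSERT INTO table1 VALUES (NULL), (1), (2);
""")

queries = {
    "LEFT JOIN / IS NULL": """
        SELECT common.common_id
        FROM common
        LEFT JOIN table1 t1 ON t1.common_id = common.common_id
        WHERE t1.common_id IS NULL
    """,
    "NOT EXISTS": """
        SELECT common_id
        FROM common
        WHERE NOT EXISTS (
            SELECT NULL FROM table1 t1
            WHERE t1.common_id = common.common_id
        )
    """,
    "NOT IN": """
        SELECT common_id
        FROM common
        WHERE common_id NOT IN (SELECT common_id FROM table1 t1)
    """,
}

# Run each variant and collect the returned ids.
results = {
    name: [row[0] for row in conn.execute(sql)]
    for name, sql in queries.items()
}
for name, ids in results.items():
    print(f"{name}: {ids}")
```

The first two variants print [3]; the NOT IN variant prints an empty list because of the NULL in table1.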

In MySQL, in the case of a non-nullable column, LEFT JOIN / IS NULL and NOT IN are a little bit (several percent) more efficient than NOT EXISTS. If the column is nullable, NOT EXISTS is the most efficient (again, not by much).

In Oracle, all three queries yield the same plan (an ANTI JOIN).

In SQL Server, NOT IN / NOT EXISTS are more efficient, since LEFT JOIN / IS NULL cannot be optimized to an ANTI JOIN by its optimizer.

In PostgreSQL, LEFT JOIN / IS NULL and NOT EXISTS are more efficient than NOT IN, since they are optimized to an Anti Join, while NOT IN uses a hashed subplan (or even a plain subplan if the subquery is too large to hash).

生来不讨喜

2020-11-29 15:43

Let's suppose these values for common_id:

    Common: 1
    Table1: 2
    Table2: 3, null

We want the row in Common to return, because it doesn't exist in any of the other tables. However, the null throws in a monkey wrench.

With those values, the query is equivalent to:

select * from Common where 1 not in (2) and 1 not in (3, null)

That is equivalent to:

select * from Common where not (1=2) and not (1=3 or 1=null)

This is where the problem starts. When comparing with a null, the answer is unknown. So the query reduces to:

select * from Common where not (false) and not (false or unknown)

false or unknown is unknown:

select * from Common where true and not (unknown)

true and not unknown is also unknown:

select * from Common where unknown

The where condition does not return records where the result is unknown, so we get no records back.
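Each step of that reduction can be evaluated as a plain scalar expression. The sketch below uses an in-memory SQLite database through Python's sqlite3 module as an assumed test engine; sqlite3 reports TRUE as 1, FALSE as 0 and unknown (NULL) as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Evaluate the intermediate expressions from the walkthrough one by one.
steps = conn.execute("""
    SELECT 1 = 2,
           1 = NULL,
           1 NOT IN (2),
           1 NOT IN (3, NULL),
           (1 NOT IN (2)) AND (1 NOT IN (3, NULL))
""").fetchone()
print(steps)  # (0, None, 1, None, None)

# A WHERE clause that evaluates to unknown filters out every row.
conn.executescript("""
    CREATE TABLE Common (common_id INTEGER);
    INSERT INTO Common VALUES (1);
""")
rows = conn.execute("SELECT * FROM Common WHERE NULL").fetchall()
print(rows)  # []
```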

One way to deal with this is to use the exists operator rather than in. Exists never returns unknown because it operates on rows rather than columns. (A row either exists or it doesn't; none of this null ambiguity at the row level!)

    select *
    from Common
    where not exists (select common_id from Table1 where common_id = Common.common_id)
      and not exists (select common_id from Table2 where common_id = Common.common_id)
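As a quick check, the sketch below loads the example values from this answer into an in-memory SQLite database (an assumed stand-in engine, driven through Python's sqlite3 module) and compares the not in form with the not exists form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Common (common_id INTEGER);
    CREATE TABLE Table1 (common_id INTEGER);
    CREATE TABLE Table2 (common_id INTEGER);
    INSERT INTO Common VALUES (1);
    INSERT INTO Table1 VALUES (2);
    INSERT INTO Table2 VALUES (3), (NULL);
""")

# The null in Table2 turns the second condition into unknown.
not_in_rows = conn.execute("""
    select common_id from Common
    where common_id not in (select common_id from Table1)
      and common_id not in (select common_id from Table2)
""").fetchall()

# exists only asks whether a matching row is there, so it is never unknown.
not_exists_rows = conn.execute("""
    select common_id from Common
    where not exists (select common_id from Table1
                      where common_id = Common.common_id)
      and not exists (select common_id from Table2
                      where common_id = Common.common_id)
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(1,)]
```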

夕颜

2020-11-29 15:47

If you want the world to be a two-valued boolean place, you must prevent the null (third value) case yourself.

Don't write IN clauses that allow nulls in the list side. Filter them out!

common_id not in ( select common_id from Table1 where common_id is not null )
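A minimal sketch of the effect of that filter, again using an in-memory SQLite database through Python's sqlite3 module as an assumed test engine. The table contents are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Common (common_id INTEGER);
    CREATE TABLE Table1 (common_id INTEGER);
    INSERT INTO Common VALUES (1);
    INSERT INTO Table1 VALUES (2), (NULL);  -- illustrative data
""")

# Without the filter, the NULL in the list makes the condition unknown.
unfiltered = conn.execute("""
    select common_id from Common
    where common_id not in (select common_id from Table1)
""").fetchall()

# With the filter, the list holds only real values and logic is two-valued.
filtered = conn.execute("""
    select common_id from Common
    where common_id not in (
        select common_id from Table1 where common_id is not null
    )
""").fetchall()

print(unfiltered)  # []
print(filtered)    # [(1,)]
```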

闹比i

2020-11-29 15:49

Just off the top of my head...

    select c.commonID, t1.commonID, t2.commonID
    from Common c
    left outer join Table1 t1 on t1.commonID = c.commonID
    left outer join Table2 t2 on t2.commonID = c.commonID
    where t1.commonID is null
      and t2.commonID is null

I ran a few tests and here were my results w.r.t. @patmortech's answer and @rexem's comments.

If either Table1 or Table2 is not indexed on commonID, you get a table scan but @patmortech's query is still twice as fast (for a 100K row master table).

If neither is indexed on commonID, you get two table scans and the difference is negligible.

If both are indexed on commonID, the "not exists" query runs in 1/3 the time.
