Why is the MySQL IN keyword not considering NULL values?

Submitted by 一曲冷凌霜 on 2019-12-28 06:22:29

Question


I am using the following query:

select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and Error not in ('Timeout','Connection Error');

Surprisingly, this statement does not include the rows whose Error value is NULL. My intention is to filter out only the rows with an Error value of 'Timeout' or 'Connection Error'. I need to add an extra condition (OR Error IS NULL) to retrieve the correct result.

Why is MySQL filtering out rows with NULL values? I thought the IN keyword would return a boolean result (1/0), and now I understand that some MySQL keywords don't return boolean values and might return NULL too. But why is NULL treated specially?


Answer 1:


This:

Error not in ('Timeout','Connection Error');

is semantically equivalent to:

Error <> 'Timeout' AND Error <> 'Connection Error'

The rules for NULL comparison apply to IN too. So if the value of Error is NULL, neither comparison can evaluate to true, and the database can't make the expression true.

To fix, you could do this:

COALESCE(Error,'') not in ('Timeout','Connection Error');

Or better yet:

Error IS NULL OR Error not in ('Timeout','Connection Error');

Or, better still:

 CASE WHEN Error IS NULL THEN 1
      WHEN Error not in ('Timeout','Connection Error') THEN 1
      ELSE 0
 END = 1

SQL does not guarantee that OR short-circuits, whereas CASE evaluates its WHEN branches in order and stops at the first match.
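The difference between the plain NOT IN and the two simpler fixes can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; SQLite applies the same three-valued logic to NOT IN), and the rows in Table1 are made up for illustration.

```python
import sqlite3

# In-memory database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Error TEXT)")
conn.executemany(
    "INSERT INTO Table1 (Error) VALUES (?)",
    [("Timeout",), ("Connection Error",), ("Disk Full",), (None,)],
)


def count_where(predicate):
    """Count the rows of Table1 that satisfy the given WHERE predicate."""
    sql = f"SELECT COUNT(*) FROM Table1 WHERE {predicate}"
    return conn.execute(sql).fetchone()[0]


# Plain NOT IN: the NULL row evaluates to NULL and is dropped.
plain = count_where("Error NOT IN ('Timeout', 'Connection Error')")
# COALESCE turns NULL into '' before the comparison.
with_coalesce = count_where(
    "COALESCE(Error, '') NOT IN ('Timeout', 'Connection Error')"
)
# Explicit IS NULL check keeps the NULL row.
with_is_null = count_where(
    "Error IS NULL OR Error NOT IN ('Timeout', 'Connection Error')"
)

print(plain, with_coalesce, with_is_null)  # prints: 1 2 2
```

Only the 'Disk Full' row survives the plain NOT IN; both fixes also keep the NULL row.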


Perhaps a concrete example can illustrate why a NOT IN expression applied to a NULL value returns nothing:

Given this data: http://www.sqlfiddle.com/#!2/0d5da/11

create table tbl
(
  msg varchar(100) null,
  description varchar(100) not null
  );


insert into tbl values
('hi', 'greet'),
(null, 'nothing');

And you run this query:

select 'hulk' as x, msg, description 
from tbl where msg not in ('bruce','banner');

That will output 'hi' only.

The NOT IN is translated as:

select 'hulk' as x, msg, description 
from tbl where msg <> 'bruce' and msg <> 'banner';

NULL <> 'bruce' can't be determined: it is neither true nor false.

NULL <> 'banner' can't be determined: it is neither true nor false.

So for the NULL row, the expression effectively resolves to:

can't be determined AND can't be determined

In fact, if your RDBMS supports boolean expressions in SELECT (e.g. MySQL, PostgreSQL), you can see why: http://www.sqlfiddle.com/#!2/d41d8/828

select null <> 'Bruce' 

That returns null.

This returns null too:

select null <> 'Bruce' and null <> 'Banner'

Since you are using NOT IN, which is basically an AND expression, you end up with:

NULL AND NULL

which evaluates to NULL. So it's as if you were running: http://www.sqlfiddle.com/#!2/0d5da/12

select * from tbl where null

Nothing will be returned.
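The same scalar expressions can be evaluated from a script. This sketch assumes Python's built-in sqlite3 module as a stand-in for MySQL; the helper name evaluate is hypothetical, and SQL NULL is returned to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expression):
    """Evaluate a scalar SQL expression; SQL NULL comes back as None."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


# A single comparison against NULL is unknown.
single = evaluate("NULL <> 'Bruce'")
# Unknown AND unknown is still unknown.
both = evaluate("NULL <> 'Bruce' AND NULL <> 'Banner'")
# NOT IN on a NULL value expands to the AND above.
not_in = evaluate("NULL NOT IN ('bruce', 'banner')")
# A non-NULL value that is absent from the list gives a definite true (1).
known = evaluate("'hi' NOT IN ('bruce', 'banner')")

print(single, both, not_in, known)  # prints: None None None 1
```

A WHERE clause keeps a row only when its predicate is true, so the rows whose predicate is NULL are dropped along with the false ones.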




Answer 2:


Because NULL means "unknown", comparing anything to NULL (even NULL itself) yields NULL rather than true. You always have to handle NULL explicitly.




Answer 3:


IN returns NULL if the expression on the left hand side is NULL. In order to get the NULL values, you have to do:

select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and (Error not in ('Timeout','Connection Error') or Error is null);



Answer 4:


IN returns a trivalent BOOLEAN (which accepts NULL as a value). NOT IN returns the trivalent negation of IN, and negation of NULL is a NULL.

Imagine we have a table with all numbers from 1 to 1,000,000 in id and this query:

SELECT  *
FROM    mytable
WHERE   id IN (1, 2, NULL)

or its equivalent:

SELECT  *
FROM    mytable
WHERE   id = ANY
             (
             SELECT  1
             UNION ALL
             SELECT  2
             UNION ALL
             SELECT  NULL
             )

The predicate returns TRUE for 1 and 2 and NULL for all other values, so 1 and 2 are returned.

In its opposite:

SELECT  *
FROM    mytable
WHERE   id NOT IN (1, 2, NULL)

, or

SELECT  *
FROM    mytable
WHERE   id <> ALL
             (
             SELECT  1
             UNION ALL
             SELECT  2
             UNION ALL
             SELECT  NULL
             )

, the predicate returns FALSE for 1 and 2 and NULL for all other values, so nothing is returned.

Note that boolean negation not only changes the operator (= to <>), but the quantifier too (ANY to ALL).
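The ANY/ALL reading can be modelled in a few lines of Python, with None standing in for NULL. This is a sketch of three-valued logic in general, not of how MySQL implements it; all function names here are hypothetical.

```python
def eq3(a, b):
    """Three-valued equality: unknown (None) if either side is None."""
    if a is None or b is None:
        return None
    return a == b


def not3(x):
    """Three-valued negation: the negation of unknown is unknown."""
    return None if x is None else (not x)


def any3(values):
    """Three-valued OR: True if any True, else None if any None, else False."""
    values = list(values)
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


def all3(values):
    """Three-valued AND: False if any False, else None if any None, else True."""
    values = list(values)
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True


def in3(x, items):
    """x IN (items), read as x = ANY (items)."""
    return any3(eq3(x, item) for item in items)


def not_in3(x, items):
    """x NOT IN (items), read as x <> ALL (items)."""
    return all3(not3(eq3(x, item)) for item in items)


items = [1, 2, None]
results = {x: (in3(x, items), not_in3(x, items)) for x in (1, 2, 3)}
print(results)  # prints: {1: (True, False), 2: (True, False), 3: (None, None)}
```

For 3, both IN and NOT IN come out unknown, which is why neither query returns that row when the list contains NULL.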




Answer 5:


@Michael Buen's answer was the right answer for my case, but let me simplify why.

@Michael says in his post:


Error not in ('Timeout','Connection Error');

is semantically equivalent to:

Error <> 'Timeout' AND Error <> 'Connection Error'

The rules for NULL comparison apply to IN too. So if the value of Error is NULL, the database can't make the expression true.

And in [1] I found a sentence that confirms his most important statement for understanding why IN fails with NULL. In the specifications ("specs") in [1] you will find: "If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> equality comparison operator."

So yes, the thing is that, sadly, MySQL gets lost in such a case. I think the MySQL designers shouldn't have done this, because when I compare 2 to NULL, MySQL SHOULD be able to see they are DIFFERENT instead of simply returning mistaken results. For example, I did:

select id from TABLE where id not in (COLUMN WITH NULLS);

and it returns an EMPTY result. BUT if I do

select id from TABLE where id not in (COLUMN WITH OUT NULLS);

it shows the right result. So when using the IN operator, you must filter out the NULLS. This is not a desired behavior for me as a user, but it's documented in the specifications in [1]. I think that languages and technology should be simpler, in the sense that you should be able to DEDUCE without the need of reading the specs. And truly, 2 is DIFFERENT from NULL, I should be the one in charge of controlling and taking care of mistakes of a higher level of abstraction, but MySQL SHOULD throw a FALSE result when comparing NULL with a specific value.
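The subquery case described above can be reproduced with a short script. This sketch assumes Python's built-in sqlite3 module as a stand-in for MySQL; the tables items and excluded and the column ref are hypothetical names chosen for illustration. It also shows two common ways to avoid the problem: filtering NULLs out of the subquery, or using NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER);
    CREATE TABLE excluded (ref INTEGER);
    INSERT INTO items VALUES (1), (2), (3);
    INSERT INTO excluded VALUES (1), (NULL);
""")


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# The subquery yields a NULL, so NOT IN is never true for any id.
naive = ids(
    "SELECT id FROM items "
    "WHERE id NOT IN (SELECT ref FROM excluded) ORDER BY id"
)
# Filtering the NULLs out of the subquery restores the expected rows.
filtered = ids(
    "SELECT id FROM items "
    "WHERE id NOT IN (SELECT ref FROM excluded WHERE ref IS NOT NULL) "
    "ORDER BY id"
)
# NOT EXISTS only asks whether a matching row exists, so NULLs are harmless.
not_exists = ids(
    "SELECT id FROM items i "
    "WHERE NOT EXISTS (SELECT 1 FROM excluded e WHERE e.ref = i.id) "
    "ORDER BY id"
)

print(naive, filtered, not_exists)  # prints: [] [2, 3] [2, 3]
```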

References for the specs: [1] http://dev.mysql.com/doc/refman/5.6/en/type-conversion.html




Answer 6:


Sorry for posting twice in the same forum, but I want to illustrate another example:

I agree with @Wagner Bianchi in [2] in this forum when he says: << It’s very trick when dealing with data and subqueries>>

Moreover, this should NOT be the behavior; I think MySQL's designers were mistaken when they made the decision documented in [1]. The design should be different. Let me explain. You know that when evaluating

select (2) not in (1, 4, 3);
    you will get:
        +----------------------+
        | (2) not in (1, 4, 3) |
        +----------------------+
        |                    1 |
        +----------------------+
        1 row in set (0.00 sec)

BUT if in the list you have at least one NULL then:

select (2) not in (1, NULL, 3);
    returns:
        +-------------------------+
        | (2) not in (1, NULL, 3) |
        +-------------------------+
        |                    NULL |
        +-------------------------+
        1 row in set (0.00 sec)
    This is pretty absurd.

We are not the first to be confused by this. See [2].

References:

[1] http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html#function_in

[2] http://blog.9minutesnooze.com/sql-not-in-subquery-null/comment-page-1/#comment-86954



Source: https://stackoverflow.com/questions/10810391/why-mysql-in-keyword-not-considering-null-values
