I am using the following query:
select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and Error not in ('Timeout','Connectio
This:
Error not in ('Timeout','Connection Error');
is semantically equivalent to:
Error <> 'Timeout' AND Error <> 'Connection Error'
The rules for NULL comparison apply to IN too. So if the value of Error is NULL, each comparison evaluates to NULL and the database can't make the expression true.
To fix, you could do this:
COALESCE(Error,'') not in ('Timeout','Connection Error');
Or better yet:
Error IS NULL OR Error not in ('Timeout','Connection Error');
Or, better still:
CASE WHEN Error IS NULL THEN 1
WHEN Error not in ('Timeout','Connection Error') THEN 1
END = 1
OR is not guaranteed to short-circuit, whereas CASE evaluates its WHEN conditions in order and stops at the first match, so it can short-circuit the check.
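If you want to compare the variants above without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption on my part: SQLite applies the same NULL rules to NOT IN). The table contents, including the 'Disk Full' value, are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Error TEXT NULL)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    # Hypothetical sample data: two excluded errors, one other error, one NULL.
    [("Timeout",), ("Connection Error",), ("Disk Full",), (None,)],
)

predicates = {
    "plain NOT IN": "Error NOT IN ('Timeout','Connection Error')",
    "COALESCE": "COALESCE(Error,'') NOT IN ('Timeout','Connection Error')",
    "IS NULL OR": "Error IS NULL OR Error NOT IN ('Timeout','Connection Error')",
    "CASE": "CASE WHEN Error IS NULL THEN 1 "
            "WHEN Error NOT IN ('Timeout','Connection Error') THEN 1 END = 1",
}

# Count the rows each predicate lets through.
counts = {
    name: conn.execute(f"SELECT count(*) FROM Table1 WHERE {pred}").fetchone()[0]
    for name, pred in predicates.items()
}
for name, n in counts.items():
    print(f"{name}: {n}")
```

The plain NOT IN counts only the 'Disk Full' row, while the three fixed variants also count the row where Error is NULL.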
Perhaps a concrete example can illustrate why a NOT IN expression applied to a NULL value returns nothing:
Given this data: http://www.sqlfiddle.com/#!2/0d5da/11
create table tbl
(
msg varchar(100) null,
description varchar(100) not null
);
insert into tbl values
('hi', 'greet'),
(null, 'nothing');
And you run this query:
select 'hulk' as x, msg, description
from tbl where msg not in ('bruce','banner');
That will output the 'hi' row only.
The NOT IN is translated as:
select 'hulk' as x, msg, description
from tbl where msg <> 'bruce' and msg <> 'banner';
NULL <> 'bruce'
can't be determined: it is neither true nor false.
NULL <> 'banner'
can't be determined either: it is neither true nor false.
So for the NULL value, the expression effectively resolves to:
can't be determined AND can't be determined
In fact, if your RDBMS supports boolean expressions in SELECT (e.g. MySQL, PostgreSQL), you can see why: http://www.sqlfiddle.com/#!2/d41d8/828
select null <> 'Bruce'
That returns null.
This returns null too:
select null <> 'Bruce' and null <> 'Banner'
Given that you are using NOT IN, which is basically an AND expression, you get:
NULL AND NULL
This results in NULL. So it's as if you were running this: http://www.sqlfiddle.com/#!2/0d5da/12
select * from tbl where null
Nothing will be returned.
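The same reasoning can be written out as a tiny three-valued logic sketch in Python, with None standing in for SQL NULL. The helper names (ne3, and3, not_in_pair) are my own and purely illustrative; this models the logic, it is not how any database implements it.

```python
def ne3(a, b):
    """SQL-style <>: the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a != b


def and3(a, b):
    """Three-valued AND: FALSE wins, otherwise unknown wins, otherwise TRUE."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def not_in_pair(msg):
    """Models: msg NOT IN ('bruce', 'banner') as two <> tests joined by AND."""
    return and3(ne3(msg, "bruce"), ne3(msg, "banner"))


for msg in ("hi", "bruce", None):
    print(repr(msg), "->", not_in_pair(msg))
```

Only a TRUE result passes a WHERE clause, so the row whose msg is NULL (result None) is dropped just like a row whose result is False.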
Because NULL is undefined, NULL does not equal NULL. You always have to handle NULL explicitly.
@Michael Buen's answer was the right answer for my case, but let me simplify why.
@Michael says in his post:
Error not in ('Timeout','Connection Error');
is semantically equivalent to:
Error <> 'Timeout' AND Error <> 'Connection Error'
The rules for NULL comparison apply to IN too. So if the value of Error is NULL, the database can't make the expression true.
And in [1] I found this sentence, which confirms his most important statement for understanding why IN fails with NULL. In the specifications ("specs") in [1] you will find: "If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> equality comparison operator."
So yes, the thing is that, sadly, MySQL gets lost in such a case. I think MySQL's designers shouldn't have done this, because when I compare 2 to NULL, MySQL SHOULD be able to see that they are DIFFERENT instead of returning misleading results. For example, I ran:
select id from TABLE where id not in (COLUMN WITH NULLS);
and it returned an EMPTY result. BUT if I run
select id from TABLE where id not in (COLUMN WITH OUT NULLS);
it shows the right result. So when using the IN operator, you must filter out the NULLs. This is not the behavior I want as a user, but it's documented in the specifications in [1]. I think that languages and technology should be simpler, in the sense that you should be able to DEDUCE the behavior without having to read the specs. And truly, 2 is DIFFERENT from NULL. I should be the one in charge of handling mistakes at a higher level of abstraction, but MySQL SHOULD return FALSE when comparing NULL with a specific value.
References for the specs: [1] http://dev.mysql.com/doc/refman/5.6/en/type-conversion.html
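To make the subquery case concrete, here is a small sketch, again using Python's sqlite3 as a stand-in for MySQL (an assumption). The tables items and excluded and the column item_id are hypothetical names I made up. It shows the empty result, the "filter out the NULLs" fix described above, and NOT EXISTS as another common way to avoid the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER);          -- hypothetical outer table
    CREATE TABLE excluded (item_id INTEGER);  -- hypothetical column that may hold NULLs
    INSERT INTO items VALUES (1), (2), (3);
    INSERT INTO excluded VALUES (1), (NULL);
""")


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# NOT IN against a column that contains a NULL: no row can evaluate to TRUE.
naive = ids(
    "SELECT id FROM items WHERE id NOT IN (SELECT item_id FROM excluded) ORDER BY id"
)

# Fix 1: filter the NULLs out of the subquery.
filtered = ids(
    "SELECT id FROM items WHERE id NOT IN "
    "(SELECT item_id FROM excluded WHERE item_id IS NOT NULL) ORDER BY id"
)

# Fix 2: NOT EXISTS only asks whether a matching row exists, so NULLs never match.
not_exists = ids(
    "SELECT id FROM items i WHERE NOT EXISTS "
    "(SELECT 1 FROM excluded e WHERE e.item_id = i.id) ORDER BY id"
)

print(naive, filtered, not_exists)
```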
IN returns NULL if the expression on the left-hand side is NULL. In order to get the NULL values, you have to do:
select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and (Error not in ('Timeout','Connection Error') or Error is null);
Sorry for posting twice in the same forum, but I want to illustrate another example:
I agree with @Wagner Bianchi in [2] in this forum when he says: << It’s very trick when dealing with data and subqueries>>
Moreover, this should NOT be the behavior. I think MySQL's designers were mistaken when they made the decision documented in [1]. The design should be different. Let me explain. You know that when evaluating
select (2) not in (1, 4, 3);
you will get:
+----------------------+
| (2) not in (1, 4, 3) |
+----------------------+
|                    1 |
+----------------------+
1 row in set (0.00 sec)
BUT if the list contains at least one NULL, then:
select (2) not in (1, NULL, 3);
returns:
+-------------------------+
| (2) not in (1, NULL, 3) |
+-------------------------+
|                    NULL |
+-------------------------+
1 row in set (0.00 sec)
This is pretty absurd.
We are not the first to be confused by this. See [2].
References:
[1] http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html#function_in
[2] http://blog.9minutesnooze.com/sql-not-in-subquery-null/comment-page-1/#comment-86954
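For comparison, the sketch below evaluates a few scalar expressions with Python's sqlite3 (used here as a stand-in for MySQL, which is an assumption). It writes the NOT IN out as a chain of <> tests and also shows a NULL-safe variant. SQLite spells the NULL-safe comparison IS NOT; in MySQL the counterpart would be built on the <=> operator quoted in an earlier answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(expr):
    """Evaluate a single SQL expression; SQL NULL comes back as None."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]


results = {
    "no NULL in list": scalar("2 NOT IN (1, 4, 3)"),
    "NULL in list, no match": scalar("2 NOT IN (1, NULL, 3)"),
    "NULL in list, match": scalar("1 NOT IN (1, NULL, 3)"),
    # NOT IN written out by hand as <> tests joined by AND.
    "expanded": scalar("2 <> 1 AND 2 <> NULL AND 2 <> 3"),
    # NULL-safe comparison (SQLite syntax): NULL is treated as an ordinary value.
    "null-safe": scalar("2 IS NOT 1 AND 2 IS NOT NULL AND 2 IS NOT 3"),
}
for label, value in results.items():
    print(f"{label}: {value!r}")
```

Note that a definite match still gives a definite answer: 1 NOT IN (1, NULL, 3) is false, not NULL. The result is NULL only when it depends on the unknown value.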
IN returns a trivalent BOOLEAN (which accepts NULL as a value). NOT IN returns the trivalent negation of IN, and the negation of NULL is NULL.
Imagine we have a table with all numbers from 1 to 1,000,000 in id and this query:
SELECT *
FROM mytable
WHERE id IN (1, 2, NULL)
or its equivalent:
SELECT *
FROM mytable
WHERE id = ANY
(
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT NULL
)
The predicate returns TRUE for 1 and 2 and NULL for all other values, so 1 and 2 are returned.
In its opposite:
SELECT *
FROM mytable
WHERE id NOT IN (1, 2, NULL)
or its equivalent:
SELECT *
FROM mytable
WHERE id <> ALL
(
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT NULL
)
the predicate returns FALSE for 1 and 2 and NULL for all other values, so nothing is returned.
Note that boolean negation changes not only the operator (= to <>), but the quantifier too (ANY to ALL).
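The ANY/ALL relationship can be modeled in a few lines of Python, with None standing in for NULL. This is only an illustrative sketch of the trivalent logic; the function names (eq3, not3, any3, all3, in3, not_in3) are my own and do not correspond to anything in MySQL.

```python
def eq3(a, b):
    """Trivalent =: unknown (None) if either operand is NULL."""
    return None if a is None or b is None else a == b


def not3(v):
    """Trivalent NOT: the negation of unknown is unknown."""
    return None if v is None else not v


def any3(values):
    """TRUE if any value is TRUE, else unknown if any is unknown, else FALSE."""
    values = list(values)
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


def all3(values):
    """FALSE if any value is FALSE, else unknown if any is unknown, else TRUE."""
    values = list(values)
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True


def in3(x, items):
    """Models: x = ANY (items)."""
    return any3(eq3(x, i) for i in items)


def not_in3(x, items):
    """Models: x <> ALL (items); both the operator and the quantifier flip."""
    return all3(not3(eq3(x, i)) for i in items)


items = [1, 2, None]
for x in (1, 2, 3):
    print(x, in3(x, items), not_in3(x, items))
```

For every x, not_in3 gives the same result as not3 applied to in3, which is the negation rule described above.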