Why do I need “OR NULL” in MySQL when counting rows with a condition?

小蘑菇 2020-11-30 14:08

There is a question about MySQL's COUNT() aggregate function that keeps popping into my head from time to time. I would like an explanation of why it works the way it does.

5 Answers
  • 2020-11-30 14:47

    It's because COUNT(expression) counts VALUES. In SQL theory, NULL is a STATE, not a VALUE, and thus it is not counted. NULL is a state that means the field's value is unknown.

    Now, when you write "value=4", this evaluates to boolean TRUE or FALSE for each row. Since both TRUE and FALSE are VALUES, every row is counted and the result is 10.

    When you add "OR NULL", you actually have "TRUE OR NULL" and "FALSE OR NULL". Now, "TRUE OR NULL" evaluates to TRUE, while "FALSE OR NULL" evaluates to NULL. Thus the result is 3, because you only have 3 values (and seven NULL states).
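The difference can be reproduced with a small sketch. It uses SQLite through Python's `sqlite3` module as a stand-in for MySQL (an assumption; both treat comparisons as 1/0/NULL and apply the same three-valued `OR`), and the ten sample values are hypothetical, chosen only so that three rows hold a 4 as in the counts quoted above.

```python
import sqlite3

# Hypothetical sample data: 10 rows, three of which hold the value 4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (value INTEGER)")
conn.executemany(
    "INSERT INTO test (value) VALUES (?)",
    [(v,) for v in [1, 2, 3, 4, 4, 4, 5, 6, 7, 8]],
)

# COUNT(value = 4) sees 1 or 0 for every row, so every row is counted.
# COUNT(value = 4 OR NULL) sees 1 for matches and NULL otherwise.
plain, with_or_null = conn.execute(
    "SELECT COUNT(value = 4), COUNT(value = 4 OR NULL) FROM test"
).fetchone()

print(plain)         # 10
print(with_or_null)  # 3
```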

  • 2020-11-30 14:56

    I would suggest the more standard syntax: it ports better between different database engines and will always give the correct result.

     select count(*)
     from test
     where value = 4
    

    Is the syntax you used a MySQL variant?

  • 2020-11-30 14:58

    COUNT(expression) counts the number of rows for which the expression is not NULL. The expression value=4 is only NULL if value is NULL, otherwise it is either TRUE (1) or FALSE (0), both of which are counted.

    1 = 4         | FALSE
    4 = 4         | TRUE
    1 = 4 OR NULL | NULL
    4 = 4 OR NULL | TRUE
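The table above leaves out the case where `value` itself is NULL. As a rough check, the sketch below evaluates all of the combinations in SQLite via Python's `sqlite3` module (an assumption standing in for MySQL; the three-valued logic involved is the same). Python shows SQL NULL as `None`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each comparison yields 1 (TRUE), 0 (FALSE) or NULL (unknown).
row = conn.execute(
    """
    SELECT 1 = 4,            -- FALSE
           4 = 4,            -- TRUE
           NULL = 4,         -- NULL: comparing with NULL is unknown
           1 = 4 OR NULL,    -- FALSE OR NULL -> NULL
           4 = 4 OR NULL,    -- TRUE OR NULL  -> TRUE
           NULL = 4 OR NULL  -- NULL OR NULL  -> NULL
    """
).fetchone()

print(row)  # (0, 1, None, None, 1, None)
```

Only the entries that are not `None` would be counted by COUNT(expression).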
    

    You could use SUM instead:

    SELECT SUM(value=4) FROM test
    

    This is not particularly useful in your specific example, but it can be useful if you want to count rows satisfying several different predicates in a single table scan, as in the following query:

    SELECT
        SUM(a>b) AS foo,
        SUM(b>c) AS bar,
        COUNT(*) AS total_rows
    FROM test
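One difference from COUNT is worth keeping in mind when using SUM this way: over zero rows SUM returns NULL, while COUNT returns 0. The sketch below illustrates this with SQLite via Python's `sqlite3` module (an assumption standing in for MySQL); the columns `a`, `b`, `c` come from the query above, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER, c INTEGER)")

query = """
    SELECT SUM(a > b) AS foo,
           SUM(b > c) AS bar,
           COUNT(*)   AS total_rows
    FROM test
"""

# On an empty table SUM has nothing to add up and yields NULL (None).
empty_result = conn.execute(query).fetchone()
print(empty_result)  # (None, None, 0)

# Hypothetical rows; the last one has c = NULL, so b > c is NULL there
# and SUM simply skips it.
conn.executemany(
    "INSERT INTO test (a, b, c) VALUES (?, ?, ?)",
    [(3, 2, 1), (1, 2, 3), (2, 1, 5), (4, 4, None)],
)
filled_result = conn.execute(query).fetchone()
print(filled_result)  # (2, 1, 4)
```

If a zero is wanted instead of NULL for the empty case, wrapping each sum as `COALESCE(SUM(a > b), 0)` is one option.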
    
  • 2020-11-30 15:05

    This should reveal all

    SELECT 4=4, 3=4, 1 or null, 0 or null
    

    Output

    1   |   0   |   1   |   NULL
    

    Facts

    1. COUNT counts the columns / expressions that evaluate to NOT NULL. Anything increments the count by 1, as long as it is not NULL. The exception is COUNT(DISTINCT ...), which increments only if the value has not already been counted.

    2. When a BOOLEAN expression is used on its own, it returns either 1 or 0.

    3. When a boolean is OR-ed with NULL, the result is NULL only when the boolean is 0 (false); when it is 1 (true), the result is 1.

    To others

    Yes, if the count is the ONLY column desired, one could use WHERE value=4. But if the query needs to count the 4's while also retrieving other counts/aggregates, the filter doesn't work. An alternative would be SUM(value=4), e.g.

    SELECT sum(value=4)
      FROM test
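For engines that do not treat a comparison as 1/0, a CASE expression inside COUNT is a commonly used standard-SQL way to get the same conditional count next to other aggregates. This form is not taken from the answers here; it is shown only as a sketch, run in SQLite via Python's `sqlite3` module (an assumption standing in for MySQL) on hypothetical data with three 4's in ten rows.

```python
import sqlite3

# Hypothetical sample data: 10 rows, three of which hold the value 4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (value INTEGER)")
conn.executemany(
    "INSERT INTO test (value) VALUES (?)",
    [(v,) for v in [1, 2, 3, 4, 4, 4, 5, 6, 7, 8]],
)

# CASE without ELSE yields NULL for non-matching rows, which COUNT skips.
fours, total_rows, largest = conn.execute(
    """
    SELECT COUNT(CASE WHEN value = 4 THEN 1 END) AS fours,
           COUNT(*)                              AS total_rows,
           MAX(value)                            AS largest
    FROM test
    """
).fetchone()

print(fours, total_rows, largest)  # 3 10 8
```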
    
  • 2020-11-30 15:05

    The COUNT() function accepts an argument that is treated as either NULL or NOT NULL. If it is NOT NULL, COUNT() increments its result; otherwise it does nothing.

    In your case the expression value=4 is either TRUE or FALSE. Both TRUE and FALSE are not NULL, which is why you get 10.

    "but I am interested in a COUNT(condition) based solution."

    The COUNT-based solution will generally be slower (often much slower) than a WHERE filter, because it forces a full table scan and compares every value one by one, whereas WHERE value = 4 can use an index on value if one exists.
