Why doesn't SQL support “= null” instead of “is null”?

不知归路 2020-11-29 11:11

I'm not asking if it does. I know that it doesn't.

I'm curious as to the reason. I've read support docs such as this one on Working With Nu

12 Answers
  • 2020-11-29 11:22

    It's a language semantic.

    Null is the lack of a value.

    is null makes sense to me. It says, "is lacking a value" or "is unknown". Personally I've never asked somebody if something is, "equal to lacking a value".

  • 2020-11-29 11:24

    The concept is that NULL is not an equatable value. It denotes the absence of a value.

    Therefore, a variable or a column can only be checked if it IS NULL, but not if it IS EQUAL TO NULL.

    Once you open up arithmetic comparisons, you may have to contend with IS GREATER THAN NULL, or IS LESS THAN OR EQUAL TO NULL.

  • 2020-11-29 11:24

    In addition to all that has already been said, I wish to stress that what you write in your first line is wrong. SQL does support the “= NULL” syntax, but it has a different semantic than “IS NULL” – as can be seen in the very piece of documentation you linked to.
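A minimal sketch of that difference, assuming SQLite driven through Python's built-in sqlite3 module. The table `t` and column `x` are made up for illustration, and other engines may differ in details:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (None,), (3,)])

# "= NULL" is accepted by the parser, but the comparison evaluates to
# unknown for every row, so no row passes the WHERE filter.
eq_null_count = conn.execute(
    "SELECT COUNT(*) FROM t WHERE x = NULL"
).fetchone()[0]

# "IS NULL" is a dedicated predicate that is true exactly for the NULL row.
is_null_count = conn.execute(
    "SELECT COUNT(*) FROM t WHERE x IS NULL"
).fetchone()[0]

print(eq_null_count, is_null_count)  # prints: 0 1
```

Both statements are valid SQL here; they simply ask different questions.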

  • 2020-11-29 11:26

    a. Null is not the "lack of a value"

    b. Null is not "empty"

    c. Null is not an "unset value"

    It's all of the above and none of the above.

    By technical rights, NULL is an "unknown value". However, like uninitialized pointers in C/C++, you don't really know what you're pointing at. With databases, they allocate the space but do not initialize the value in that space.

    So, it is an "empty" space in the sense that it's not initialized. If you set a value to NULL, the original value stays in that storage location. If it was originally an empty string (for example), it will remain that.

    It's a "lack of a value" in the fact that it hasn't been set to what the database deems a valid value.

    It's an "unset value" in that if the space was just allocated, the value that is there has never been set.

    "Unknown" is the closest that we can truly come to knowing what to expect when we examine a NULL.


    Because of that, if we try to compare this "unknown" value, we will get a comparison that

    a) may or may not be valid

    b) may or may not have the result we expect

    c) may or may not crash the database.

    So, the DBMS vendors (long ago) decided that it doesn't even make sense to use equality when it comes to NULL.

    Therefore, "= null" makes no sense.

  • 2020-11-29 11:27

    I agree with the OP that

    where column_name = null
    

    should be syntactic sugar for

    where column_name is null
    

    However, I do understand why the creators of SQL wanted to make the distinction. In three-valued logic (IMO this is a misnomer), a predicate can return two values (true or false) OR unknown which is technically not a value but just a way to say "we don't know which of the two values this is". Think about the following predicate in terms of three-valued logic:

    A == B
    

    This predicate tests whether A is equal to B. Here's what the truth table looks like:

        T U F
        -----
    T | T U F
    U | U U U
    F | F U T
    

    If either A or B is unknown, the predicate itself always returns unknown, regardless of whether the other one is true or false or unknown.

    In SQL, null is a synonym for unknown. So, the SQL predicate

    column_name = null
    

    tests whether the value of column_name is equal to something whose value is unknown, and returns unknown regardless of whether column_name is true or false or unknown or anything else, just like in three-valued logic above. SQL DML operations are restricted to operating on rows for which the predicate in the where clause returns true, ignoring rows for which the predicate returns false or unknown. That's why "where column_name = null" doesn't operate on any rows.
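The truth table and the where-clause rule above can be sketched in a few lines of Python. This is only an illustration: `None` stands in for unknown, and the helper names `eq3` and `where_keeps` are made up here, not part of any SQL engine.

```python
from typing import Optional


def eq3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued equality: None plays the role of 'unknown'."""
    if a is None or b is None:
        return None  # anything compared with unknown is unknown
    return a == b


def where_keeps(predicate: Optional[bool]) -> bool:
    """A where clause keeps a row only if the predicate is true."""
    return predicate is True


values = [True, None, False]  # T, U, F in the table above
table = [[eq3(a, b) for b in values] for a in values]

for row in table:
    print(row)
# [True, None, False]
# [None, None, None]
# [False, None, True]

# "column_name = null" is unknown for every possible column value,
# so the where clause never keeps a row.
kept = [where_keeps(eq3(v, None)) for v in values]
print(kept)  # [False, False, False]
```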

  • 2020-11-29 11:28

    I can't help but feel that you're still not satisfied with the answers that have been given so far, so I thought I'd try another tack. Let's have an example (no, I've no idea why this specific example has come into my head).

    We have a table for employees, EMP:

    EMP
    ---
    EMPNO           GIVENNAME
    E0001           Boris
    E0002           Chris
    E0003           Dave
    E0004           Steve
    E0005           Tony
    

    And, for whatever bizarre reason, we're tracking what colour trousers each employee chooses to wear on a particular day (TROUS):

    TROUS
    -----
    EMPNO       DATE        COLOUR
    E0001       20110806    Brown
    E0002       20110806    Blue
    E0003       20110806    Black
    E0004       20110806    Brown
    E0005       20110806    Black
    E0001       20110807    Black
    E0003       20110807    Black
    E0004       20110807    Grey
    

    I could go on. We write a query, where we want to know the name of every employee, and what colour trousers they had on on the 7th August:

    SELECT e.GIVENNAME,t.COLOUR
    FROM
        EMP e
            LEFT JOIN
        TROUS t
            ON
                 e.EMPNO = t.EMPNO and
                 t.DATE = '20110807'
    

    And we get the result set:

    GIVENNAME       COLOUR
    Chris           NULL
    Steve           Grey
    Dave            Black
    Boris           Black
    Tony            NULL
    

    Now, this result set could be in a view, or CTE, or whatever, and we might want to continue asking questions about these results, using SQL. What might some of these questions be?

    1. Were Dave and Boris wearing the same colour trousers on that day? (Yes, Black==Black)

    2. Were Dave and Steve wearing the same colour trousers on that day? (No, Black!=Grey)

    3. Were Boris and Tony wearing the same colour trousers on that day? (Unknown - we're trying to compare with NULL, and we're following the SQL rules)

    4. Were Boris and Tony not wearing the same colour trousers on that day? (Unknown - we're again comparing to NULL, and we're following SQL rules)

    5. Were Chris and Tony wearing the same colour trousers on that day? (Unknown)
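The five questions above can be checked with a small script, assuming SQLite via Python's sqlite3 module. The view name `WORN` and the helper `compare` are invented for this sketch; SQLite reports true as 1, false as 0 and unknown as NULL (`None` in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (EMPNO TEXT PRIMARY KEY, GIVENNAME TEXT);
CREATE TABLE TROUS (EMPNO TEXT, DATE TEXT, COLOUR TEXT);
INSERT INTO EMP VALUES
  ('E0001','Boris'), ('E0002','Chris'), ('E0003','Dave'),
  ('E0004','Steve'), ('E0005','Tony');
INSERT INTO TROUS VALUES
  ('E0001','20110806','Brown'), ('E0002','20110806','Blue'),
  ('E0003','20110806','Black'), ('E0004','20110806','Brown'),
  ('E0005','20110806','Black'), ('E0001','20110807','Black'),
  ('E0003','20110807','Black'), ('E0004','20110807','Grey');
-- Hypothetical view wrapping the LEFT JOIN query from the answer.
CREATE VIEW WORN AS
  SELECT e.GIVENNAME, t.COLOUR
  FROM EMP e
  LEFT JOIN TROUS t ON e.EMPNO = t.EMPNO AND t.DATE = '20110807';
""")


def compare(a, b, op="="):
    """Compare the trouser colours of two employees with = or <>."""
    assert op in ("=", "<>")  # only these two operators are interpolated
    sql = (
        f"SELECT (SELECT COLOUR FROM WORN WHERE GIVENNAME = ?) {op} "
        "(SELECT COLOUR FROM WORN WHERE GIVENNAME = ?)"
    )
    return conn.execute(sql, (a, b)).fetchone()[0]


print(compare("Dave", "Boris"))        # 1    (question 1: yes)
print(compare("Dave", "Steve"))        # 0    (question 2: no)
print(compare("Boris", "Tony"))        # None (question 3: unknown)
print(compare("Boris", "Tony", "<>"))  # None (question 4: unknown)
print(compare("Chris", "Tony"))        # None (question 5: unknown)
```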

    Note that you're already aware of specific mechanisms (e.g. IS NULL) to force the outcomes you want, if you've designed your database to never use NULL as a marker for missing information.

    But in SQL, NULL has been given two roles (at least) - to mark inapplicable information (maybe we have complete information in the database, and Chris and Tony didn't turn up for work that day, or did but weren't wearing trousers), and to mark missing information (Chris did turn up that day, we just don't have the information recorded in the database at this time)

    If you're using NULL purely as a marker of inapplicable information, I assume you're avoiding such constructs as outer joins.


    I find it interesting that you've brought up NaN in comments to other answers, without seeing that NaN and (SQL) NULL have a lot in common. The biggest difference between them is that NULL is intended for use across the system, no matter what data type is involved.
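A rough side-by-side of the two, assuming Python floats for NaN and SQLite (through the sqlite3 module) for NULL. The variable names are just for this sketch:

```python
import math
import sqlite3

nan = float("nan")

# NaN never compares equal to anything, itself included, and there is
# a dedicated test for it, much like IS NULL.
nan_equals_itself = nan == nan        # False
nan_differs_from_itself = nan != nan  # True
nan_detected = math.isnan(nan)        # True

# SQL NULL goes one step further: both = and <> yield unknown (None).
conn = sqlite3.connect(":memory:")
null_eq, null_ne = conn.execute("SELECT NULL = NULL, NULL <> NULL").fetchone()

print(nan_equals_itself, nan_differs_from_itself, nan_detected)
print(null_eq, null_ne)
```

One visible difference: NaN comparisons still produce a definite true or false, while NULL comparisons produce a third outcome.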

    Your biggest issue seems to be that you've decided that NULL has a single meaning across all programming languages, and you seem to feel that SQL has broken that meaning. In fact, null in different languages frequently has subtly different meanings. In some languages, it's a synonym for 0. In others, it is not, so the comparison 0==null will succeed in some and fail in others. You mentioned VB, but VB (assuming you're talking about the .NET versions) does not have null. It has Nothing, which again is subtly different (it's the equivalent in most respects of the C# construct default(T)).
