I appreciate the semantic meaning of a NULL value in a database table, different from both false and the empty string ''. However, I have often read about performance pro
The empty string should not be used in place of NULL. NULL represents nothing, whereas the empty string is something, with nothing inside. A comparison between NULL and another value (even NULL) never evaluates to true, and NULL values in a column are skipped when the COUNT function is applied to that column.
If you need to represent unknown information, there is no substitute for NULL.
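To make the comparison and COUNT behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. The people table and its columns are made up for illustration, and other database engines may display the results differently.

```python
import sqlite3

# Hypothetical table, used only to illustrate the behaviour described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, middle_initial TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "J"), ("Bob", ""), ("Cy", None)],  # None is stored as NULL
)

# NULL = NULL evaluates to NULL (unknown), not true; sqlite3 returns it as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# '' = '' is an ordinary comparison and evaluates to true (1 in SQLite).
empty_equals_empty = conn.execute("SELECT '' = ''").fetchone()[0]

# COUNT(*) counts rows; COUNT(column) skips rows where that column is NULL.
total_rows, non_null_initials = conn.execute(
    "SELECT COUNT(*), COUNT(middle_initial) FROM people"
).fetchone()

print(null_equals_null, empty_equals_empty, total_rows, non_null_initials)
# prints: None 1 3 2
```

Note that the empty string for Bob is counted by COUNT(middle_initial), while the NULL for Cy is not.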
As @ForYourOwnGood said, NULL should be used for "unknown" information. For example, suppose a customer has many fields to fill in on registration and some of them are optional. For some reason you might want to reserve an ID for that particular customer before the form is submitted. At that point you don't know whether the optional fields were deliberately left empty, so you should set them to NULL, i.e. "unknown", when you first save the row. Once the customer submits the form, it passes all your validation, and you save the information, you know for a fact that any empty optional field was left empty intentionally.
This is just one good case for using NULL.
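The registration scenario could look roughly like this. It is only a sketch with Python's sqlite3 module; the customers table and the optional company column are assumptions invented for the example.

```python
import sqlite3

# Hypothetical schema: "company" stands in for any optional registration field.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, company TEXT)")

# Reserve an ID before the form is submitted: the optional field is unknown.
reserved_id = conn.execute(
    "INSERT INTO customers (company) VALUES (NULL)"
).lastrowid

# Another customer submitted the form and deliberately left the field empty.
submitted_id = conn.execute(
    "INSERT INTO customers (company) VALUES ('')"
).lastrowid

# The two states remain distinguishable afterwards.
unknown = [r[0] for r in conn.execute(
    "SELECT id FROM customers WHERE company IS NULL")]
left_blank = [r[0] for r in conn.execute(
    "SELECT id FROM customers WHERE company = ''")]
```

Here unknown holds only the reserved row and left_blank holds only the submitted one, which is exactly the distinction that would be lost if both were stored as ''.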
The main benefit, of course, is the semantic meaning of NULL, which you mentioned.
In addition to that (and it may depend on your storage engine, so as always, check the documentation), in at least some databases NULLs take up a lot less room than a regular value. For example, if you have a "varchar" column declared to be 20 characters, and it's rarely filled in, you can save a lot of disk space by making it NULL instead of an empty string.
I have never heard of any performance issues with using NULLs, quite the opposite. I've heard of people mucking up their counts because they counted NULLs wrong, but never performance. If that's a real thing, I would love to hear about it!
Generally, if an attribute is required, it is defined as NOT NULL, and if it may be omitted, it is defined as nullable.
The meaning of a NULL column is more or less "doesn't apply in this context". I generally use NULL columns in two cases:
closed_at and is_closed), I just create the closed_at column and set it to NULL if the inventory set can still be changed, but set the date once it's closed. Basically it boils down to the fact that I use NULL when the emptiness of a field carries a distinct meaning of its own, beyond the field simply being empty. The absence of a middle initial is just that. The absence of a closing date means the inventory set is still open to changes.
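The single closed_at column described above might be sketched as follows with Python's sqlite3 module. The inventory_sets table name, the sample rows, and the two helper functions are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

# Hypothetical schema: closed_at doubles as the "is it closed?" flag.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory_sets ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, closed_at TEXT)"
)
conn.executemany(
    "INSERT INTO inventory_sets (name, closed_at) VALUES (?, ?)",
    [("spring count", "2024-04-01"), ("autumn count", None)],
)

def open_sets(conn):
    """Names of sets that can still be changed (closed_at is NULL)."""
    rows = conn.execute(
        "SELECT name FROM inventory_sets WHERE closed_at IS NULL ORDER BY id"
    )
    return [r[0] for r in rows]

def close_set(conn, name, closed_at):
    """Close a set by recording its closing date; already closed sets are untouched."""
    conn.execute(
        "UPDATE inventory_sets SET closed_at = ? "
        "WHERE name = ? AND closed_at IS NULL",
        (closed_at, name),
    )

print(open_sets(conn))  # prints: ['autumn count']
close_set(conn, "autumn count", "2024-10-01")
print(open_sets(conn))  # prints: []
```

With this layout there is no separate is_closed flag that could drift out of sync with the date.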
NULL values can have nasty side effects: they make it harder to add data to the table, and more often than not you end up with a mish-mash of NULL values and empty strings.
Also, NULL is not equal to anything, which will screw up queries all over the place if you are not very careful.
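One way this bites in practice is a "not equal" filter silently dropping rows. The following sketch uses Python's sqlite3 module; the orders table and its status values are invented for the example.

```python
import sqlite3

# Hypothetical table with one row whose status is NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("shipped",), ("pending",), (None,)],
)

# NULL <> 'shipped' evaluates to NULL, so the NULL row is filtered out.
naive = [r[0] for r in conn.execute(
    "SELECT id FROM orders WHERE status <> 'shipped' ORDER BY id")]

# The NULL case has to be asked for explicitly.
careful = [r[0] for r in conn.execute(
    "SELECT id FROM orders "
    "WHERE status <> 'shipped' OR status IS NULL ORDER BY id")]

print(naive, careful)
# prints: [2] [2, 3]
```

If "everything that has not shipped" was the intent, the naive query returns one row too few without raising any error.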
Personally, I use NULL columns only when one of the above two cases applies. I never use it to signify empty fields when the emptiness has no meaning other than the absence of a value.