Let's say you have a table Orders with a foreign key to a Customer Id. Now, suppose you want to add an Order without a Customer Id (whether that should be possible is anot
I can't see anything wrong with that. It is just an optional n-1 relationship that is represented by a null in the foreign key. If you use a link table instead, you have to make sure it doesn't become an n-n relationship, which causes even more trouble.
Having the link table is probably the better option; at least it does not violate BCNF (Boyce-Codd normal form). However, I would favor being pragmatic. If you have very few of these null values and they are only temporary, I think you should skip the link table, since it only adds complexity to the schema.
On a side note: using a link table doesn't necessarily make the relationship n-to-n. If the foreign key in the link table that points to your orders table is also that link table's primary key, the relationship is still 1..n, because there can only be one entry in the link table per order.
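To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customers, orders, order_customer) and the link helper are made up for illustration, and other database engines may report the violation differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    -- order_id is both the primary key and a foreign key,
    -- so each order can appear in this table at most once.
    CREATE TABLE order_customer (
        order_id    INTEGER PRIMARY KEY REFERENCES orders(id),
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    INSERT INTO customers (id) VALUES (1), (2);
    INSERT INTO orders (id) VALUES (10), (11);
    INSERT INTO order_customer (order_id, customer_id) VALUES (10, 1);
""")


def link(order_id, customer_id):
    """Try to attach a customer to an order; return True on success."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO order_customer (order_id, customer_id) VALUES (?, ?)",
                (order_id, customer_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Order 10 already has a customer, so a second link is rejected.
second_customer_for_order_10 = link(10, 2)
# Customer 1 can still be linked to another order (many orders per customer).
same_customer_second_order = link(11, 1)
```

An order with no row in order_customer is simply an order without a customer, so no null is needed anywhere.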
Nullable FKs for optional many-to-one relations are totally fine.
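As a rough illustration (a sketch with Python's sqlite3 module; the schema and the insert_order helper are assumptions, not something from the question), a nullable foreign key accepts NULL but still rejects an id that doesn't exist in the referenced table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)  -- nullable on purpose
    );
    INSERT INTO customers (id) VALUES (1);
""")


def insert_order(order_id, customer_id):
    """Insert an order; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
                (order_id, customer_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


with_customer = insert_order(1, 1)        # references an existing customer
without_customer = insert_order(2, None)  # NULL is not checked against customers
dangling = insert_order(3, 99)            # there is no customer 99
```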
Using NULL would be a good way to clean up incomplete orders:
SELECT * FROM `orders`
WHERE `started_time` < (UNIX_TIMESTAMP() - 900) AND `customer_id` IS NULL
The above would show orders older than 15 minutes without a related customer ID.
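If you want to try the idea without a MySQL server, here is a small sketch using Python's sqlite3 module. It assumes started_time holds a Unix timestamp in seconds, and the stale_order_ids helper and sample rows are invented for the example. An order counts as stale when it started more than 900 seconds ago and still has no customer.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " started_time INTEGER NOT NULL,"  # assumed: Unix timestamp in seconds
    " customer_id INTEGER)"
)

now = int(time.time())
conn.executemany(
    "INSERT INTO orders (id, started_time, customer_id) VALUES (?, ?, ?)",
    [
        (1, now - 3600, None),  # old, no customer: stale
        (2, now - 60, None),    # recent, no customer: still in progress
        (3, now - 3600, 42),    # old, has a customer: complete
    ],
)


def stale_order_ids(conn, now, max_age_seconds=900):
    """Return ids of orders older than max_age_seconds with no customer."""
    cutoff = now - max_age_seconds
    rows = conn.execute(
        "SELECT id FROM orders"
        " WHERE started_time < ? AND customer_id IS NULL"
        " ORDER BY id",
        (cutoff,),
    )
    return [row[0] for row in rows]


stale = stale_order_ids(conn, now)
```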
Nullable columns can be in 1NF through 5NF, but not in 6NF according to what I've read.
Only if you know better than Chris Date "what first normal form really means". If x and y are both nullable, and in some row x and y are both null, then WHERE x=y does not yield true. This proves beyond reasonable doubt that null is not a value (because any real value is always equal to itself). And since the RM prescribes that "there must be a value in every cell of a table", anything that possibly contains nulls is not a relational thing, and thus the question of 1NF doesn't even arise.
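The comparison behaviour is easy to check for yourself. A quick sketch with Python's sqlite3 module (the table t is a throwaway example; other engines follow the same three-valued logic for these comparisons as far as I know):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with NULL yields NULL (None in Python), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
one_equals_one = conn.execute("SELECT 1 = 1").fetchone()[0]

conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
conn.execute("INSERT INTO t (x, y) VALUES (NULL, NULL)")

# The row where both x and y are null is not selected by x = y ...
matches = conn.execute("SELECT COUNT(*) FROM t WHERE x = y").fetchone()[0]
# ... and not even by comparing the column with itself.
self_matches = conn.execute("SELECT COUNT(*) FROM t WHERE x = x").fetchone()[0]
```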
I've heard it argued that Nullable columns in general break the first degree of normalization.
See above for the sound reason underlying that argument.
But in practice it's very practical.
Only if you're immune to the headaches that it usually causes in the entire rest of the world. One such headache (and it's only a minor one compared to other null phenomena) is the fact that WHERE x=y in SQL actually means WHERE x is not null and y is not null and x=y, but most programmers simply aren't aware of that fact and just read over it. Sometimes without any harm, other times not.
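Here is a small sketch of where that bites, using Python's sqlite3 module with a made-up table t and an ids helper. The short and the long form of the condition select the same rows, and the rows containing a null are returned by neither x = y nor x <> y, which is the part people tend to read over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO t (id, x, y) VALUES (?, ?, ?)",
    [
        (1, 1, 1),        # equal
        (2, 1, 2),        # different
        (3, None, 1),     # one side null
        (4, None, None),  # both sides null
    ],
)


def ids(where):
    """Return the ids selected by a fixed, trusted WHERE clause."""
    sql = f"SELECT id FROM t WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


short_form = ids("x = y")
long_form = ids("x IS NOT NULL AND y IS NOT NULL AND x = y")
not_equal = ids("x <> y")

# Rows that are neither "equal" nor "not equal" according to SQL.
unaccounted = sorted(set(ids("1 = 1")) - set(short_form) - set(not_equal))
```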
In fact, nullable columns violate one of the most fundamental database design rules: don't combine distinct information elements in one column. Nulls do exactly that, because they combine the boolean value "this field is/is not really present" with the actual value.
Optional relationships are definitely possible in the relational model.
You can use nulls to express the absence of a relationship. They are convenient, but they will cause you the same headaches that nulls cause you elsewhere. One place where they don't cause any trouble is joins. Rows that have a null in the foreign key don't match any rows in the referenced table. So they drop out of an inner join. If you do outer joins, you are going to be dealing with nulls anyway.
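A short sketch of the join behaviour, again with Python's sqlite3 module and an invented schema (customers with a name, orders with a nullable customer_id):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada');
    INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, NULL);
""")

# Inner join: order 11 has a null foreign key, matches nothing, and drops out.
inner_rows = conn.execute(
    "SELECT o.id, c.name FROM orders o"
    " JOIN customers c ON c.id = o.customer_id"
    " ORDER BY o.id"
).fetchall()

# Left outer join: order 11 is kept, with NULL standing in for the customer.
outer_rows = conn.execute(
    "SELECT o.id, c.name FROM orders o"
    " LEFT JOIN customers c ON c.id = o.customer_id"
    " ORDER BY o.id"
).fetchall()
```

The outer join produces a null in its result whether or not the foreign key column itself was nullable, which is why nulls come with outer joins regardless.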
If you really want to avoid nulls (6th normal form), you can decompose the table. One of the two decomposed tables has two foreign key columns. One is the optional foreign key you have, and the other is a foreign key referencing the primary key of the original table. You then have to use constraints to prevent the relationship from becoming many-to-many, if you want to prevent that.