When using relational databases and aiming for 3NF (is it called 3NF in English?), you pull 1:1 relationships together into one table. But what happens if the relationship
Building on your comments with paxdiablo . . .
Let's look at some SQL. I could have chosen better names for the columns, but I deliberately didn't. I wasn't being lazy; I had good reasons. An external predicate is how users are supposed to interpret the contents of a table.
-- External predicate: Human is identified by
-- Social Security Account Number [ssan]
-- and has full name [full_name]
-- and result of last HIV test [hiv_status]
-- and has checking account [bank_account]
-- and was born at exactly [birth_date].
--
create table human (
ssan char(9) primary key,
full_name varchar(35) not null,
hiv_status char(3) not null default 'Unk'
CHECK (hiv_status in ('Unk', 'Pos', 'Neg')),
bank_account varchar(20),
birth_date timestamp not null
);
-- External predicate: Human athlete identified by
-- Social Security Account Number [ssan]
-- has current doping status [doping_status]
create table athlete (
ssan char(9) not null primary key references human (ssan),
doping_status char(3) not null default 'Unk'
CHECK (doping_status in ('Unk', 'Pos', 'Neg'))
);
-- External predicate: Human dictator identified by
-- Social Security Account Number [ssan]
-- has estimated benevolence of [benevolence_score].
create table dictator (
ssan char(9) not null primary key references human (ssan),
benevolence_score integer not null default 3
CHECK (benevolence_score between 1 and 5) -- 1 is least, 5 is most benevolent
);
All three of those tables are in 5NF. (Which means they're also in 3NF.)
You said
there is no "IS A"-relationship in a relational database
An athlete "IS A" human, because its identifier is a human identifier. In this case, its primary key is a foreign key that references human (ssan). Database designers don't usually talk in terms of "IS A" and "HAS A" relationships, because predicates are more precise and expressive. You can see the difference by comparing these two statements.
That last one is deliberately a little jarring. I defined the birth_date column as a timestamp--it accommodates both date and time. It illustrates how external predicates are to some extent independent of the column names. (It also illustrates how the loose coupling between predicates and column names might not be such a good idea here.)
You said
But now you never get a pure HUMAN but only children of HUMAN
I'm not sure what you mean by "pure human". You can get all the humans by simply
SELECT * FROM human;
If you mean that you can't have a human unless the human is an athlete or dictator (or whatever), then you're mistaken. If there's no row in athlete for a specific SSAN, then the human identified by that SSAN isn't an athlete. If there's no row in dictator for a specific SSAN, then the human identified by that SSAN isn't a dictator.
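As a rough sketch of both cases, the queries below run against the three tables defined above. The SSANs and names in the sample rows are invented purely for illustration.

```sql
-- Hypothetical sample data; the SSANs and names are made up.
insert into human (ssan, full_name, birth_date) values
  ('000000001', 'Alice Example', '1980-01-01 08:30:00'),
  ('000000002', 'Bob Example',   '1975-06-15 23:05:00');

-- Only the first human is also an athlete.
insert into athlete (ssan) values ('000000001');

-- Humans who are athletes: the shared key joins the subtype row to its human row.
select h.ssan, h.full_name, a.doping_status
from human h
inner join athlete a on a.ssan = h.ssan;

-- Humans who are neither athletes nor dictators.
select h.ssan, h.full_name
from human h
where not exists (select 1 from athlete a where a.ssan = h.ssan)
  and not exists (select 1 from dictator d where d.ssan = h.ssan);
```

With this data, the first query returns the one athlete and the second returns only the human who has no row in either subtype table.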
Third normal form basically means that an attribute (or column) depends on the key, the whole key and nothing but the key (so help me, Codd).
If you have an attribute which is either there or not there, that attribute itself may still follow the rules.
In those cases, I would simply keep the attributes in the main table and make them nullable to indicate whether or not they're appropriate for the row.
By way of (contrived) example, you may have a SocialSecurityNumber attribute as your primary key (I won't go into the arguments as to whether this is a good idea or whether you should use a surrogate key, since that's irrelevant to the question).
Further assume that you have a distinct BankAccount attribute for paying their wage into, and that you're not one of those nice employers that can distribute a wage to multiple bank accounts for the purpose of dodging taxes :-)
Now the bank account of someone is dependent entirely on the chosen key, but not everyone may possess one (they may be paid in cash). In other words, a classic 1:0/1 case as you put it.
In that case, you would simply make the bank account number nullable in the table.
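A minimal sketch of that design follows. The table name employee, the column types, and the sample values are assumptions made for illustration; only the two attribute names come from the example above.

```sql
-- Hypothetical table; column types are assumptions.
create table employee (
    SocialSecurityNumber char(9) primary key,
    BankAccount varchar(20)   -- nullable: NULL means no account is on file
);

-- Invented sample rows: one employee with an account, one without.
insert into employee (SocialSecurityNumber, BankAccount) values
  ('000000001', 'ACC-12345'),
  ('000000002', null);

-- Employees who have to be paid in cash.
select SocialSecurityNumber
from employee
where BankAccount is null;
```

The attribute still depends only on the key, so the table stays in 3NF; the NULL simply records that the optional attribute is absent for that row.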
Based on your question and subsequent comments on @paxdiablo's answer, my understanding is that you want a solution for storing optional attributes, of which there are many, while avoiding NULLs. There are two ways of accomplishing this: 6th Normal Form (6NF) or an Entity Attribute Value (EAV) model.
6NF involves creating a table specific to each attribute:
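create table attributeName (
    id integer primary key,       -- also a foreign key to the main table; type is an assumption
    value varchar(255) not null   -- the attribute's value; pick the type that fits the attribute
);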
Where id is a foreign key and value captures that attribute (e.g. social security number). Absence of a record for a given key indicates non-existence.
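To make the read side concrete, here is a hedged sketch of reassembling an entity from per-attribute tables. The table names person, person_bank_account, and person_birth_date, and all column types, are hypothetical and not part of the original answer.

```sql
-- Hypothetical anchor table holding only the key.
create table person (
    id integer primary key
);

-- One hypothetical table per optional attribute.
create table person_bank_account (
    id integer primary key references person (id),
    value varchar(20) not null
);

create table person_birth_date (
    id integer primary key references person (id),
    value date not null
);

-- Reassemble the entity: every optional attribute costs one LEFT JOIN,
-- and a missing attribute row shows up as NULL in the result set only.
select p.id,
       ba.value as bank_account,
       bd.value as birth_date
from person p
left join person_bank_account ba on ba.id = p.id
left join person_birth_date bd on bd.id = p.id;
```

No NULL is ever stored, but the query still produces NULLs for absent attributes, and the number of joins grows with the number of attributes.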
Now, as you can imagine, 6th Normal Form can lead to table proliferation. An EAV model solves this by using one similar table for multiple attributes, like so:
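create table integerAttribute (
    name varchar(50) not null,   -- attribute name; type is an assumption
    id integer not null,         -- foreign key to the main table
    value integer not null,      -- the attribute's value
    primary key (id, name)
);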
The name column identifies the attribute (e.g. 'SocialSecurity'). A more sophisticated implementation replaces the name column with a foreign key to a separate metadata table that stores the name. Regardless, this approach implies having other tables for different data types (i.e. datetimeAttribute, varcharAttribute, etc.).
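The metadata-table variant might look roughly like the sketch below. The table names attributeMeta and integerAttributeValue, the attribute names, and the sample values are all hypothetical; this is one possible layout, not a canonical EAV schema.

```sql
-- Hypothetical metadata table: one row per known attribute.
create table attributeMeta (
    attribute_id integer primary key,
    name varchar(50) not null unique
);

-- Hypothetical value table for integer-typed attributes.
create table integerAttributeValue (
    attribute_id integer not null references attributeMeta (attribute_id),
    id integer not null,      -- key of the entity the value belongs to
    value integer not null,
    primary key (id, attribute_id)
);

-- Invented sample data.
insert into attributeMeta (attribute_id, name) values
  (1, 'BenevolenceScore'),
  (2, 'ShoeSize');

insert into integerAttributeValue (attribute_id, id, value) values
  (1, 100, 4),
  (2, 100, 43),
  (2, 200, 39);

-- Pivot the attribute rows back into one row per entity.
-- An attribute with no stored row comes back as NULL.
select v.id,
       max(case when m.name = 'BenevolenceScore' then v.value end) as benevolence_score,
       max(case when m.name = 'ShoeSize' then v.value end) as shoe_size
from integerAttributeValue v
inner join attributeMeta m on m.attribute_id = v.attribute_id
group by v.id;
```

Adding a new attribute then only needs a new row in the metadata table, at the cost of a pivot query like this one whenever a whole entity has to be read.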
The real question to ponder is how many optional attributes you're dealing with. If relatively few, the easiest solution is actually adding optional NULLable columns on the main table. 6NF and EAV add significant complexity and performance concerns. Often what's done when using one of these approaches is serializing the overall entity into a CLOB on the main table to simplify the common read (i.e. by primary key), to avoid multiple LEFT joins to retrieve a fully hydrated entity.