I am completing normalization exercises from the web to test my abilities to normalize data. This particular problem was found at: https://cs.senecac.on.ca/~dbs201/pages/Normali
In a first step you could create the following tables (assuming pet_id
is unique in the table):
Pets: pet_id, pet_name, pet_type, pet_age, owner
Visits: pet_id, visit_date, procedure
Going further you could split procedure
since the description is repeating:
Pets: pet_id, pet_name, pet_type, pet_age, owner
Visits: pet_id, visit_date, procedure_id
Procedures: procedure_id, description
Although there can be multiple procedures
on the same visit_date
for the same pet_id
, I see no reason to split those further: a date could (in theory) be stored in 2 bytes, and splitting that data would create more overhead (plus an extra index).
You would also want to change pet_age
to pet_birth_date
since the age changes over time.
Since this is the first exercise in your list, the above will probably be more than enough.
Going even further:
An owner
can have multiple pets, so another table could be created:
Pet_owners: owner_id, owner_name
and then only use owner_id
in the Pets
table. In a real system there would be customer_id, name, address, phone, email
, etc. - so that should always be in a separate table.
You could even do the same for pet_type
and store the id
in 1 or 2 bytes, but it all depends on the type of queries you want to do later on the data.
The question is poorly presented. Look at the last two columns. The askers do not mean that each column's types are sets. They mean that pairs of values on the same line make an element of a set. They should have had one column whose values were triplets--date, number & name. That's what they did when they used just one column (the last one) for number & name. Notice that their solution in the pdf linked to by the page you link to has a table that has all three of date, number & name.
But how are you supposed to know that the values should be paired? After all if the date column gave the set of a pet's visit dates & the procedure column gave the set of procedure number & names a pet ever had then we wouldn't be supposed to take a pair of values on the same line as an element of a set. Unfortunately you are just supposed to magically guess correctly. (A hint is that the number of dates & number-name pairs for a pet are always the same.)
The above took the blank areas in the illustration to be there to make room for the vertical display of set-valued attributes; the portrayed table has 4 rows. But maybe they are there because you are supposed to get a relation from this illustration by interpreting a blank subrow as representing the most recent non-blank subrow. Then the table wouldn't have any set-valued columns; the portrayed table has 9 rows. It happens that this interpretation disagrees with the linked answer's UNF & 1NF sections.
If they weren't going to explain the table & were just relying on your guesses it would have been clearer if they put a visit's procedure date, number & name under one column--just as they put a procedure number & name in one column. But really, they should always tell you how to read the illustration. But really, you should always ask how read an illustration. If you have any interpretation conventions from a related course/textbook then you should have put it in your question for us to know.
Unfortunately "UNF" tables are almost always similarly poorly given without any description about how they are to be interpreted. Also "1NF" has no standard meaning & there is no standard notion of "normalizing to 1NF".