Which normal form does this table violate?

前端 未结 6 757
北荒
北荒 2021-01-06 06:13

Consider this table:

   +-------+-------+-------+-------+  
   | name  |hobby1 |hobby2 |hobby3 |  
   +-------+-------+-------+-------+   
   | kris  | ball           


        
相关标签:
6条回答
  • 2021-01-06 06:52

    Well,

    The point is that, as long as all hobby1, hobby2 and hobby3 values are not null, AND names are unique, this table could be considered more or less as abbiding by 1NF rules (see here for example ...)

    But does everybody has 3 hobbies? Of course not! Do not forget that databases are basically supposed to hold data as a representation of reality! So, away of all theories, one cannot say that everybody has 3 hobbies, except if ... our table is done to hold data related to people that have three hobbies without any preference between them!

    This said, and supposing that we are in the general case, the correct model could be

    +------------+-------+
    | id_person  |name   |
    +------------+-------+  
    

    for the persons (do not forget to unique key. I do not think 'name' is a good one

    +------------+-------+
    | id_hobby   |name   |
    +------------+-------+ 
    

    for the hobbies. id_hobby key is theoretically not mandatory, as hobby name can be the key ...

    +------------+-----------+
    | id_person  |id_hobby   |
    +------------+-----------+  
    

    for the link between persons and hobbies, as the physical representation of the many-to-many link that exists between persons and their hobbies.

    My proposal is basic, and satisfies theory. It can be improved in many ways ...

    0 讨论(0)
  • 2021-01-06 06:53

    You say that the hobbies belong to the same domain and that they can move around in the columns. If by this you mean that for any specific name the list of hobbies is in an arbitrary order and kriss could just as easily have dance, ball, swim as ball, swim, dance, then I would say you have a repeating group and the table violates 1NF.

    If, on the other hand, there is some fundamental semantic difference between a particular person's first and second hobbies, then there may be an argument for saying that the hobbies are not repeating groups and the table may be in 3NF (assuming that hobby columns are FK to a hobby table). I would suggest that this argument, if it exists, is weak.

    One other factor to consider is why there are precisely 3 hobbies and whether more or fewer hobbies are a potential concern. This factor is important not so much for normalization as for flexibility of design. This is one reason I would split the hobbies into rows, even if they are semantically different from one-another.

    0 讨论(0)
  • 2021-01-06 06:55

    Without knowing what keys exist and what dependencies the table is supposed to satisfy it's impossible to determine for sure what Normal Form it satisfies. All we can do is make guesses based on your attribute names.

    Does the table have a key? Suppose for the sake of example that Name is a candidate key. If there is exactly one value permitted for each of the other attributes for each tuple (which means that no attribute can be null) then the table is in at least First Normal Form.

    0 讨论(0)
  • 2021-01-06 06:55

    If any of the columns in the table accept nulls then then the table violates first normal form. Assuming no nulls, @dportas has already provided the correct answer.

    0 讨论(0)
  • 2021-01-06 06:57

    Your three-hobby table design probably violates what I usually call the spirit of the original 1NF (probably for the reasons given by dportas and others).

    It turns out however, that it is extremely difficult to find [a set of] formal and precise "measurable" criteria that accurately express that original "spirit". That's what your other guy was trying to explain talking about "the ambiguity of repeating groups".

    Stress "formal", "precise" and "measurable" here. Definitions for all other normal forms exist that satisfy "formal", "precise" and "measurable" (i.e. objectively observable). For 1NF it's just hard (/impossible ???) to do. If you want to see why, try this :

    You stated that the question was "whether those three hobby columns constitute a repeating group". Answer this question with "yes", and then provide a rigorous formal underpinning for your answer.

    You cannot just say "the column names are the same, except for the numbered suffix". Making a violation of such a rule objectively observable/measurable would require to enumerate all the possible ways of suffixing.

    You cannot just say "swim, tennis" could equally well be "tennis, swim", because getting to know that for sure requires inspecting the external predicate of the table. If that is just "person <name> has hobby <hobby1> and also has <hobby2>" , then indeed both are equally valid (aside : and due to the closed world assumption it would in fact require all possible permutations of the hobbies to be present in the table !!!). However, if that external predicate is "person <name> spends the most time on <hobby1> and the least on <hobby2>", then "swim, tennis" could NOT equally well be "tennis,swim". But how do you make such interpretations of the external predicate of the table objective (for ALL POSSIBLE PREDICATES) ???

    etc. etc.

    0 讨论(0)
  • 2021-01-06 06:58

    This clearly "looks" like a design error.

    It's not not a design error when this data is simply stored and retrieved. You need only 3 of the hobbies and you don't intend to use this data in any other way than retrieve.

    Let's consider this relationship:

    • Hobby1 is the main hobby at some point in a person's life (before 18 years of age for example)
    • Hobby2 is the hobby at another point (19-30)
    • Hobby3 is her hobby at a another one.

    Then this table seems definitely well designed and while the 1NF convention is respected the naming arguably "sucks".

    In the case of an indiscriminate storage of hobbies this is clearly wrong in most if not all cases I can think of right now. Your table has duplicate rows which goes against the 1NF principles.

    Let's not consider the reduced efficiency of SQL requests to access data from this table when you need to sort the results for paging or any other practical reason.

    Let's take into consideration the effort required to work with your data when your database will be used by another developer or team:

    • The data here is "scattered". You have to look in multiple columns to aggregate related data.
    • You are limited to only 3 of the hobbies.
    • You can't use simple rules to establish unicity (same hobby only once per user).

    You basically create frustration, anger and hatred and the Force is disturbed.

    0 讨论(0)
提交回复
热议问题