How to model a database with many m:n relations on a table

I am currently setting up a database which has a large number of many-to-many relations. Every relationship was modeled via a link table. Example:

A person has a num

5 Answers

醉酒成梦

2021-01-02 07:05

In my humble opinion, I would go for the first model. It is probably the more complex model, but in the end it will make things easier when you are extracting information from the tables; with the second model, the application code could get messier and less readable for other programmers. Besides, some authors recommend against multipurpose tables like that.

In the end, you must go with whatever suits you best. We don't know the whole context, so we can't help you much with the decision. But from what you have described, I'd definitely go for option number one.

日久生厌

2021-01-02 07:07

The second model is a problem from several perspectives. First, it is likely to create blocking issues, because everything goes through the one meta table. Second, it is far more likely to have data integrity issues, because you can't enforce foreign key constraints. Modeling it that way is a SQL antipattern. The first model is correct.
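To illustrate the integrity point, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`person`, `restaurant`, `person_restaurant`) and the helper `try_insert` are made up for the example and are not taken from the question. With a dedicated link table, each column can carry its own foreign key, so a link to a row that does not exist is rejected by the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person     (person_id     INTEGER PRIMARY KEY);
    CREATE TABLE restaurant (restaurant_id INTEGER PRIMARY KEY);

    -- Hypothetical link table: one row per (person, restaurant) pair.
    CREATE TABLE person_restaurant (
        person_id     INTEGER NOT NULL REFERENCES person(person_id),
        restaurant_id INTEGER NOT NULL REFERENCES restaurant(restaurant_id),
        PRIMARY KEY (person_id, restaurant_id)
    );

    INSERT INTO person     VALUES (1234);
    INSERT INTO restaurant VALUES (5678);
""")


def try_insert(person_id, restaurant_id):
    """Return True if the link was stored, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO person_restaurant VALUES (?, ?)",
            (person_id, restaurant_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(1234, 5678)      # both rows exist
rejected = not try_insert(1234, 9999)  # restaurant 9999 does not exist
print(accepted, rejected)
```

In a single generic table, one column would have to hold ids of jobs, houses and restaurants alike, so it could not be declared as a reference to any one of those tables.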

余生分开走

2021-01-02 07:10

Your simplified version does not represent a proper relational model. It's more of a metadata model.

The number of tables in your database should represent the number of logical entities in your domain. That should not change based on some arbitrary idea of how many entities is too many.

梦如初夏

2021-01-02 07:11
Your design violates Fourth Normal Form. You're trying to store multiple "facts" in one table, and it leads to anomalies.

The Person_Attributes table presumably looks something like this: personId jobId houseId restaurantId

So if I associate with one job, one house, but two restaurants, do I store the following?
```
personId jobId houseId restaurantId
    1234    42      87         5678
    1234    42      87         9876
```
And if I add a third restaurant, I copy the other columns?
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234   123      87         9876
    1234    42      87        13579 
```
Done! Oh, wait, what happened there? I changed jobs at the same time as adding the new restaurant. Now I'm incorrectly associated with two jobs, but there's no way to distinguish between that and correctly being associated with two jobs.

Also, even if it is correct to be associated with two jobs, shouldn't the data look like this?
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234   123      87         9876
    1234   123      87        13579 
    1234    42      87         5678
    1234    42      87         9876
    1234    42      87        13579 
```
It starts looking like a Cartesian product of all distinct values of jobId, houseId, and restaurantId. In fact, it is -- because this table is trying to store multiple independent facts.
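One way to check the "Cartesian product" claim is to store each fact in its own table and join them back together. The sketch below uses Python's sqlite3 module; the three table names are assumptions chosen for the example, while the ids are the ones from the listing above. The join reproduces exactly the six rows of the wide table, which suggests the wide table holds nothing that the three small tables do not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per independent fact (names are illustrative).
    CREATE TABLE person_job        (personId INTEGER, jobId        INTEGER);
    CREATE TABLE person_house      (personId INTEGER, houseId      INTEGER);
    CREATE TABLE person_restaurant (personId INTEGER, restaurantId INTEGER);

    INSERT INTO person_job        VALUES (1234, 123), (1234, 42);
    INSERT INTO person_house      VALUES (1234, 87);
    INSERT INTO person_restaurant VALUES (1234, 5678), (1234, 9876), (1234, 13579);
""")

# Joining the three facts on personId pairs every job with every house
# and every restaurant of the same person.
rows = conn.execute("""
    SELECT j.personId, j.jobId, h.houseId, r.restaurantId
    FROM person_job j
    JOIN person_house h      ON h.personId = j.personId
    JOIN person_restaurant r ON r.personId = j.personId
    ORDER BY j.jobId, r.restaurantId
""").fetchall()

for row in rows:
    print(row)
```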

Correct relational design requires a separate intersection table for each many-to-many relationship. Sorry, you have not found a shortcut.

(Many articles about normalization say the higher normal forms past 3NF are esoteric, and one never has to worry about 4NF or 5NF. Let this example disprove that claim.)

Re your comment about using NULL: Then you have a problem enforcing uniqueness, because a PRIMARY KEY constraint requires that all columns be NOT NULL.
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         9876
    1234  NULL    NULL        13579 
```
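As a rough illustration of the uniqueness problem: once the columns are nullable, a PRIMARY KEY over all of them is off the table, and the obvious fallback is a UNIQUE constraint. The sketch below (Python's sqlite3 module, with column types assumed to be integers) shows that in SQLite a UNIQUE constraint treats NULLs as distinct from each other, so the same row can be stored twice. Other database engines may handle NULLs in unique constraints differently, so treat this as one engine's behaviour rather than a universal rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person_Attributes (
        personId     INTEGER NOT NULL,
        jobId        INTEGER,          -- nullable, as in the NULL variant
        houseId      INTEGER,
        restaurantId INTEGER,
        UNIQUE (personId, jobId, houseId, restaurantId)
    )
""")

row = (1234, None, None, 9876)
conn.execute("INSERT INTO Person_Attributes VALUES (?, ?, ?, ?)", row)
# The second, identical insert is accepted: the NULLs in jobId and houseId
# are not considered equal, so the UNIQUE constraint does not fire.
conn.execute("INSERT INTO Person_Attributes VALUES (?, ?, ?, ?)", row)

duplicates = conn.execute(
    "SELECT COUNT(*) FROM Person_Attributes WHERE restaurantId = 9876"
).fetchone()[0]
print(duplicates)
```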
Also, if I add a second house or a second jobId to the above table, which row do I put it in? You could end up with this:
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         9876
    1234    42    NULL        13579 
```
Now if I disassociate restaurantId 9876, I could update it to NULL. But that leaves a row of all NULLs, which I really should just delete.
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         NULL
    1234    42    NULL        13579 
```
Whereas if I had disassociated restaurant 13579, I could update it to NULL and leave the row in place.
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234  NULL    NULL         9876
    1234    42    NULL         NULL 
```
But shouldn't I consolidate rows, moving the jobId to another row, provided there's a vacancy in that column?
```
personId jobId houseId restaurantId
    1234   123      87         5678
    1234    42    NULL         9876
```
The trouble is, now it's getting more and more complex to add or remove associations, requiring multiple SQL statements for changes. You're going to have to write a lot of tedious application code to handle this complexity.

However, all the various changes are easy if you define one table per many-to-many relationship. You do need the complexity of having that many more tables, but by doing that you will simplify your application code.

Adding an association to a restaurant is simply an INSERT to the Person_Restaurant table. Removing that association is simply a DELETE. It doesn't matter how many associations there are to jobs or houses. And you can define a primary key constraint in each of these intersection tables to enforce uniqueness.
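Here is a minimal sketch of that last paragraph, again with Python's sqlite3 module. The helper names (`add_restaurant`, `remove_restaurant`, `restaurants_of`) and the integer column types are assumptions made for the example; only the Person_Restaurant table name comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person_Restaurant (
        personId     INTEGER NOT NULL,
        restaurantId INTEGER NOT NULL,
        PRIMARY KEY (personId, restaurantId)   -- one row per pair, no duplicates
    )
""")


def add_restaurant(person_id, restaurant_id):
    # Adding an association is a single INSERT.
    conn.execute(
        "INSERT INTO Person_Restaurant VALUES (?, ?)",
        (person_id, restaurant_id),
    )


def remove_restaurant(person_id, restaurant_id):
    # Removing it is a single DELETE; jobs and houses are never touched.
    conn.execute(
        "DELETE FROM Person_Restaurant WHERE personId = ? AND restaurantId = ?",
        (person_id, restaurant_id),
    )


def restaurants_of(person_id):
    cur = conn.execute(
        "SELECT restaurantId FROM Person_Restaurant "
        "WHERE personId = ? ORDER BY restaurantId",
        (person_id,),
    )
    return [restaurant_id for (restaurant_id,) in cur]


add_restaurant(1234, 5678)
add_restaurant(1234, 9876)
add_restaurant(1234, 13579)
remove_restaurant(1234, 9876)

try:
    add_restaurant(1234, 5678)   # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True    # the primary key refuses the duplicate

print(restaurants_of(1234), duplicate_rejected)
```

Compare this with the single-table variant, where the same two operations required deciding which row to update, whether to delete an all-NULL row, and whether to consolidate rows afterwards.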
梦毁少年i

2021-01-02 07:29
I do not think the second method is correct, because your Person_Attributes table would contain redundant data. For example, if a person likes 10 restaurants, works 2 jobs, and has 3 houses, you would have as many as 10*2*3 = 60 entries where it should be 10 + 2 + 3 = 15 (spread over 3 link tables, as per approach #1). Think of the drawbacks with a million users, or with more than 3 attributes to handle in the Person_Attributes table. So I would go with approach 1 in your question.

Say, for example, your Person_Attributes table has the following entry:
```
personId | houseId | jobId | restaurantId
------------------------------------------
P1       | H1      | J1    | R1
```
Now if the person also likes restaurants R2 and R3, the table looks like this:
```
P1       | H1      | J1    | R1
P1       | H1      | J1    | R2
P1       | H1      | J1    | R3
```
The table already has redundant data. If the person adds job J2 at a later point, your table will look like this:
```
P1       | H1      | J1    | R1
P1       | H1      | J1    | R2
P1       | H1      | J1    | R3
P1       | H1      | J2    | R1
P1       | H1      | J2    | R2
P1       | H1      | J2    | R3
```
Now consider that the person adds another home, H2, and so on and so forth. Do you see my point?
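The row-count arithmetic from the start of this answer can be sketched in a few lines of Python. The function names are invented for the illustration, and the sketch assumes the person has at least one value for every attribute (with zero of something, the single table would need NULLs instead).

```python
from math import prod


def rows_single_table(counts):
    """Rows per person when every combination is stored in one wide table."""
    return prod(counts.values())


def rows_link_tables(counts):
    """Rows per person when each attribute gets its own link table."""
    return sum(counts.values())


counts = {"restaurants": 10, "jobs": 2, "houses": 3}
print(rows_single_table(counts))  # 10 * 2 * 3
print(rows_link_tables(counts))   # 10 + 2 + 3

# Adding a fourth attribute multiplies one number and only adds to the other.
counts["cars"] = 4
print(rows_single_table(counts), rows_link_tables(counts))
```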