In every tutorial on entity relationship diagrams, I read that specifying a fixed cardinality for a relationship is not allowed. Only an informal comment on the ERD may clarify that the number of pilots is exactly 2.
So, for example, a relationship between flights and pilots where each flight has exactly 2 pilots present, would have to be represented as:
<flight> 0..N <------> 1..N <pilot>
rather than
<flight> 0..N <------> 2 <pilot>
My notation is: 0..N = optional, many; 1..N = mandatory, many; 1 = mandatory, one.
Is this restriction universal? What's the reason behind it?
EDIT: clarified my notation.
EDIT: I can see how two relationships would enforce the same constraint:
0..N <------> 1
<flight> <pilot>
0..N <------> 1
But then a query to see if a pilot is on a given flight becomes really ugly, as you'll have to check each of two attributes. And if the number of attributes grows (say, to 15 flight attendants), the queries become completely unmanageable and the schema only barely manageable.
The other responses have provided a few valuable pieces of the answer. Two more pieces need to be added:
First, ER modeling is more than just an ERD. We tend to try to put the entire ER model on one diagram. But complete ER modeling is a lot more than what will fit on a single diagram. There can be business rules that limit the cardinality of a relationship to no less than 10 and no more than 15. But it's important to realize that these must be "Business rules" (i.e. subject matter rules) and not design restrictions imposed for practical reasons. A complete ER model can include all of these business rules on the data and these can be expressed, if necessary, in plain English.
The notation 10..15 is to be preferred because it's more concise, unless more detail is needed to clarify the rule, such as the reason why the rule exists.
The above hints at the second point that needs to be made. It's the difference between analysis and design. If ER modeling is used in the classical manner, it's a tool for data analysis and not a tool for database design. By "data analysis", I mean problem analysis from a data centric point of view. Distinguishing between analysis and design, between features of the problem and features of the solution, is something that is not taught enough in formal CS or IT education. It's absolutely critical to getting things right.
And even those of us who are aware of the difference sometimes slip up and slide features of the solution into the definition of the problem. This is known as "thinking inside of the box".
If you want to diagram the database design, don't use an ERD. Use a relational schematic diagram, provided that the database you are designing is relational. A relational schematic includes features that an ERD ought not to include, like junction tables and foreign keys. Don't use ERD as "relational lite". That's not what it is.
Incidentally, another answer made the comment that an ERD ought to be implementable on any DBMS. That's a consequence of the concept I've just presented, that the ERD captures analysis and not design.
Cardinality rules are themselves only a special case of "any and all possible rules in general". The only two languages capable of expressing "any and all possible rules" are human natural language (which has the downside of often being ambiguous and imprecise no matter how hard you try) and the symbolic language of formal predicate logic.
How to use the latter in the context of data modeling is the entire topic of the superb (and highly commendable) book "Applied Mathematics for Database Professionals".
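As a rough illustration of what such a formal statement might look like, the "exactly 2 pilots per flight" rule from the question could be written as follows. The predicate names Flight and flies are assumptions made up for this sketch; they are not taken from the book.

```latex
\forall f \,\Bigl( \mathit{Flight}(f) \rightarrow
  \exists p_1\, \exists p_2 \,\bigl(
    p_1 \neq p_2
    \land \mathit{flies}(p_1, f)
    \land \mathit{flies}(p_2, f)
    \land \forall p \,\bigl( \mathit{flies}(p, f) \rightarrow (p = p_1 \lor p = p_2) \bigr)
  \bigr)
\Bigr)
```

Read aloud: every flight has two distinct pilots flying it, and anyone who flies it is one of those two.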
Halpin's ORM is an attempt to come up with a modeling language that can cover (i.e. has graphical symbols to express) more kinds of business rules than E/R can. For example, it has symbols for expressing acyclic graph constraints ("no person is an ancestor of himself"). But even this language cannot express everything, and must necessarily resort to a final class of constraints it calls "others", and which can only be described using natural language.
It's a language problem. If you devise a language with an extremely small number of symbols (rectangle, connecting line, zero, one, and crowfoot), that can be combined in only an extremely small number of ways, then you cannot expect such a language to be able to express just anything conceivable.
You have unique indexes (like primary keys) and non-unique indexes, which allow either one record or multiple records per key value, depending on how your unique keys are defined.
Then you have nullable fields and fields forced to have a value; here you may have 0 or 1 values.
Combine these two things and you should see why they always say "0 or more".
An ERD cannot always enforce rules. Rules can usually be enforced by the DB design, but that will always need another layer in software (or at least stored procedures).
By the way, regarding your example: if you always have 2 pilots on every flight and you want to enforce this rule through the DB design, you can simply create 2 relationships from the Pilots table to the Flight table. Yes, you will have 2 foreign keys to the same table; an ERD allows that.
E-R diagrams provide a way to indicate constraints on the number of times each entity participates in a relationship.
An edge between an entity set and a binary relationship set can have an associated minimum and maximum cardinality, written as min..max.
A min value of 1 indicates total participation. A max value of 1 indicates at most one participation; a max of * or N indicates no upper limit.
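The min..max rules just described can be sketched in a few lines of Python. The class and method names here (Cardinality, allows, total_participation) are hypothetical and exist only for this illustration; None stands in for the "*" or "N" upper bound.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """A min..max participation constraint (hypothetical helper)."""
    minimum: int
    maximum: Optional[int]  # None means "*" / "N": no upper limit

    @property
    def total_participation(self) -> bool:
        # A minimum of at least 1 means every entity must participate.
        return self.minimum >= 1

    def allows(self, count: int) -> bool:
        # True when `count` participations satisfy the constraint.
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


optional_many = Cardinality(0, None)   # 0..N
mandatory_many = Cardinality(1, None)  # 1..N
mandatory_one = Cardinality(1, 1)      # 1
exactly_two = Cardinality(2, 2)        # the fixed cardinality asked about
```

Seen this way, a fixed cardinality such as 2..2 is just another pair of bounds; the question is only whether a given diagram notation has a symbol for it.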
Now coming to your question. Your modeling is wrong to begin with because a pilot would be in many flights and not one as you seem to imply.
Also, if each flight has exactly 2 pilots and both are required, then this is not modeled as a 1-N relationship but as 2 attributes of a flight (the pilot IDs), since they would never be null or optional (you cannot have a flight without 2 pilots).
So an upper limit of a constant number shows some problem in the design which is not generic enough to model your system.
Not a direct answer to your question, but you might be interested how to enforce this kind of constraint in an actual database...
Let's say you can have at most 2 pilots on a flight. You can simply create two "0..N to 0..1" relationships:
And if you want exactly 2 pilots, just make PILOT1_ID and PILOT2_ID NOT NULL (and ensure they are different, through a CHECK).
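A minimal sketch of the two-column approach, using Python's built-in sqlite3 module rather than Oracle so that it can be run anywhere. Only PILOT1_ID and PILOT2_ID come from the text; the table names, the other column names, and the helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE PILOT (
    PILOT_ID INTEGER PRIMARY KEY
);
CREATE TABLE FLIGHT (
    FLIGHT_ID INTEGER PRIMARY KEY,
    PILOT1_ID INTEGER NOT NULL REFERENCES PILOT (PILOT_ID),
    PILOT2_ID INTEGER NOT NULL REFERENCES PILOT (PILOT_ID),
    CHECK (PILOT1_ID <> PILOT2_ID)  -- the two pilots must be different
);
""")
conn.executemany("INSERT INTO PILOT (PILOT_ID) VALUES (?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO FLIGHT VALUES (100, 1, 2)")


def insert_is_rejected(flight_id, pilot1_id, pilot2_id):
    """Return True if the database refuses the row."""
    try:
        conn.execute(
            "INSERT INTO FLIGHT VALUES (?, ?, ?)",
            (flight_id, pilot1_id, pilot2_id),
        )
    except sqlite3.IntegrityError:
        return True
    return False


def pilot_is_on_flight(pilot_id, flight_id):
    """Membership test: both pilot columns have to be inspected."""
    row = conn.execute(
        "SELECT 1 FROM FLIGHT"
        " WHERE FLIGHT_ID = ? AND ? IN (PILOT1_ID, PILOT2_ID)",
        (flight_id, pilot_id),
    ).fetchone()
    return row is not None
```

The `? IN (PILOT1_ID, PILOT2_ID)` form keeps the membership query compact for two columns, but it still grows with every extra column, which is the concern raised in the question.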
However, as you already noted, this quickly becomes unwieldy for greater cardinalities, so a different technique is warranted. Let's say you need to limit the number of flight attendants to 15, you can do it like this...
...with the following constraint on the junction table:
CHECK (POSITION BETWEEN 1 AND 15)
Note that there is a UNIQUE constraint on {FLIGHT_ID, POSITION}, as denoted by U1 in the diagram above.
Essentially, we are pigeon-holing attendants to specific positions per each flight. No two attendants can occupy the same position for the same flight (thanks to the U1), so since there are exactly 15 positions per flight, there cannot be more than 15 attendants per flight.
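The junction table could be sketched like this, again with Python's sqlite3 module standing in for the real DBMS. FLIGHT_ATTENDANT, FLIGHT_ID, and POSITION are the names used in this answer; ATTENDANT_ID, the primary key choice, and the try_assign helper are assumptions, and the foreign keys to the parent tables are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FLIGHT_ATTENDANT (
    FLIGHT_ID    INTEGER NOT NULL,
    ATTENDANT_ID INTEGER NOT NULL,
    POSITION     INTEGER NOT NULL CHECK (POSITION BETWEEN 1 AND 15),
    PRIMARY KEY (FLIGHT_ID, ATTENDANT_ID),
    UNIQUE (FLIGHT_ID, POSITION)  -- the U1 constraint
);
""")


def try_assign(flight_id, attendant_id, position):
    """Return True if the assignment was accepted by the constraints."""
    try:
        conn.execute(
            "INSERT INTO FLIGHT_ATTENDANT (FLIGHT_ID, ATTENDANT_ID, POSITION)"
            " VALUES (?, ?, ?)",
            (flight_id, attendant_id, position),
        )
    except sqlite3.IntegrityError:
        return False
    return True


# Fill all 15 positions on flight 1 with attendants 101..115.
filled = [try_assign(1, 100 + p, p) for p in range(1, 16)]
```

Once every position is taken, a sixteenth attendant has nowhere to go: position 16 fails the CHECK, and any position from 1 to 15 fails the UNIQUE constraint.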
Unfortunately, there is no good way to force all positions to be filled, so there can still be fewer than 15 attendants per flight. If that's important, you'll likely have to enforce it from the application code.
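One way the application-side check might look is sketched below, assuming the FLIGHT_ATTENDANT table described above and Python's sqlite3 module. The function name is_fully_staffed and the ATTENDANT_ID column are hypothetical.

```python
import sqlite3

REQUIRED_ATTENDANTS = 15  # the limit used in the example above


def is_fully_staffed(conn, flight_id, required=REQUIRED_ATTENDANTS):
    """Return True only when exactly `required` positions are filled."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM FLIGHT_ATTENDANT WHERE FLIGHT_ID = ?",
        (flight_id,),
    ).fetchone()
    return count == required


# Minimal in-memory table so the sketch can be run on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FLIGHT_ATTENDANT ("
    " FLIGHT_ID INTEGER NOT NULL,"
    " ATTENDANT_ID INTEGER NOT NULL,"
    " POSITION INTEGER NOT NULL CHECK (POSITION BETWEEN 1 AND 15),"
    " UNIQUE (FLIGHT_ID, POSITION))"
)
conn.executemany(
    "INSERT INTO FLIGHT_ATTENDANT VALUES (1, ?, ?)",
    [(100 + p, p) for p in range(1, 15)],  # only 14 of the 15 positions
)
```

The application would call such a check at whatever point the business rule has to hold, for example before a flight is marked ready for departure.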
--- EDIT ---
To look for a free position (to use for the next INSERT), even if it's a "hole" between two already-filled positions, you can do something like this (replace 1 with the desired FLIGHT_ID):
SELECT DISTINCT *
FROM (
    SELECT POSITION + 1 FREE_POSITION
    FROM FLIGHT_ATTENDANT
    WHERE FLIGHT_ID = 1
    UNION ALL
    SELECT POSITION - 1 FREE_POSITION
    FROM FLIGHT_ATTENDANT
    WHERE FLIGHT_ID = 1
)
WHERE
    FREE_POSITION NOT IN (
        SELECT POSITION
        FROM FLIGHT_ATTENDANT
        WHERE FLIGHT_ID = 1
    )
ORDER BY FREE_POSITION;
This query could have the following results:
- It could return no rows, indicating all positions are free, so just pick any one within the valid range.
- It could return rows, first and last of which may be out of the valid range, so ignore them if they are. Use one of the remaining rows. If there are no remaining rows, this is the indication all positions have been filled.
Here is a working SQL Fiddle example under Oracle, but the same technique should be applicable to any DBMS. There are more elegant, DBMS-specific ways of doing these kinds of queries (especially if you want all free positions, not just some), but even this "generic" solution should be more than enough for most practical purposes.
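For the "all free positions" case, one possible variant is to generate the full range 1..15 and subtract the occupied positions. The sketch below uses a recursive common table expression through Python's sqlite3 module; it is not the query from the SQL Fiddle, and the function name free_positions, the ALL_POSITIONS name, and the ATTENDANT_ID column are assumptions.

```python
import sqlite3


def free_positions(conn, flight_id, max_position=15):
    """Return every unoccupied position for a flight, in ascending order."""
    sql = """
    WITH RECURSIVE ALL_POSITIONS(N) AS (
        SELECT 1
        UNION ALL
        SELECT N + 1 FROM ALL_POSITIONS WHERE N < ?
    )
    SELECT N
    FROM ALL_POSITIONS
    WHERE N NOT IN (
        SELECT POSITION FROM FLIGHT_ATTENDANT WHERE FLIGHT_ID = ?
    )
    ORDER BY N
    """
    return [row[0] for row in conn.execute(sql, (max_position, flight_id))]


# Minimal in-memory table so the sketch can be run on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FLIGHT_ATTENDANT ("
    " FLIGHT_ID INTEGER NOT NULL,"
    " ATTENDANT_ID INTEGER NOT NULL,"
    " POSITION INTEGER NOT NULL CHECK (POSITION BETWEEN 1 AND 15),"
    " UNIQUE (FLIGHT_ID, POSITION))"
)
conn.executemany(
    "INSERT INTO FLIGHT_ATTENDANT VALUES (1, ?, ?)",
    [(101, 1), (102, 2), (104, 4)],  # position 3 is a "hole"
)
```

Unlike the neighbour-based query, this version never returns out-of-range values, and an empty result always means the flight is full.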
I've experienced a similar problem. My scenario was having two football teams partaking in one fixture. So the two tables were "Team" and "Fixture". In the fixture table, I just had a "team_home" column and a "team_away" column. I'm not sure if this is the best or even correct way to do it, but it worked for me.
How did you implement your solution, max?
Source: https://stackoverflow.com/questions/12445467/why-is-a-specific-cardinality-not-allowed-in-the-erd