I often see database designs like this:
Case 1:
UserTable
--id [auto-increment]
--UserName
--Password
--Em
In Case 1: Why not use the UserName field as the primary key (PK)? Why use another field such as id (which is auto-incremented) as the PK?
The UserTable.UserName has intrinsic meaning in this data model and is called a "natural key". The UserTable.id, on the other hand, is a "surrogate key".
If there is a natural key in your model, you cannot eliminate it with the surrogate key; you can only supplement it. So the question is: do you use just the natural key, or both the natural and the surrogate key? Both strategies are valid, and each has its pros and cons.
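As a rough illustration of the two strategies, here is a minimal sketch using Python's built-in sqlite3 module. The table names UserNatural and UserSurrogate and the helper try_duplicate are hypothetical, chosen only for this example; the point is that the surrogate design still needs a UNIQUE constraint on UserName, so the natural key does not go away.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Strategy A (hypothetical table name): the natural key alone is the PK.
conn.execute("""
    CREATE TABLE UserNatural (
        UserName TEXT PRIMARY KEY,
        Password TEXT NOT NULL
    )
""")

# Strategy B (hypothetical table name): a surrogate PK, with the natural
# key still enforced through a UNIQUE constraint.
conn.execute("""
    CREATE TABLE UserSurrogate (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        UserName TEXT NOT NULL UNIQUE,
        Password TEXT NOT NULL
    )
""")

def try_duplicate(table):
    """Insert the same UserName twice; return True if the second insert is rejected."""
    conn.execute(f"INSERT INTO {table} (UserName, Password) VALUES ('alice', 'x')")
    try:
        conn.execute(f"INSERT INTO {table} (UserName, Password) VALUES ('alice', 'y')")
    except sqlite3.IntegrityError:
        return True
    return False

rejected_natural = try_duplicate("UserNatural")
rejected_surrogate = try_duplicate("UserSurrogate")
```

In both designs the duplicate UserName is rejected; the difference is only in which key other tables would reference.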
Typical reasons for surrogate key:
On the other hand:
In case of just UserName and Email, why not use Email as PK?
The designer probably wanted to avoid the ON UPDATE CASCADE that would be necessary on every referencing foreign key if a user changed their e-mail address.
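To make that cost concrete, here is a small sketch (again with sqlite3) of what happens when Email is the PK and a child table references it. The column layout of UserRoleTable and the sample addresses are assumptions made for this example, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Assumed layout: Email is the PK, and the child table references it.
conn.execute(
    "CREATE TABLE UserTable (Email TEXT PRIMARY KEY, UserName TEXT NOT NULL UNIQUE)"
)
conn.execute("""
    CREATE TABLE UserRoleTable (
        Email  TEXT    NOT NULL REFERENCES UserTable (Email) ON UPDATE CASCADE,
        RoleID INTEGER NOT NULL,
        PRIMARY KEY (Email, RoleID)
    )
""")
conn.execute("INSERT INTO UserTable VALUES ('old@example.com', 'alice')")
conn.execute("INSERT INTO UserRoleTable VALUES ('old@example.com', 1)")

# Changing the e-mail rewrites the key in every referencing row as well.
conn.execute("UPDATE UserTable SET Email = 'new@example.com' WHERE UserName = 'alice'")

child_emails = [row[0] for row in conn.execute("SELECT Email FROM UserRoleTable")]
```

Without the ON UPDATE CASCADE clause, the UPDATE would fail with a foreign key violation. With a surrogate id as the PK, the e-mail change would touch only one row in UserTable.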
In Case 2: In the UserRoleTable, why not use both UserName and RoleID as PK?
If there cannot be multiple connections for the same user/role pair, you have to have a key on that in any case.
Unless there are child tables with FKs referencing UserRoleTable, or an unfriendly ORM is used, there is no reason for an additional surrogate PK.
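A minimal sketch of the junction table with a composite PK and no surrogate, assuming it holds just UserName and RoleID as the question suggests (the column types and sample values are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The composite PK is the key on the user/role pair; no extra id column.
conn.execute("""
    CREATE TABLE UserRoleTable (
        UserName TEXT    NOT NULL,
        RoleID   INTEGER NOT NULL,
        PRIMARY KEY (UserName, RoleID)
    )
""")
conn.execute("INSERT INTO UserRoleTable VALUES ('alice', 1)")
conn.execute("INSERT INTO UserRoleTable VALUES ('alice', 2)")  # same user, other role: allowed

try:
    conn.execute("INSERT INTO UserRoleTable VALUES ('alice', 1)")  # repeated pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM UserRoleTable").fetchone()[0]
```

Adding a surrogate id here would not remove the need for this constraint; it would only add a second index to maintain.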
1 And if clustering is used, the secondary index under the natural key may be extra "fat" (since it contains a copy of the clustering key, which is typically the PK) and may require a double lookup when querying (since rows in a clustered table don't have stable physical locations, they must be located through the clustering key, barring some DBMS-specific optimizations such as Oracle's "rowid guesses").
2 E.g. you wouldn't be able to find UserName just by reading the junction table - you'd have to JOIN it with the UserTable.
3 Surrogates are typically ordered in a way that is not meaningful to client applications. An auto-increment surrogate key's order depends on the order of INSERTs, and queries rarely ask for a "range of users by their order of insertion". Some surrogates, such as GUIDs, may be more or less randomly ordered.