When to use one field as primary key instead of 2?

借酒劲吻你 2020-12-12 03:11

I often see database designs like this:

Case 1:

UserTable

--id[auto increase]

--UserName

--Password

--Em

4 answers
  • 2020-12-12 03:30

    I can think of several reasons in your example for using a surrogate primary key (Id) over the username.

    1. The id field would rarely, if ever, be updated. If username were the primary key, you would have to cascade every update to all tables where username is used as a foreign key.
    2. Performance. An integer comparison is generally faster than a string comparison.
    3. The id key takes up less storage space wherever it is a foreign key in other tables.
    4. The id field lets you avoid exposing potentially sensitive data. E.g. consider a web app URL domain/posts/user/1242 vs. domain/posts/user/myusername.

    For your second question, it would be better to use the user id than the username in UserTableRole. Whether it is then better to also include a surrogate key for this many-to-many table is a matter of opinion. I hate using surrogate id keys for many-to-many tables and usually just make a compound primary key of the two foreign key ids. The only time I would consider a surrogate key here is if I needed to use it as a foreign key in yet another table.
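
    The layout described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table and column names (user, role, user_role) are hypothetical, and other DBMSs spell auto-increment differently.

```python
import sqlite3

# Hypothetical schema; all table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE user (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    username TEXT NOT NULL UNIQUE                -- natural key, still enforced
);
CREATE TABLE role (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE user_role (
    user_id INTEGER NOT NULL REFERENCES user(id),
    role_id INTEGER NOT NULL REFERENCES role(id),
    PRIMARY KEY (user_id, role_id)               -- compound key, no extra id column
);
""")
conn.execute("INSERT INTO user (username) VALUES ('alice')")
conn.execute("INSERT INTO role (name) VALUES ('admin')")
conn.execute("INSERT INTO user_role (user_id, role_id) VALUES (1, 1)")


def assign_again():
    """Try to insert the same user/role pair a second time."""
    try:
        conn.execute("INSERT INTO user_role (user_id, role_id) VALUES (1, 1)")
        return "inserted"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_result = assign_again()
print(duplicate_result)
```

    The compound primary key doubles as the uniqueness rule for the pair, so the second insert is rejected without any separate id column.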

  • 2020-12-12 03:34

    One reason I can think of for not using things like UserName as the primary key is that they could be subject to change. Having anything that's exposed to the outside world as a primary key runs the risk of those things being changed, and it's best to have a stable primary key.

    What if the user changes an email or username; do you really want to change your keys in all your relationships? IMO, it's best to have a stable key that never sees the outside world, about which everyone knows nothing, and therefore which can remain stable regardless of what changes may occur in your database.
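
    As a rough sketch of that stability argument (the post table and all names here are hypothetical, using Python's built-in sqlite3): when child rows reference the surrogate id, renaming a user touches exactly one row and no relationship has to change.

```python
import sqlite3

# Hypothetical tables; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE post (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),  -- points at the stable key
    title   TEXT
);
INSERT INTO user (id, username) VALUES (1, 'old_name');
INSERT INTO post (user_id, title) VALUES (1, 'first'), (1, 'second');
""")

# The rename updates a single row in user; post is not touched at all.
cur = conn.execute("UPDATE user SET username = 'new_name' WHERE id = 1")
rows_touched = cur.rowcount

# The posts are still reachable through the unchanged id.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT p.title FROM post p JOIN user u ON u.id = p.user_id "
        "WHERE u.username = 'new_name' ORDER BY p.id"
    )
]
print(rows_touched, titles)
```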

  • 2020-12-12 03:45

    Your question is essentially about the advantages and disadvantages of natural vs. surrogate keys.

    Flexibility is the primary concern: with a surrogate key, users can change their username much more easily. It is also possible that you may need to allow duplicate usernames in the future, e.g. after a merger.

    Speed is another concern: on a frequently accessed table like the user table, it's generally faster to join on integers than on strings.

    Another is table size: when a key is used as a foreign key, every referencing row has to store the whole key value. Surrogates are very compact and usually much smaller than natural keys.

    Most ORMs also require a surrogate key because it provides consistency between tables.

    Also, on many systems, it may not necessarily be safe to assume that email is unique.

    I agree, though, that in a relationship table like UserRole, it's generally best to use a composite primary key made up of the foreign keys.

  • 2020-12-12 03:53

    In Case 1: Why not use the UserName field as the primary key (PK)? Why use another field like id [which is auto-incremented] as the PK?

    The UserTable.UserName has intrinsic meaning in this data model and is called a "natural key". The UserTable.id, on the other hand, is a "surrogate key".

    If there is a natural key in your model, you cannot eliminate it with the surrogate key; you can only supplement it. So the question is: do you use just the natural key, or both the natural and the surrogate key? Both strategies are valid and have their pros and cons.

    Typical reasons for surrogate key:

    • To keep FKs in child tables slimmer (integer vs. string in this case), for smaller storage and better caching.
    • Avoid the need for ON UPDATE CASCADE.
    • Friendliness toward ORM tools.

    On the other hand:

    • You now have two keys instead of one, requiring an extra index, making the parent table larger and less cache-friendly, and slowing down INSERT/UPDATE/DELETE due to index maintenance. [1]
    • May require more JOIN-ing. [2]
    • And may not play well with clustering. [3]

    In case of just UserName and Email, why not use Email as PK?

    The designer probably wanted to avoid the ON UPDATE CASCADE that would be necessary if a user changed their e-mail.
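
    For illustration, here is what that cascade looks like when the e-mail is the primary key. This is a minimal sketch using Python's built-in sqlite3; the account and login_event tables are hypothetical, and SQLite only enforces foreign keys once the pragma is switched on.

```python
import sqlite3

# Hypothetical tables; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK actions in SQLite
conn.executescript("""
CREATE TABLE account (
    email TEXT PRIMARY KEY                 -- natural key used as the PK
);
CREATE TABLE login_event (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL
          REFERENCES account(email) ON UPDATE CASCADE
);
INSERT INTO account VALUES ('a@example.com');
INSERT INTO login_event (email) VALUES ('a@example.com'), ('a@example.com');
""")

# Changing the key in the parent rewrites every referencing child row.
conn.execute(
    "UPDATE account SET email = 'b@example.com' WHERE email = 'a@example.com'"
)
child_emails = [
    row[0] for row in conn.execute("SELECT email FROM login_event ORDER BY id")
]
print(child_emails)
```

    One logical change to the parent ends up rewriting every child row, which is the cost a surrogate key avoids.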

    In Case 2: In the UserRoleTable, why not use both UserName and RoleID as PK?

    If there cannot be multiple connections for the same user/role pair, you have to have a key on that in any case.

    Unless there are child tables with FKs referencing UserTableRole or an unfriendly ORM is used, there is no reason for an additional surrogate PK.


    [1] And if clustering is used, the secondary index under the natural key may be extra "fat" (since it contains a copy of the clustering key, which is typically the PK) and require a double lookup when querying (since rows in a clustered table don't have stable physical locations, so they must be located through the clustering key, barring some DBMS-specific optimizations such as Oracle's "rowid guesses").

    [2] E.g. you wouldn't be able to find UserName just by reading the junction table; you'd have to JOIN it with the UserTable.

    [3] Surrogates are typically ordered in a way that is not meaningful to the client applications. The auto-increment surrogate key's order depends on the order of INSERTs, and querying is not typically done on a "range of users by their order of insertion". Some surrogates, such as GUIDs, may be more or less randomly ordered.
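
    The extra JOIN mentioned in the second footnote can be sketched as follows, using Python's built-in sqlite3. Both junction tables and all names are hypothetical; they exist side by side here only so the two queries can be compared.

```python
import sqlite3

# Hypothetical tables; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
-- Variant A: junction table keyed on the natural key
CREATE TABLE user_role_natural (
    username TEXT    NOT NULL REFERENCES user(username),
    role_id  INTEGER NOT NULL,
    PRIMARY KEY (username, role_id)
);
-- Variant B: junction table keyed on the surrogate key
CREATE TABLE user_role_surrogate (
    user_id INTEGER NOT NULL REFERENCES user(id),
    role_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, role_id)
);
INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
INSERT INTO user_role_natural VALUES ('alice', 7), ('bob', 7);
INSERT INTO user_role_surrogate VALUES (1, 7), (2, 7);
""")

# Variant A: the usernames are readable straight from the junction table.
names_a = [
    row[0]
    for row in conn.execute(
        "SELECT username FROM user_role_natural "
        "WHERE role_id = 7 ORDER BY username"
    )
]

# Variant B: the same answer needs a JOIN back to the user table.
names_b = [
    row[0]
    for row in conn.execute(
        "SELECT u.username FROM user_role_surrogate ur "
        "JOIN user u ON u.id = ur.user_id "
        "WHERE ur.role_id = 7 ORDER BY u.username"
    )
]
print(names_a, names_b)
```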
