Should I have a dedicated primary key field?

温柔的废话 2020-11-30 10:05

I'm designing a small SQL database to be used by a web application.

Let's say a particular table has a Name field for which no two rows will be allowed to have the

11 Answers
  • 2020-11-30 10:26

    I would use a generated PK myself, just for the reasons you mentioned. Also, indexing and comparing integers is faster than indexing and comparing strings. You can put a unique index on the name field too, without making it the primary key.
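
    A minimal sketch of that layout, assuming SQLite through Python's built-in sqlite3 module (the question does not name a database engine). The table name item and the index name item_name_uq are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE item (
        id   INTEGER PRIMARY KEY,   -- generated surrogate key
        name TEXT NOT NULL
    )
    """
)
# Uniqueness of the name is enforced separately from the primary key.
conn.execute("CREATE UNIQUE INDEX item_name_uq ON item (name)")

conn.execute("INSERT INTO item (name) VALUES ('widget')")

# A second row with the same name is rejected by the unique index.
try:
    conn.execute("INSERT INTO item (name) VALUES ('widget')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Renaming the row leaves its id, and anything referencing it, untouched.
conn.execute("UPDATE item SET name = 'gadget' WHERE name = 'widget'")
rows = conn.execute("SELECT id, name FROM item").fetchall()
```

    The rename changes only the name column, so rows in other tables that point at the id would not need to be updated.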

  • 2020-11-30 10:26

    If your name column will be changing, it isn't really a good candidate for a primary key. A primary key should define a unique row of a table. If it can be changed, it's not really doing that. Without knowing more specifics about your system I can't say, but this might be a good time for a surrogate key.

    I'll also add this in hopes of dispelling the myths of using auto-incrementing integers for all of your primary keys. It is NOT always a performance gain to use them. In fact, quite often it's the exact opposite. If you have an auto-incrementing column that means that every INSERT in the system now has that added overhead of generating a new value.

    Also, as Mark points out, if every table has a surrogate ID and you have a chain of related tables, getting from one end of the chain to the other may mean joining every table in between. With natural primary keys that is usually not the case. Joining six tables on integers is usually going to be slower than joining two tables on a string.
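
    To make the join-chain point concrete, here is a rough sketch using SQLite through Python's sqlite3 module. The country/region/city tables and their _n and _s suffixes (natural and surrogate) are invented for this example and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Natural keys: each child row carries its ancestors' keys.
    CREATE TABLE country_n (code TEXT PRIMARY KEY);
    CREATE TABLE region_n (
        country_code TEXT REFERENCES country_n (code),
        region_code  TEXT,
        PRIMARY KEY (country_code, region_code)
    );
    CREATE TABLE city_n (
        country_code TEXT,
        region_code  TEXT,
        name         TEXT,
        PRIMARY KEY (country_code, region_code, name),
        FOREIGN KEY (country_code, region_code)
            REFERENCES region_n (country_code, region_code)
    );

    -- Surrogate keys: each child row knows only its parent's id.
    CREATE TABLE country_s (id INTEGER PRIMARY KEY, code TEXT UNIQUE);
    CREATE TABLE region_s (
        id         INTEGER PRIMARY KEY,
        country_id INTEGER REFERENCES country_s (id),
        code       TEXT
    );
    CREATE TABLE city_s (
        id        INTEGER PRIMARY KEY,
        region_id INTEGER REFERENCES region_s (id),
        name      TEXT
    );

    INSERT INTO country_n VALUES ('US');
    INSERT INTO region_n  VALUES ('US', 'CA');
    INSERT INTO city_n    VALUES ('US', 'CA', 'Fresno');

    INSERT INTO country_s VALUES (1, 'US');
    INSERT INTO region_s  VALUES (1, 1, 'CA');
    INSERT INTO city_s    VALUES (1, 1, 'Fresno');
    """
)

# Natural keys: the country code is already on the city row, so no join.
natural = conn.execute(
    "SELECT name FROM city_n WHERE country_code = 'US'"
).fetchall()

# Surrogate keys: reaching the country code means walking the whole chain.
surrogate = conn.execute(
    """
    SELECT city_s.name
    FROM city_s
    JOIN region_s  ON region_s.id  = city_s.region_id
    JOIN country_s ON country_s.id = region_s.country_id
    WHERE country_s.code = 'US'
    """
).fetchall()
```

    Both queries return the same city. The sketch shows only the shape of the queries and says nothing about their relative speed on real data.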

    You also often lose the ability to do set-based operations when you have auto-incrementing IDs on all of your tables. Instead of inserting 1000 rows into a parent table and then inserting 5000 rows into a child table, you now have to insert the parent rows one at a time in a cursor or some other loop just to get the generated IDs so that you can assign them to the related children. I've seen a 30-second process turned into a 20-minute process because someone insisted on using auto-incrementing IDs on all of the tables in a database.
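
    The row-at-a-time pattern described above looks roughly like this. It is a sketch assuming SQLite via Python's sqlite3 module; the parent/child tables and the incoming dictionary are hypothetical stand-ins for real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent (id),
        label     TEXT
    );
    """
)

# Hypothetical incoming data: parent name -> labels of its children.
incoming = {"alpha": ["a1", "a2"], "beta": ["b1"]}

child_rows = []
for name, labels in incoming.items():
    # One INSERT per parent, so the generated id can be read back.
    cur = conn.execute("INSERT INTO parent (name) VALUES (?)", (name,))
    child_rows.extend((cur.lastrowid, label) for label in labels)

# Once the ids are known, the children can be inserted as one batch.
conn.executemany(
    "INSERT INTO child (parent_id, label) VALUES (?, ?)", child_rows
)

result = conn.execute(
    "SELECT parent.name, child.label FROM child "
    "JOIN parent ON parent.id = child.parent_id ORDER BY child.id"
).fetchall()
```

    Whether an engine can hand back generated keys for a multi-row insert varies, so it is worth checking your database before assuming the loop is unavoidable.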

    Finally (at least for reasons I'm listing here - there are certainly others), using auto-incrementing IDs on all of your tables promotes poor design. When the designer no longer has to think about what a natural key might be for a table it usually results in erroneous duplicates ending up in the data. You can try to avoid the problem with unique indexes, but in my experience developers and designers don't go through that extra effort and after a year of using their new system they find that the data is a mess because the database didn't have proper constraints on the data through natural keys.

    There's certainly a time for using surrogate keys, but using them blindly on all tables is almost always a mistake.

  • 2020-11-30 10:29

    Yes - and as a rule of thumb, always, for every table.

    You should definitely not use a changeable field as a primary key and in the vast majority of circumstances you don't want to use a field that has any other purpose as a primary key.

    This is basic good practice for db schemas.

  • 2020-11-30 10:31

    The primary key for a record must be unique and permanent. If a record naturally has a simple key which fulfills both of those, then use it. However, such keys don't come around very often. For a person record, the person's name is neither unique nor permanent, so you pretty much have to use an auto-increment.

    The one place where natural keys do work is on a code table, for example, a table mapping a status value to its description. There is little sense in giving "Active" a primary key of 1, "Delayed" a primary key of 2, etc., when it is just as easy to give "Active" a primary key of "ACT"; "Delayed", "DLY"; "On Hold", "HLD"; and so on.
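
    A small sketch of such a code table, assuming SQLite via Python's sqlite3 module. The status codes come from the paragraph above; the task table that references them is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE status (
        code        TEXT PRIMARY KEY,   -- short natural key
        description TEXT NOT NULL
    );
    CREATE TABLE task (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        status_code TEXT NOT NULL REFERENCES status (code)
    );

    INSERT INTO status VALUES
        ('ACT', 'Active'), ('DLY', 'Delayed'), ('HLD', 'On Hold');
    INSERT INTO task (title, status_code) VALUES
        ('Write report', 'ACT'), ('Ship order', 'HLD');
    """
)

# The code is readable on its own, so simple filters need no join to status.
on_hold = conn.execute(
    "SELECT title FROM task WHERE status_code = 'HLD'"
).fetchall()
```

    The status table is still there for descriptions and referential checks, but everyday queries on task can use the code directly.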

    Note also that some say you should use integers over strings because they compare faster. Not really true. Comparing two 4-byte character fields will take exactly as long as comparing two 4-byte integer fields. Longer strings will, of course, take longer, but if you keep the codes short, there's no difference.

  • 2020-11-30 10:39

    In addition to everything already said, consider using a UUID as the PK. It will allow you to create keys that are unique across multiple databases.

    If you ever need to export data or merge it with another database, the keys will stay unique and relationships can be easily maintained.
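
    As an illustration of the merge scenario, here is a sketch using Python's uuid and sqlite3 modules with two in-memory databases. The customer/purchase schema and the helper names make_db and add_customer_with_purchase are invented for this example. Random UUIDs make collisions extremely unlikely rather than strictly impossible.

```python
import sqlite3
import uuid


def make_db():
    """Create an in-memory database with UUID (text) primary keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE purchase (
            id          TEXT PRIMARY KEY,
            customer_id TEXT REFERENCES customer (id),
            item        TEXT
        );
        """
    )
    return conn


def add_customer_with_purchase(conn, name, item):
    """Insert a customer and one purchase, generating both keys client-side."""
    customer_id = str(uuid.uuid4())
    conn.execute("INSERT INTO customer VALUES (?, ?)", (customer_id, name))
    conn.execute(
        "INSERT INTO purchase VALUES (?, ?, ?)",
        (str(uuid.uuid4()), customer_id, item),
    )


db_a, db_b = make_db(), make_db()
add_customer_with_purchase(db_a, "Ann", "book")
add_customer_with_purchase(db_b, "Bob", "lamp")

# Merge: copy rows from db_b into db_a unchanged, with no key renumbering.
for table in ("customer", "purchase"):
    rows = db_b.execute(f"SELECT * FROM {table}").fetchall()
    placeholders = ", ".join("?" * len(rows[0]))
    db_a.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

merged = db_a.execute(
    "SELECT customer.name, purchase.item FROM purchase "
    "JOIN customer ON customer.id = purchase.customer_id "
    "ORDER BY customer.name"
).fetchall()
```

    With auto-incrementing integers, both databases would likely have issued id 1, and the copied rows would have needed new keys plus matching updates to every foreign key.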

  • 2020-11-30 10:40

    The primary key must be unique for every row. An auto_increment integer is a very good idea, and if you don't have other ideas about populating the primary key, then this is the best way.
