Database Design Primay Key, ID vs String

前端 未结 5 1503
执笔经年
执笔经年 2021-02-01 16:11

I am currently planning to develop a music streaming application. And i am wondering what would be better as a primary key in my tables on the server. An ID int or a Unique Stri

相关标签:
5条回答
  • 2021-02-01 16:44

    This is in large part a matter of personal preference.

    My personal opinion and practice is to always use integer keys and to always use surrogate rather than natural keys (so never use anything like social security number or the genre name directly).

    There are cases where an auto number field is not appropriate or does not scale. In these cases it can make sense to use a GUID, which can be a string in databases that do not have a native datatype for it.

    0 讨论(0)
  • 2021-02-01 16:45

    From the software perspective the GUID is better as its unique globally.

    Quotes from: Primary Keys: IDs versus GUIDs

    Using a GUID as a row identity value feels more natural-- and certainly more truly unique-- than a 32-bit integer. Database guru Joe Celko seems to agree. GUID primary keys are a natural fit for many development scenarios, such as replication, or when you need to generate primary keys outside the database. But it's still a question of balancing the tradeoffs between traditional 4-byte integer IDs and 16-byte GUIDs:

    GUID Pros

    • Unique across every table, every database, every server
    • Allows easy merging of records from different databases
    • Allows easy distribution of databases across multiple servers
    • You can generate IDs anywhere, instead of having to roundtrip to the database
    • Most replication scenarios require GUID columns anyway

    GUID Cons

    • It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you're not careful
    • Cumbersome to debug where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}'
    • The generated GUIDs should be partially sequential for best performance (eg, newsequentialid() on SQL 2005) and to enable use of clustered indexes
    0 讨论(0)
  • 2021-02-01 16:47

    You are doing the right thing - identity field should be numeric and not string based, both for space saving and for performance reasons (matching keys on strings is slower than matching on integers).

    0 讨论(0)
  • 2021-02-01 16:50

    Is there any reason this isn't a good idea? Is there anything I should be aware of?

    Yes. Integer IDs are very bad if you need to uniquely identify the same data outside of a single database. For example, if you have to copy the same data into another database system with potentially pre-existing data or you have a distributed database. The biggest thing to be aware of is that an integer like 7481 has no meaning outside of that one database. If later on you need to grow that database, it may be impossible without surgically removing your data.

    The other thing to keep in mind is that integer IDs aren't as flexible so they can't easily be used for exceptional cases. The designers of the Internet Protocol understood this and took precautions by allocating certain blocks of numbers as "special" in one way or another (broadcast IPs, private IPs, network IPs). But that was only possible because there's a protocol surrounding the usage of those numbers. Many databases don't operate within such a well-defined protocol.

    FWIW, it's kind of like trying to decide if having a "strongly typed" programming paradigm is better than a "weakly/dynamically typed" programming paradigm. It will depend on what you need to do.

    0 讨论(0)
  • 2021-02-01 16:50

    My recommendation is: use ids.

    You'll be able to rename that "Genre" with 20000 songs without breaking anything.

    The idea behind this is that the id identifies the row in the table. Whatever the row has is something that doesn't matters in this problem.

    0 讨论(0)
提交回复
热议问题