Is Guid the best identity datatype for Databases?

眼角桃花 2020-12-24 09:13

The question comes up in connection with BI and the merging of data from different data sources, where GUIDs would make that process smoother.

And is there an optimal migration strategy from a data

8 answers
  • 2020-12-24 09:42

    Edited after reading Frans Bouma's answer, since my answer has been accepted and therefore moved to the top. Thanks, Frans.

    GUIDs do make a good unique value; however, they are not really human-readable, which can make support difficult. If you're going to use GUIDs, consider doing some performance analysis on bulk data operations before you make your choice. Bear in mind that if your primary key is "clustered", random GUIDs are a poor fit.

    This is because a clustered index determines the physical order of the rows in the table. Since GUIDs are effectively random, new rows land at arbitrary positions in that order rather than at the end, so inserts keep splitting pages to make room for the new row and the index fragments.
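
    A toy illustration of the effect, not a model of how any database engine is actually implemented: a sorted Python list stands in for the clustered index, and the hypothetical helper `count_mid_inserts` counts how many inserts land somewhere other than the end.

```python
import bisect
import uuid


def count_mid_inserts(keys):
    """Insert keys into a sorted list; count inserts that do not land at the end."""
    index = []          # stands in for the clustered index order
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos < len(index):
            mid_inserts += 1   # existing entries would have to make room
        index.insert(pos, key)
    return mid_inserts


sequential_keys = list(range(1, 1001))                  # identity-style keys
random_keys = [uuid.uuid4().int for _ in range(1000)]   # random GUIDs

sequential_mid = count_mid_inserts(sequential_keys)
random_mid = count_mid_inserts(random_keys)
print(sequential_mid, random_mid)
```

    Sequential keys always append, so the first count is zero; with random GUIDs nearly every insert falls in the middle of the existing order.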

    Personally I like to have two "keys" on my data:

    1) Primary key
    Unique, numeric values with a clustered primary key. This is my system's internal ID for each row, and is used to uniquely identify a row and in foreign keys.

    Identity can cause trouble if you're using database replication (SQL Server will add a "rowguid" column automatically for merge-replicated tables) because the identity seed is maintained per server instance, and you'd get duplicates.

    2) External Key/External ID/Business ID
    Often it is also preferable to have the additional concept of an "external ID". This is often a character field with a unique constraint (possibly including another column e.g. customer identifier).

    This would be the value used by external interfaces and would be exposed to customers (who do not recognise your internal values). This "business ID" allows customers to refer to your data using values that mean something to them.
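
    A minimal sketch of the two-key layout, using SQLite from the Python standard library purely for convenience (the answer itself is about SQL Server). The `orders` table and its column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,   -- internal numeric key, used in foreign keys
        customer_id INTEGER NOT NULL,
        external_id TEXT    NOT NULL,      -- business ID exposed to the customer
        UNIQUE (customer_id, external_id)  -- unique per customer, not globally
    )
    """
)
conn.execute("INSERT INTO orders (customer_id, external_id) VALUES (1, 'PO-1001')")
# The same business ID is allowed for a different customer.
conn.execute("INSERT INTO orders (customer_id, external_id) VALUES (2, 'PO-1001')")

try:
    # Repeating it for the same customer violates the unique constraint.
    conn.execute("INSERT INTO orders (customer_id, external_id) VALUES (1, 'PO-1001')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute(
    "SELECT id, customer_id, external_id FROM orders ORDER BY id"
).fetchall()
print(rows, duplicate_rejected)
```

    Internal joins use `id`; anything shown to or received from the customer uses `external_id`.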

  • 2020-12-24 09:42

    There is no "best" identity datatype. The various options have different strengths and weaknesses. I use GUIDs more often than not, but I have to deal regularly with disconnected clients and merge replication, so the choice is appropriate. If you don't have to deal with replication (i.e. the situation where a user adds new records while disconnected from the central database), an auto-incrementing int field is the better choice.
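
    To make the disconnected case concrete, here is a small sketch with two hypothetical sites that each hand out auto-incrementing integers independently, compared with each site generating GUIDs. The counters are stand-ins, not a model of any particular replication setup.

```python
import itertools
import uuid

# Two hypothetical disconnected sites, each with its own counter starting at 1.
site_a = itertools.count(1)
site_b = itertools.count(1)

ids_a = [next(site_a) for _ in range(3)]
ids_b = [next(site_b) for _ in range(3)]
identity_collisions = set(ids_a) & set(ids_b)   # keys that clash when merged

guids_a = [uuid.uuid4() for _ in range(3)]
guids_b = [uuid.uuid4() for _ in range(3)]
guid_collisions = set(guids_a) & set(guids_b)

print(identity_collisions, guid_collisions)
```

    Every integer key collides on merge, while the GUID sets do not overlap in practice. Giving each site its own identity range is the usual workaround if you want to stay with integers.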

  • 2020-12-24 09:44

    The following project may be of some use or at least inspire you to solve this problem.

    https://github.com/twitter/snowflake
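
    For a rough idea of the approach, here is a sketch of a Snowflake-style generator. It is not the code from that project. It assumes the commonly described 64-bit layout (millisecond timestamp, then 10 worker bits, then 12 sequence bits); the epoch constant and the class name are my own choices for illustration.

```python
import threading
import time

CUSTOM_EPOCH_MS = 1_577_836_800_000  # assumption: 2020-01-01T00:00:00Z
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeStyleGenerator:
    def __init__(self, worker_id, clock=lambda: time.time_ns() // 1_000_000):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock          # returns the current time in milliseconds
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - CUSTOM_EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


gen_a = SnowflakeStyleGenerator(worker_id=1)
gen_b = SnowflakeStyleGenerator(worker_id=2)
ids_a = [gen_a.next_id() for _ in range(1000)]
ids_b = [gen_b.next_id() for _ in range(1000)]
```

    The IDs fit in a 64-bit integer column, increase over time within each worker, and do not collide across workers as long as every worker has a distinct `worker_id`.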

  • 2020-12-24 09:45

    Anything that can uniquely identify the record is a good identity data type. A GUID is generally good, but it's not the optimal identity if you actually have a unique ID coming from the source data. A GUID is a 128-bit value that is unique for all practical purposes, but it carries no information about the record; in an integration situation, you often want to detect duplicates of information, not just match up the records.
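
    One way to get both properties, offered as a sketch rather than a recommendation: derive the GUID deterministically from the identifier the source already provides, using a name-based UUID (`uuid.uuid5`). The namespace string and the `record_key` helper are hypothetical.

```python
import uuid

# Hypothetical namespace for this integration; any fixed UUID would do.
INTEGRATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "integration.example.com")


def record_key(source_system, source_id):
    """Derive a stable GUID from the ID that the source system already provides."""
    return uuid.uuid5(INTEGRATION_NAMESPACE, f"{source_system}:{source_id}")


first_load = record_key("crm", "C-42")
second_load = record_key("crm", "C-42")       # the same record arriving again
other_source = record_key("billing", "C-42")  # same ID, different source system

print(first_load == second_load, first_load == other_source)
```

    The same source record always maps to the same key, so a repeated load shows up as a duplicate, whereas a random GUID would give it a fresh identity every time.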

  • 2020-12-24 09:48

    I used to not like GUIDs at all, but I've grown to love them. They are widely adopted and handled fairly uniformly across platforms, and I end up writing and maintaining less code with them than I otherwise would.

    It is especially useful for storage of files, where you need to guarantee that a filename is unique, in a directory with a potentially large number of files, including pre-existing files.
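
    A short sketch of the file-naming case. The helper name `save_with_unique_name` and the `.bin` extension are made up for the example; the temporary directory is only there so the snippet cleans up after itself.

```python
import os
import tempfile
import uuid


def save_with_unique_name(directory, data, extension=".bin"):
    """Write data to a new file named after a random GUID and return its path."""
    path = os.path.join(directory, uuid.uuid4().hex + extension)
    # Mode "xb" raises FileExistsError instead of overwriting an existing file.
    with open(path, "xb") as handle:
        handle.write(data)
    return path


with tempfile.TemporaryDirectory() as workdir:
    first = save_with_unique_name(workdir, b"report A")
    second = save_with_unique_name(workdir, b"report B")
    names = sorted(os.listdir(workdir))

print(names)
```

    No directory scan or counter is needed to pick a free name, which is where the saving in code comes from.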

  • 2020-12-24 09:50

    GUIDs are better in data replication scenarios. With the "identity" approach, you have to be careful not to cause collisions between the data being replicated between databases. Hope this helps.
