Which one is the best choice for primary key in SQL Server?
Here is some example code:
Uniqueidentifiers
e.g.
One thing to consider when designing your tables is whether you'll need to replicate, shard, or otherwise move your data from one place to another. Maybe the data is generated by other applications and needs to be kept in sync with yours. An example would be a mobile app that creates data and then syncs it with a server. If anything like that is or might be true, then UNIQUEIDENTIFIER would be a good choice for your primary key.
However, the UNIQUEIDENTIFIER data type performs poorly as a clustered index key: it is 16 bytes wide, and randomly generated values land all over the index instead of at the end, which causes page splits and fragmentation. Yes, you could use newsequentialid(), but that doesn't help you if the values are generated on other devices. The consensus seems to be that clustered indexes work best with a sequential and narrow data type like INT or BIGINT.
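As a rough illustration of the newsequentialid() option, here is a minimal sketch. The table, column, and constraint names are hypothetical and not from the question. Note that newsequentialid() can only be used as a column default, so it only applies when SQL Server itself generates the value.

```sql
-- Hypothetical table; all names here are illustrative assumptions.
CREATE TABLE dbo.Devices
(
    -- The default supplies a sequential GUID only when the INSERT omits DeviceId.
    DeviceId UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_Devices_DeviceId DEFAULT NEWSEQUENTIALID(),
    DeviceName NVARCHAR(100) NOT NULL,
    CONSTRAINT PK_Devices PRIMARY KEY CLUSTERED (DeviceId)
);

-- Server-generated key: the default fires and the value is sequential.
INSERT INTO dbo.Devices (DeviceName)
VALUES (N'Server-created row');

-- Client-generated key: the default is bypassed, so the value is whatever
-- the other device produced and may land anywhere in the clustered index.
INSERT INTO dbo.Devices (DeviceId, DeviceName)
VALUES ('6F9619FF-8B86-D011-B42D-00C04FC964FF', N'Device-created row');
```

The second insert is the case described above: once keys come from other devices, the sequential default no longer protects the clustered index.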
If you're not concerned with storage space, you might try combining an IDENTITY cluster key with a UNIQUEIDENTIFIER primary key. Create an IDENTITY column and use it for your clustered index (but not as the primary key). Inserts will still be made sequentially, and the column satisfies the preference for a narrow data type. You can then use a UNIQUEIDENTIFIER column as your primary key, which lets you move, replicate, and/or shard your data when you need to.
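A minimal sketch of that layout might look like the following. The table, column, constraint, and index names are hypothetical, and the choice of INT over BIGINT is just an assumption about expected row counts.

```sql
-- Hypothetical table; all names here are illustrative assumptions.
CREATE TABLE dbo.Orders
(
    -- Narrow, sequential cluster key; never exposed outside this database.
    ClusterKey INT IDENTITY(1, 1) NOT NULL,
    -- Globally unique primary key; callers may supply their own value.
    OrderId UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_Orders_OrderId DEFAULT NEWID(),
    CustomerName NVARCHAR(100) NOT NULL,
    -- NONCLUSTERED keeps the primary key from becoming the clustered index.
    CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderId)
);

-- The clustered index goes on the IDENTITY column instead.
CREATE UNIQUE CLUSTERED INDEX CIX_Orders_ClusterKey
    ON dbo.Orders (ClusterKey);
```

The key detail is the NONCLUSTERED keyword: without it, SQL Server makes the primary key the clustered index by default when no clustered index exists yet.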
The cluster key has no purpose other than to keep your inserts sequential and to be what all the nonclustered indexes point to when looking up data for a given query. The cluster key is completely throwaway and can be regenerated when data is moved, replicated, and/or sharded, since uniqueness is handled by the UNIQUEIDENTIFIER primary key.
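To illustrate the "throwaway" point, here is a rough sketch of copying rows between two databases or shards. The tables and names are hypothetical; the idea is simply that the copy leaves the cluster key out and lets the target assign fresh IDENTITY values.

```sql
-- Hypothetical source and target tables with the same shape;
-- all names here are illustrative assumptions.
CREATE TABLE dbo.NotesSource
(
    ClusterKey INT IDENTITY(1, 1) NOT NULL,
    NoteId UNIQUEIDENTIFIER NOT NULL,
    Body NVARCHAR(400) NOT NULL,
    CONSTRAINT PK_NotesSource PRIMARY KEY NONCLUSTERED (NoteId)
);
CREATE UNIQUE CLUSTERED INDEX CIX_NotesSource_ClusterKey
    ON dbo.NotesSource (ClusterKey);

CREATE TABLE dbo.NotesTarget
(
    ClusterKey INT IDENTITY(1, 1) NOT NULL,
    NoteId UNIQUEIDENTIFIER NOT NULL,
    Body NVARCHAR(400) NOT NULL,
    CONSTRAINT PK_NotesTarget PRIMARY KEY NONCLUSTERED (NoteId)
);
CREATE UNIQUE CLUSTERED INDEX CIX_NotesTarget_ClusterKey
    ON dbo.NotesTarget (ClusterKey);

-- Copy only the real data. ClusterKey is omitted, so the target table
-- generates its own values; NoteId carries identity across systems.
INSERT INTO dbo.NotesTarget (NoteId, Body)
SELECT NoteId, Body
FROM dbo.NotesSource;
```

Because nothing outside the table refers to ClusterKey, the source and target values are free to differ; foreign keys and application code would reference NoteId.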
Here is a great article that demonstrates what happens internally when using an IDENTITY vs a UNIQUEIDENTIFIER for your clustered index.