Question
How can I avoid primary key clashes in client/server databases
Background
I am synchronizing a number of databases with each other. I have one central SQL Server database and many client SQL Server databases. Now say I have a table X in each database, which has a primary key column ID.
In the first client we have the following IDs in table X:
X (client 1)
--------
1|SomeValue
2|SomeValue
3|SomeValue
4|SomeValue
and in the second client I have
X (client 2)
--------
1|SomeValue
2|SomeValue
3|SomeValue
4|SomeValue
When I synchronize, I want the clients only to upload their data, not to download anything.
Now when I synchronize the first client with the server, it will add rows with primary keys 1 through 4. However, when I synchronize the second client, there will be a primary key clash. How can I solve this problem?
I am using SQL Server 2008 R2, Sync Framework and C#.
I have already considered using a GUID as the primary key, which is not viable in my case because I am dealing with a legacy database. I have also considered reseeding the IDENTITY value, which is somewhat error prone (see: How can I increment an identity column without inserting a value?). P.S.: the primary keys are IDENTITY columns with an increment of 1.
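For reference, the reseeding I mean is done on SQL Server with DBCC CHECKIDENT; a minimal sketch against the table X above, with an arbitrary example seed value:

    -- Report the current identity value of table X without changing it
    DBCC CHECKIDENT ('dbo.X', NORESEED);

    -- Push the seed forward so the next insert into X gets 10001
    -- (10000 is just an example value)
    DBCC CHECKIDENT ('dbo.X', RESEED, 10000);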
Answer 1:
To support distributed clients capable of inserting records, the schema has to let client DBs create rows without conflicts. This means either using GUIDs for PKs or a composite key (ID + ClientID); either way, it is a schema change.
Otherwise, manually synchronising the client databases means either checking for conflicting IDs before each insert, or handling the exception and then replacing the IDs (or letting new identities be generated for the conflicted records). Either way you then have to update all FK relationships, which is time consuming and error prone.
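As a rough illustration of the "check for conflicting IDs before insert" route, a minimal T-SQL sketch (table and column names are guesses based on the question's example rows; the variables stand in for the client row being uploaded):

    -- Values coming from the client row being uploaded (example values)
    DECLARE @ClientID int = 3, @SomeValue nvarchar(100) = N'SomeValue';

    SET IDENTITY_INSERT dbo.X ON;  -- needed because ID is an IDENTITY column

    IF NOT EXISTS (SELECT 1 FROM dbo.X WHERE ID = @ClientID)
        INSERT INTO dbo.X (ID, SomeValue) VALUES (@ClientID, @SomeValue);
    ELSE
        -- Conflict: the row must be re-keyed and every FK pointing at it updated
        PRINT 'Conflict on ID ' + CAST(@ClientID AS varchar(10));

    SET IDENTITY_INSERT dbo.X OFF;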
Answer 2:
There are a few ways for this to work. The first is if there is a guaranteed unique key by which both the upstream and downstream databases can refer to that record. In the case of SQL Server, the key type you want is a UNIQUEIDENTIFIER (aka GUID).
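A minimal sketch of what a GUID-keyed table could look like (modelled on X from the question; the names are illustrative, not the asker's actual schema):

    -- Every database can generate key values independently with no clashes
    CREATE TABLE dbo.X_Guid
    (
        ID        uniqueidentifier NOT NULL
                  CONSTRAINT DF_X_Guid_ID DEFAULT NEWSEQUENTIALID()  -- sequential GUIDs reduce index fragmentation
                  CONSTRAINT PK_X_Guid PRIMARY KEY,
        SomeValue nvarchar(100)    NOT NULL
    );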
Now, if for some reason you can't use a UNIQUEIDENTIFIER, the second best option is to add an additional column for location or something similar. This column would identify the individual database where the record is coming from. For example, the first client might have a location ID of 1, the second 2, etc. You would then modify your existing primary key to be a compound key of this location id and your current ID.
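A sketch of that compound-key shape, assuming a new LocationID column (constraint names are assumptions; any FKs referencing the old key would also have to be dropped and recreated):

    -- Each client database is assigned a fixed LocationID (1 here as an example);
    -- the pair (LocationID, ID) is then unique across all databases
    ALTER TABLE dbo.X ADD LocationID int NOT NULL
        CONSTRAINT DF_X_LocationID DEFAULT 1;

    ALTER TABLE dbo.X DROP CONSTRAINT PK_X;   -- the existing PK's name is an assumption
    ALTER TABLE dbo.X ADD CONSTRAINT PK_X PRIMARY KEY (LocationID, ID);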
A third option is to not modify the client databases at all, but only add the location id column to tables within the "master" database. If you are manually synchronizing in code, then the code needs to take into account where the record is coming from and add in the id as appropriate. This might be difficult, but is certainly possible.
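A sketch of what the master-side insert could look like under this option, where only the central database carries the location column (variable names are assumptions; the server copy of X is assumed to be keyed on (LocationID, ID) with no IDENTITY):

    -- Performed by the sync code on the central server for each uploaded row
    DECLARE @LocationID  int = 2,                      -- which client database the row came from
            @ClientRowID int = 3,                      -- the row's primary key in that client database
            @SomeValue   nvarchar(100) = N'SomeValue';

    INSERT INTO dbo.X (LocationID, ID, SomeValue)
    VALUES (@LocationID, @ClientRowID, @SomeValue);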
Barring that, you would have to reseed the client databases to start their identity values at a particular point. For example, client 1 might get values 1 to 10000 and client 2 might get 10001 through 20000. However, you can see the potential for failure here: not just if a client exceeds its allowed range, but also if you set up additional clients and someone screws up the seed values. Also, it would be a total PITA to fix all existing clients.
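If you did go that route, the per-client seeds might be carved up along these lines (the range sizes are arbitrary examples; each statement runs in a different client database):

    -- Client 1: identities start at 1
    CREATE TABLE dbo.X (ID int IDENTITY(1, 1) PRIMARY KEY, SomeValue nvarchar(100));

    -- Client 2: identities start at 10001, so its rows never overlap client 1's range
    CREATE TABLE dbo.X (ID int IDENTITY(10001, 1) PRIMARY KEY, SomeValue nvarchar(100));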
As a side note, when building an application that you know is going to have a distributed database that needs merging, it is almost universally wise to simply use GUIDs as the primary key. Yes, it fragments indexes, etc., but in this case those downsides are preferable to the complete PITA and possible errors that could occur when you finally merge all that data together.
Answer 3:
Further to @Chris's third option, if you're using Sync Framework, you can actually "trick" it into adding an extra column for the server-side PK.
See: Part 1 – Upload Synchronization where the Client and Server Primary Keys are different
Source: https://stackoverflow.com/questions/14171987/dealing-with-primary-key-clashs-in-client-database-when-synchronizing-with-the-s