How to merge two databases in SQL Server?

Posted by 谁说我不能喝 on 2019-11-29 14:08:03

Question


Both databases have the same schema, but primary keys may conflict in some tables. I want the merge to simply ignore the duplicate rows and continue.


Answer 1:


First, a conflict of keys indicates that whatever process you are currently using is a poor one.

To correctly merge two databases that use autogenerated (non-GUID) keys, you need to take several steps. First, add a new autogenerated key to the parent table, then import all the data from both tables, rename the old ID field to ID_old, and rename the new field to the old ID name. At this point you can move on to the child tables. Copy the rows into each child table by joining to the parent table and taking the new ID field as the foreign key value instead of the one in the existing table. Repeat this process for every foreign key table; if that table is also a parent table, add the conversion ID field to it before copying any data, so that you can work all the way down the chain. Doing this properly requires a great deal of knowledge of the database structure and a lot of planning. Do not consider doing this without a good backup of both source databases. It is also best if the process can run while both databases are in single-user mode.
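The key-remapping steps above can be sketched as follows. This is a minimal illustration, not the answerer's actual script: it uses Python's built-in `sqlite3` as a stand-in for SQL Server, and the `customer`/`orders` tables, the `orders_stage` staging table, and the `source` column (used to tell apart identical old IDs from the two databases) are all assumptions made for the example.

```python
import sqlite3

def merge_with_new_keys(sources):
    """Merge parent/child rows from several sources under fresh keys.

    sources: list of (source_name, customers, orders), where
    customers = [(old_id, name)] and orders = [(old_id, old_customer_id, item)].
    All table and column names here are hypothetical.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE customer (
            id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- new autogenerated key
            id_old INTEGER,                            -- key from the source database
            source TEXT,                               -- which database the row came from
            name   TEXT
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            id_old      INTEGER,
            source      TEXT,
            customer_id INTEGER REFERENCES customer(id),
            item        TEXT
        );
    """)
    for source, customers, orders in sources:
        # Parent table first: every row receives a new key, the old one is kept.
        db.executemany(
            "INSERT INTO customer (id_old, source, name) VALUES (?, ?, ?)",
            [(old_id, source, name) for old_id, name in customers],
        )
        # Stage the child rows, then copy them while joining to the parent
        # so the foreign key takes the NEW parent id instead of the old one.
        db.execute(
            "CREATE TEMP TABLE orders_stage "
            "(id_old INTEGER, customer_id_old INTEGER, item TEXT)"
        )
        db.executemany("INSERT INTO orders_stage VALUES (?, ?, ?)", orders)
        db.execute(
            """
            INSERT INTO orders (id_old, source, customer_id, item)
            SELECT s.id_old, ?, c.id, s.item
            FROM orders_stage AS s
            JOIN customer AS c
              ON c.id_old = s.customer_id_old AND c.source = ?
            ORDER BY s.id_old
            """,
            (source, source),
        )
        db.execute("DROP TABLE orders_stage")
    db.commit()
    return db

# Both sources use id 1 for different customers; the merge keeps them apart.
merged = merge_with_new_keys([
    ("A", [(1, "Ann")], [(1, 1, "book")]),
    ("B", [(1, "Bob")], [(1, 1, "pen")]),
])
pairs = merged.execute(
    "SELECT c.name, o.item FROM orders AS o "
    "JOIN customer AS c ON c.id = o.customer_id ORDER BY o.id"
).fetchall()
```

Keeping `id_old` (and here `source`) on every parent table is what lets the same join be repeated further down the chain when a child table is itself a parent.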

If you use natural keys and have duplicates, you have a far different problem. All duplicate-key records should be moved to a separate table first, and a determination should be made as to which data is more correct. In some cases you will find that the natural key is in fact not unique (they rarely are, which is why I almost never use them) and the merged database will need to work with an autogenerated key of some type. This will involve code changes as well as database changes, so it is the option of last resort.

What you often find with natural keys is that the data for each one is different but similar ("St." versus "Street" in the address). In this case, mark one of the records for insertion and then do the insert in two steps: first the records that have no duplicates, then the records in the duplicates table that are marked for insertion. Remember that you will have to examine all records in all foreign key tables to determine which to keep. Just throwing out any duplicates is a bad idea: you will lose data that way, possibly critical data (such as a customer's orders). This is a long, tedious process that requires someone with expertise in the data to make the determinations. As a programmer, you should provide them with a dedup tool that lets them examine all the data for each set of duplicates, choose what to keep and what to discard, and then, once everything is marked, run a process to insert the records. Remember in your design that for true duplicates, some child tables (such as orders) need the records from both sources attached to the record chosen for insertion, while for other tables (address, for instance) you will want to choose which record is correct. So you can see this is a complex process requiring a thorough understanding of the database.
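The two-step insert described above might look roughly like this. It is only a sketch under stated assumptions: Python's `sqlite3` stands in for SQL Server, `email` is a hypothetical natural key, the `incoming`, `duplicates`, and `merged` tables are invented for the example, and the reviewer's decisions arrive as a plain dictionary rather than through a real dedup tool. It also assumes each source is free of duplicates on its own.

```python
import sqlite3

def merge_by_natural_key(rows_a, rows_b, keep):
    """Merge (email, address) rows from sources A and B.

    keep maps a duplicated email to the source ('A' or 'B') whose row a
    reviewer marked for insertion. All names here are hypothetical.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE incoming   (source TEXT, email TEXT, address TEXT);
        CREATE TABLE duplicates (source TEXT, email TEXT, address TEXT,
                                 marked INTEGER DEFAULT 0);
        CREATE TABLE merged     (email TEXT PRIMARY KEY, address TEXT);
    """)
    db.executemany("INSERT INTO incoming VALUES ('A', ?, ?)", rows_a)
    db.executemany("INSERT INTO incoming VALUES ('B', ?, ?)", rows_b)

    # Move every row whose key occurs more than once into the duplicates table.
    db.execute("""
        INSERT INTO duplicates (source, email, address)
        SELECT source, email, address FROM incoming
        WHERE email IN (SELECT email FROM incoming
                        GROUP BY email HAVING COUNT(*) > 1)
    """)
    db.execute(
        "DELETE FROM incoming WHERE email IN (SELECT email FROM duplicates)"
    )

    # Step 1: insert the rows that have no duplicates.
    db.execute("INSERT INTO merged SELECT email, address FROM incoming")

    # The reviewer's choices are recorded by marking rows for insertion.
    db.executemany(
        "UPDATE duplicates SET marked = 1 WHERE email = ? AND source = ?",
        list(keep.items()),
    )

    # Step 2: insert only the duplicates that were marked.
    db.execute(
        "INSERT INTO merged SELECT email, address FROM duplicates WHERE marked = 1"
    )
    db.commit()
    return db

db = merge_by_natural_key(
    [("ann@x.org", "1 Main St."), ("bob@x.org", "2 Oak Ave")],
    [("ann@x.org", "1 Main Street"), ("cy@x.org", "3 Elm Rd")],
    keep={"ann@x.org": "B"},
)
result = db.execute("SELECT email, address FROM merged ORDER BY email").fetchall()
```

Unmarked rows stay in `duplicates`, so nothing is thrown away before someone who knows the data has looked at it.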

If you have a lot of duplicates, they may be cleaning up and adding the data for several months, so a tool is really critical. The people doing this will likely be system users, not database specialists or programmers, as they are the only people who can truly make the judgment most of the time as to which record to keep. You will likely need to do something similar in any event, as there may be records that are duplicates even when you have an autogenerated key. They are just more difficult to find.

There is no easy way to merge two databases (even using GUIDs, you have the problem of duplicates in the natural key).




Answer 2:


I know this is an old topic, but I have to comment on the general approach I see in many posts, which is trying to do everything natively using SQL queries. What such solutions have in common is the fairly large amount of time that needs to be spent on creating and testing a query before applying it.

So yes, you can merge two databases natively using relatively complex queries, but you can save yourself a ton of time by using third-party tools at no cost (most or all have a fully functional free trial).

There are a ton of these on the market. Red Gate, already mentioned in another post, is one of the best, but you can also try ApexSQL Data Diff, dbForge, SQL Comparison toolset, and many others.




Answer 3:


Your best bet would probably be a third-party application such as RedGate SQL Data Compare. It costs some money, but it's worth it over writing that script, IMO.




Answer 4:


Here is how I did this twice in recent years: http://byalexblog.net/merge-sql-databases




Answer 5:


If your primary keys are IDENTITY columns, here is my suggestion (it shouldn't require modifying the schema).

  1. Set up all foreign keys with ON UPDATE CASCADE
  2. Update the primary key / IDENTITY field in the parent table by adding the max value of that field in the corresponding table you are merging into (the FKs will then cascade the new values to the child tables)
  3. Do the same for the PK / IDENTITY fields in the child tables
  4. Follow the suggestion from this forum answer and use SET IDENTITY_INSERT ON / OFF on either side of inserting into each of the tables, starting with the parent table and then moving on to the child tables
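
The offset idea in the steps above can be sketched like this. It is a hedged illustration in Python's `sqlite3`, not the answerer's procedure: the `customer`/`orders` target tables and `src_customer`/`src_orders` source tables are hypothetical. As far as I know, SQL Server does not allow an UPDATE to change an IDENTITY column directly, so this sketch applies the offset while copying the rows rather than through a cascading update. SQLite accepts explicit key values without any switch; in SQL Server the inserts would need the SET IDENTITY_INSERT toggle mentioned in step 4.

```python
import sqlite3

def merge_with_offset(db):
    """Copy src_* rows into the target tables, shifting every id past the
    target's current maximum so that keys cannot collide.
    Table names are hypothetical."""
    cust_offset = db.execute(
        "SELECT COALESCE(MAX(id), 0) FROM customer").fetchone()[0]
    order_offset = db.execute(
        "SELECT COALESCE(MAX(id), 0) FROM orders").fetchone()[0]

    # Parent table first: shift its primary key by the parent offset.
    db.execute(
        "INSERT INTO customer (id, name) "
        "SELECT id + ?, name FROM src_customer",
        (cust_offset,),
    )
    # Child table: shift its own key by the child offset and its
    # foreign key by the PARENT offset so it still points at the same row.
    db.execute(
        "INSERT INTO orders (id, customer_id, item) "
        "SELECT id + ?, customer_id + ?, item FROM src_orders",
        (order_offset, cust_offset),
    )
    db.commit()
    return cust_offset, order_offset

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customer(id),
                         item TEXT);
    CREATE TABLE src_customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE src_orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER, item TEXT);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 2, 'book');
    INSERT INTO src_customer VALUES (1, 'Cy'), (2, 'Dee');
    INSERT INTO src_orders VALUES (1, 1, 'pen'), (2, 2, 'ink');
""")
offsets = merge_with_offset(db)
merged_orders = db.execute(
    "SELECT id, customer_id, item FROM orders ORDER BY id").fetchall()
```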



Answer 6:


You could just add an additional field (called DatabaseID, for example) to all the tables in your merged database and add it to the primary keys. This way you can keep the original keys while having unique keys in the merged database, and you can tell which database each row came from. This is what SQL-Hub does; if it's just a one-off job, you can do this with the free trial.
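
A minimal sketch of the composite-key idea, again using Python's `sqlite3` as a stand-in for SQL Server. The `customer`/`orders` tables and the `load` helper are hypothetical, and this is not a description of how SQL-Hub itself works. One consequence worth noting: once `database_id` is part of a primary key, foreign keys that reference that table have to carry it too.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (
        database_id INTEGER NOT NULL,   -- which source database the row came from
        id          INTEGER NOT NULL,   -- original key, kept unchanged
        name        TEXT,
        PRIMARY KEY (database_id, id)
    );
    CREATE TABLE orders (
        database_id INTEGER NOT NULL,
        id          INTEGER NOT NULL,
        customer_id INTEGER NOT NULL,
        item        TEXT,
        PRIMARY KEY (database_id, id),
        -- The foreign key includes database_id so it stays unambiguous.
        FOREIGN KEY (database_id, customer_id)
            REFERENCES customer (database_id, id)
    );
""")

def load(db, database_id, customers, orders):
    """Insert one source database's rows, tagging each with its database_id."""
    db.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(database_id, cid, name) for cid, name in customers],
    )
    db.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [(database_id, oid, cid, item) for oid, cid, item in orders],
    )
    db.commit()

# Both sources reuse id 1, which no longer conflicts in the merged tables.
load(db, 1, [(1, "Ann")], [(1, 1, "book")])
load(db, 2, [(1, "Bob")], [(1, 1, "pen")])

joined = db.execute("""
    SELECT o.database_id, c.name, o.item
    FROM orders AS o
    JOIN customer AS c
      ON c.database_id = o.database_id AND c.id = o.customer_id
    ORDER BY o.database_id
""").fetchall()
```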



Source: https://stackoverflow.com/questions/909541/how-to-merge-two-databases-in-sql-server
