How to remove duplicate rows from flat file using SSIS?


轻奢々 2021-01-12 21:42

Let me first say that being able to take 17 million records from a flat file, push them to a DB on a remote box, and have it take 7 minutes is amazing. SSIS truly is fantasti

9 answers

一向 (original poster)

2021-01-12 22:01

I would suggest using SSIS to copy the records into a temporary table, then adding a task that uses SELECT DISTINCT or RANK, depending on your situation, to pick out the duplicates. Funnel those duplicates to a flat file and delete them from the temporary table. The last step is to copy the remaining records from the temporary table into the destination table.
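The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not an SSIS package: Python's built-in sqlite3 module stands in for the SQL Server database, the tables `staging` and `destination` and the columns `id` and `name` are hypothetical, a duplicate is assumed to mean "same `id` as an earlier row", and `ROW_NUMBER()` (a close relative of RANK) is used to number the rows within each key. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3


def dedupe_via_staging(rows):
    """Load rows into a staging table, split out duplicates, copy the rest.

    Returns (kept_rows, duplicate_rows). Table and column names are
    hypothetical; SQLite stands in for the real target database.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE staging (id INTEGER, name TEXT)")
    con.execute("CREATE TABLE destination (id INTEGER, name TEXT)")

    # Step 1: bulk-load everything into the staging table
    # (in SSIS this would be the flat file source feeding a data flow).
    con.executemany("INSERT INTO staging VALUES (?, ?)", rows)

    # Number the rows within each id in load order; rn > 1 marks a duplicate.
    ranked = """
        SELECT rowid AS rid, id, name,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY rowid) AS rn
        FROM staging
    """

    # Step 2: collect the duplicates (in SSIS they would be written to a
    # flat file) and delete them from the staging table.
    duplicates = con.execute(
        f"SELECT id, name FROM ({ranked}) WHERE rn > 1 ORDER BY rid"
    ).fetchall()
    con.execute(
        f"DELETE FROM staging WHERE rowid IN "
        f"(SELECT rid FROM ({ranked}) WHERE rn > 1)"
    )

    # Step 3: copy what is left into the destination table.
    con.execute("INSERT INTO destination SELECT id, name FROM staging")
    kept = con.execute(
        "SELECT id, name FROM destination ORDER BY id"
    ).fetchall()
    con.close()
    return kept, duplicates


if __name__ == "__main__":
    sample = [(1, "a"), (2, "b"), (1, "a"), (3, "c"), (2, "b2")]
    kept, dups = dedupe_via_staging(sample)
    print("kept:", kept)
    print("duplicates:", dups)
```

Which columns define a duplicate depends on your data; change the `PARTITION BY` list to match. If whole rows are identical and you do not need to keep the rejected copies, a plain SELECT DISTINCT into the destination table may be enough.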

Determining whether a row is a duplicate is something SQL is good at, but a flat file is not well suited to it. In the case you proposed, the script container would load a row and compare it against 17 million records, then load the next row and repeat. The performance might not be all that great, since every one of the 17 million rows would trigger another pass over the data.
