Faster (more scalable) DataSet.Merge?

孤街浪徒 2021-01-25 11:52

We use strongly typed DataSets in our application. When importing data we use the convenient DataSet.Merge() operation to copy DataRows from one DataSet to another.

5 Answers
  • 2021-01-25 12:17

    The obvious answer is 'do it in the DB' -- I'll assume it's not applicable in your case.

    You should try a row-by-row merge loop instead. This can be quite fast if the tables you are merging are sorted on the merge key.

    http://en.wikipedia.org/wiki/Merge_algorithm
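    The linked merge algorithm applied to two `DataTable`s might look like the sketch below. It assumes both tables are sorted ascending on an integer key in column 0 (the table schema and column names here are made up for illustration); it makes a single forward pass, so it runs in O(n + m) instead of doing a keyed lookup per row.

    ```csharp
    using System;
    using System.Data;

    class SortMergeDemo
    {
        // Merge rows from `source` into `target`; both tables must be
        // sorted ascending by the integer key in column 0.
        static void MergeSorted(DataTable target, DataTable source)
        {
            int t = 0;
            foreach (DataRow s in source.Rows)
            {
                int key = (int)s[0];
                // Advance the target cursor past smaller keys.
                while (t < target.Rows.Count && (int)target.Rows[t][0] < key)
                    t++;
                if (t < target.Rows.Count && (int)target.Rows[t][0] == key)
                    target.Rows[t].ItemArray = s.ItemArray;        // key match: update
                else
                    target.Rows.InsertAt(CloneInto(target, s), t); // insert, keeps order
            }
        }

        static DataRow CloneInto(DataTable table, DataRow row)
        {
            DataRow copy = table.NewRow();
            copy.ItemArray = row.ItemArray;
            return copy;
        }

        static DataTable NewTable()
        {
            var t = new DataTable("Items");
            t.Columns.Add("Id", typeof(int));
            t.Columns.Add("Val", typeof(string));
            return t;
        }

        static void Main()
        {
            var target = NewTable();
            target.Rows.Add(1, "a");
            target.Rows.Add(3, "c");

            var source = NewTable();
            source.Rows.Add(2, "b");
            source.Rows.Add(3, "c2");

            MergeSorted(target, source);

            Console.WriteLine(target.Rows.Count); // 3
            Console.WriteLine(target.Rows[1][1]); // b
            Console.WriteLine(target.Rows[2][1]); // c2
        }
    }
    ```

    Note this bypasses `DataSet.Merge` entirely, so you lose its schema reconciliation and `RowState` handling; it is only worthwhile when the sorted-input precondition actually holds.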

  • 2021-01-25 12:31

    You've probably already tried this, but just in case:

    DataSet.Merge has an overload that takes an array of DataRows as a parameter.

    Have you tried batching the merge, i.e. doing the following?

    dataSet1.Merge(lines.Select(line => ImportRow(line)).ToArray());
    

    However, it may well be that you can't make Merge itself faster - perhaps you can avoid the need to Merge in the first place, for example by doing the merge in the DB, as Sklivvz suggests.
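    A self-contained sketch of the batching idea (the `ImportRow` helper above is hypothetical, so here a staging `DataTable` plays its role): build all the rows first, then hand them to `DataSet.Merge` in one call instead of merging row by row.

    ```csharp
    using System;
    using System.Data;
    using System.Linq;

    class BatchMergeDemo
    {
        static void Main()
        {
            // Assumed schema for illustration: an "Items" table keyed on Id.
            DataTable MakeTable()
            {
                var t = new DataTable("Items");
                t.Columns.Add("Id", typeof(int));
                t.Columns.Add("Name", typeof(string));
                t.PrimaryKey = new[] { t.Columns["Id"] };
                return t;
            }

            var dataSet1 = new DataSet();
            dataSet1.Tables.Add(MakeTable());

            // Staging table standing in for the rows produced during import.
            var staging = MakeTable();
            foreach (var i in Enumerable.Range(1, 1000))
                staging.Rows.Add(i, "row " + i);

            // One Merge call over the whole batch instead of 1000 calls.
            dataSet1.Merge(staging.Rows.Cast<DataRow>().ToArray());

            Console.WriteLine(dataSet1.Tables["Items"].Rows.Count); // 1000
        }
    }
    ```

    Batching saves the per-call overhead of Merge, but each row is still matched against the primary key, so the asymptotic cost is unchanged.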

  • 2021-01-25 12:42

    Why not just add the rows directly? Or do it in the DB, as 'Sklivvz' suggests?

  • 2021-01-25 12:42

    Can't you just add or update each row depending on whether it already exists in the table (using the untyped method table.Rows.Find(primaryKeyValues))?

    Note that you can run into a lot of scalability problems with a DataSet (compared to a DB):
    - no transactions => no concurrency.
    - slow loading from XML (it may be faster, or at least linear, from a DB).
    - no indexes (except the primary key).
    - no caching/paging to disk like a DB, which can be a problem on RAM-limited (e.g. 32-bit) systems.
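    The add-or-update approach described above might look like the sketch below. The table schema and incoming data are made up for illustration; the key point is that `Rows.Find` uses the primary-key index, so each lookup stays fast even on large tables.

    ```csharp
    using System;
    using System.Data;

    class UpsertDemo
    {
        static void Main()
        {
            var target = new DataTable("Items");
            target.Columns.Add("Id", typeof(int));
            target.Columns.Add("Name", typeof(string));
            // Rows.Find requires a primary key on the table.
            target.PrimaryKey = new[] { target.Columns["Id"] };
            target.Rows.Add(1, "old");

            // Hypothetical incoming (Id, Name) pairs from the import.
            var incoming = new[] { Tuple.Create(1, "updated"), Tuple.Create(2, "new") };

            foreach (var item in incoming)
            {
                // Indexed lookup by primary key value.
                DataRow existing = target.Rows.Find(item.Item1);
                if (existing != null)
                    existing["Name"] = item.Item2;            // update in place
                else
                    target.Rows.Add(item.Item1, item.Item2);  // insert
            }

            Console.WriteLine(target.Rows.Count);             // 2
            Console.WriteLine(target.Rows.Find(1)["Name"]);   // updated
        }
    }
    ```

    Unlike DataSet.Merge, this gives you explicit control over what "update" means per column, at the cost of writing the loop yourself.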

  • 2021-01-25 12:42

    The best merging algorithm I know of is sort-merge, provided your input datasets are sorted on the same attribute. But I don't know C# well enough to say whether ADO.NET objects can be forced to use this algorithm.
