We use strongly typed DataSets in our application. When importing data we use the convenient DataSet.Merge()
operation to copy DataRows from one DataSet to another.
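For reference, a minimal sketch of the pattern in question (the `mainDataSet` name and `LoadImportedData` helper are illustrative, not from the original code):

```csharp
// Hypothetical import: merge rows from a freshly loaded DataSet
// into the application's long-lived strongly typed DataSet.
DataSet importSet = LoadImportedData();   // assumed helper that fills a DataSet
mainDataSet.Merge(importSet);             // matches rows by primary key, copies the rest
mainDataSet.AcceptChanges();              // commit the merged rows
```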
The obvious answer is 'do it in the DB' -- I'll assume it's not applicable in your case.
You could try a row-by-row merge loop instead. This can be quite fast if the tables you are merging are already sorted on the same key.
http://en.wikipedia.org/wiki/Merge_algorithm
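A sketch of such a merge loop over two DataTables, assuming (my assumption, not stated in the question) that both are sorted ascending on the same integer key column, here called "Id":

```csharp
using System.Data;

static DataTable MergeSorted(DataTable target, DataTable source)
{
    // Classic merge-join: advance two cursors over tables
    // that are both sorted by the "Id" column.
    var merged = target.Clone();   // same schema, no rows
    int i = 0, j = 0;
    while (i < target.Rows.Count && j < source.Rows.Count)
    {
        int a = (int)target.Rows[i]["Id"];
        int b = (int)source.Rows[j]["Id"];
        if (a < b)
            merged.ImportRow(target.Rows[i++]);
        else if (a > b)
            merged.ImportRow(source.Rows[j++]);
        else
        {
            merged.ImportRow(source.Rows[j++]);  // on a key match, the source row wins
            i++;
        }
    }
    // Drain whichever table still has rows left.
    while (i < target.Rows.Count) merged.ImportRow(target.Rows[i++]);
    while (j < source.Rows.Count) merged.ImportRow(source.Rows[j++]);
    return merged;
}
```

This does a single linear pass over both tables, versus the per-row key lookups `Merge` performs internally.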
You've probably already tried this, but just in case:
DataSet.Merge has an overload that takes an array of DataRows as a parameter.
Have you tried batching the merge, i.e. doing the following?
dataSet1.Merge(lines.Select(line => ImportRow(line)).ToArray());
However, it's quite possible that you can't improve the Merge itself. Perhaps you can avoid the need to Merge in the first place, e.g. by doing the merge in the DB, as Sklivvz suggests.
Why not just add rows? Or do it in the DB, as Sklivvz suggests?
Can't you just add or update each row depending on whether it already exists in the table (using the untyped method "table.Rows.Find(primaryKeyValues)")?
Note that a DataSet has a lot of scalability problems compared to a DB:
- no transactions => no concurrency control.
- slow loading from XML (a DB load may be faster and scale more linearly).
- no indexes other than the primary key.
- no paging to disk like a DB, which can be a problem on RAM-limited (32-bit) systems.
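A sketch of that add-or-update loop (the `importedRecords` source and the "Name" column are illustrative assumptions):

```csharp
using System.Data;

// Upsert each imported record via Rows.Find, which uses the
// primary-key index -- the only index a DataTable maintains.
static void Upsert(DataTable table, IEnumerable<(int Id, string Name)> importedRecords)
{
    foreach (var record in importedRecords)
    {
        DataRow existing = table.Rows.Find(record.Id);  // PK lookup
        if (existing != null)
            existing["Name"] = record.Name;             // update in place
        else
            table.Rows.Add(record.Id, record.Name);     // insert a new row
    }
}
```

Note that `Rows.Find` requires the table to have a PrimaryKey defined, which a strongly typed DataSet normally does.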
The best merging algorithm I know of is sort-merge, which applies when both input data sets are sorted on the same attribute. But I don't know C# well enough to say whether you can force ADO.NET to use this algorithm.