Speed up LINQ inserts


I have a CSV file and I have to insert it into a SQL Server database. Is there a way to speed up the LINQ inserts?

I've created a simple Repository method to save a rec


11 answers

[愿得一人]

2021-02-01 08:07
This code works well and keeps large numbers of pending inserts from piling up in the change set:
```
// Flush whenever more than 1000 inserts are pending, so the change set stays small.
if (repository2.GeoItems.GetChangeSet().Inserts.Count > 1000)
{
    repository2.GeoItems.SubmitChanges();
}
```
Then, at the end of the bulk insertion, submit whatever is still pending:
```
repository2.GeoItems.SubmitChanges();
```

北荒

2021-02-01 08:08

I wonder if you're suffering from an overly large set of data accumulating in the data-context, making it slow to resolve rows against the internal identity cache (which is checked once during the SingleOrDefault, and for "misses" I would expect to see a second hit when the entity is materialized).

I can't recall 100% whether the short-circuit works for SingleOrDefault (although it will in .NET 4.0).

I would try ditching the data-context (submit-changes and replace with an empty one) every n operations for some n - maybe 250 or something.

Given that you're calling SubmitChanges per instance at the moment, you may also be wasting a lot of time checking the delta, which is pointless if you've only changed one row. Only call SubmitChanges in batches, not per record.
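A minimal sketch of that idea, assuming LINQ to SQL (`System.Data.Linq`) and a caller-supplied factory for the data context. The names `BatchedInsert`, `InBatches`, `InsertAll` and `createContext` are made up for illustration, and 250 is just the figure suggested above, not a tuned value.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Linq;

public static class BatchedInsert
{
    // Splits items into consecutive lists of at most batchSize elements.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> items, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (var item in items)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0) yield return batch; // trailing partial batch
    }

    // Inserts every batch through its own short-lived DataContext, so the
    // change tracker and identity cache never hold more than one batch.
    public static void InsertAll<T>(
        Func<DataContext> createContext, // hypothetical factory supplied by the caller
        IEnumerable<T> items,
        int batchSize = 250) where T : class
    {
        foreach (var batch in InBatches(items, batchSize))
        {
            using (var context = createContext())
            {
                context.GetTable<T>().InsertAllOnSubmit(batch);
                context.SubmitChanges(); // one submit per batch, not per record
            }
        }
    }
}
```

It could be called as `BatchedInsert.InsertAll(() => new MyDataContext(connectionString), offers)`, where `MyDataContext` and `offers` stand in for your own context type and parsed rows. The sketch leaves out any existence check, so it only covers the insert side.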

没有蜡笔的小新

2021-02-01 08:08

Alex gave the best answer, but I think a few things are being overlooked.

One of the major bottlenecks you have here is calling SubmitChanges for each item individually. A problem I don't think most people know about is that if you haven't manually opened your DataContext's connection yourself, then the DataContext will repeatedly open and close it itself. However, if you open it yourself, and then close it yourself when you're absolutely finished, things will run a lot faster since it won't have to reconnect to the database every time. I found this out when trying to find out why DataContext.ExecuteCommand() was so unbelievably slow when executing multiple commands at once.

A few other areas where you could speed things up:

While LINQ to SQL doesn't support straight-up batch processing, you should wait to call SubmitChanges() until you've analyzed everything. You don't need to call SubmitChanges() after each InsertOnSubmit call.

If live data integrity isn't super crucial, you could retrieve the list of existing offer_id values from the server before you start checking whether an offer already exists. This could significantly reduce the number of times you call the server to look for an existing item that isn't even there.
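As a rough sketch of the prefetch idea: load the existing keys once into a `HashSet<int>` and filter the incoming rows in memory. This assumes offer_id is an `int`; `OfferFilter` and `NewIdsOnly` are hypothetical names, not part of any library.

```csharp
using System.Collections.Generic;

public static class OfferFilter
{
    // Returns the incoming ids that are not already known, in input order.
    public static List<int> NewIdsOnly(IEnumerable<int> existingIds, IEnumerable<int> incomingIds)
    {
        // Ids fetched from the server once, then checked in memory.
        var known = new HashSet<int>(existingIds);
        var fresh = new List<int>();

        foreach (var id in incomingIds)
        {
            // HashSet<T>.Add returns false when the id is already present,
            // so duplicates inside the CSV itself are skipped as well.
            if (known.Add(id))
            {
                fresh.Add(id);
            }
        }
        return fresh;
    }
}
```

Here `existingIds` could come from a single query such as `context.Offers.Select(o => o.offer_id)`, assuming a mapped `Offers` table. The trade-off is the one mentioned above: rows inserted by someone else after the prefetch won't be seen.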

北海茫月

2021-02-01 08:08

Do you really need to check whether the record exists before inserting it into the DB? It looked strange to me, since the data comes from a CSV file.

> P.S. I've tried to use SqlBulkCopy, but I need to do some transformations on Offer before inserting it into the db, and I think that defeats the purpose of SqlBulkCopy.

I don't think it defeats the purpose at all. Why would it? Just fill a simple dataset with all the (transformed) data from the CSV and do a SqlBulkCopy. I did a similar thing with a collection of 30000+ rows, and the import time went from minutes to seconds.
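Something along these lines, as a sketch only: transform the rows in memory first, put them in a `DataTable`, then hand that to `SqlBulkCopy`. The `OfferRow` class, the `title` column and the `dbo.Offers` table name are assumptions for illustration; swap in your real schema.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical shape of one already-transformed CSV row.
public class OfferRow
{
    public int OfferId { get; set; }
    public string Title { get; set; }
}

public static class OfferBulkLoader
{
    // Builds an in-memory table; column names must match the destination table.
    public static DataTable ToDataTable(IEnumerable<OfferRow> rows)
    {
        var table = new DataTable();
        table.Columns.Add("offer_id", typeof(int));
        table.Columns.Add("title", typeof(string));

        foreach (var row in rows)
        {
            table.Rows.Add(row.OfferId, row.Title);
        }
        return table;
    }

    // Streams the whole table to SQL Server in one bulk operation.
    public static void BulkInsert(string connectionString, DataTable table)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.Offers"; // assumed table name
            bulk.BatchSize = 5000;                    // arbitrary, tune as needed

            // Map by name so column order in the DataTable doesn't matter.
            foreach (DataColumn column in table.Columns)
            {
                bulk.ColumnMappings.Add(column.ColumnName, column.ColumnName);
            }
            bulk.WriteToServer(table);
        }
    }
}
```

Any transformation on Offer happens before `ToDataTable` is called, so it doesn't get in the way of the bulk copy itself.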

后悔当初

2021-02-01 08:14

Well, you must understand that LINQ generates code dynamically for all the ADO operations you do, instead of using handwritten code, so it will generally take more time than your manual code. It's simply an easy way to write code, but if you want to talk about performance, well-written ADO.NET code will usually be faster.

I don't know whether LINQ tries to reuse its last statement or not. If it does, then separating the insert batch from the update batch may improve performance a little bit.
