We are having some trouble designing our multi-threaded Entity Framework driven application and would like some guidance. We are creating entities on different threads, the
I'm using Blazor Server Side with DbContext.
I've actually done your second way, and it's been working alright. Be careful of untracked entities.
A DbContext that is scope lived which is an interface that implements IReadOnlyApplicationDbContext that only provides the IQueryable for the DbSets (no SaveChanges or anything like that). So all read operations are safely executed without having the problem of an update messing around with the data.
Also, all queries use "AsNoTracking" to prevent saving tracks of the last query in cache.
After that, all write/update/delete updates are made by unique DbContexts.
It became something like that:
public interface IReadOnlyApplicationDbContext
{
DbSet<Product> Products { get; }
}
public interface IApplicationDbContext : IReadOnlyDbContext
{
DbSet<Product> Products { set; }
}
public class ApplicationDbContext : DbContext, IApplicationDbContext
{
DbSet<Product> Products { get; set; }
}
public abstract class ProductRepository
{
private readonly IReadOnlyApplicationDbContext _readOnlyApplicationDbContext;
private readonly IFactory<IApplicationDbContext> _applicationDbContextFactory;
protected Repository(
IReadOnlyApplicationDbContext readOnlyApplicationDbContext,
IFactory<IApplicationDbContext> applicationDbContextFactory
)
{
_readOnlyApplicationDbContext = readOnlyApplicationDbContext;
_applicationDbContextFactory = _applicationDbContextFactory;
}
private IQueryable<Product> ReadOnlyQuery() => _readOnlyApplicationDbContext.AsNoTracking();
public Task<IEnumerable<Products>> Get()
{
return ReadOnlyQuery().Where(s=>s.SKU == "... some data ...");
}
public Task Update(Product product)
{
using (var db = _applicationDbContextFactory.Create())
{
db.Entity(product).State = EntityState.Modified;
return db.SaveChangesAsync();
}
}
public Task Add(Product product)
{
using (var db = _applicationDbContextFactory.Create())
{
db.Products.AddAsync(product);
return db.SaveChangesAsync();
}
}
}
It has been sometime since the question asked but I've recently bumped into a similiar type of problem and ended up doing the below which helped us meeting the performance criteria.
You basically split your list into chunks and process them within sperate threads in a multi-threaded manner. Each new thread also initiates their own uow which requires your entities to be attached.
One thing to be aware of is your database needs to be enabled for snapshot isolation; otherwise you might end up with deadlocks. You would need to decide if this is ok for the operation you are doing and related business process. In our case it was a simple update on the product entity.
You would probably need to do some tests to decide on the best chunk size and also limit the parallelism so there is always resource to complete the operation.
private void PersistProductChangesInParallel(List<Product> products,
Action<Product, string> productOperationFunc,
string updatedBy)
{
var productsInChunks = products.ChunkBy(20);
Parallel.ForEach(
productsInChunks,
new ParallelOptions { MaxDegreeOfParallelism = 20 },
productsChunk =>
{
try
{
using (var transactionScope = new TransactionScope(
TransactionScopeOption.Required,
new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot }))
{
var dbContext = dbContextFactory.CreatedbContext();
foreach (var Product in productsChunk)
{
dbContext.products.Attach(Product);
productOperationFunc(Product, updatedBy);
}
dbContext.SaveChanges();
transactionScope.Complete();
}
}
catch (Exception e)
{
Log.Error(e);
throw new ApplicationException("Some products might not be updated", e);
}
});
}
I have seem a dozen of stackoverflow threads concerning EF and multithreading. All of them have answers that explains the problem in depth but not really show you how to fix it.
EF is not thread safe, we all know that by now. But in my experience the only risk is the context creations/manipulations. There is actually a very simple fix for this where you can keep your lazy loading.
Lets say you have a WPF app and a MVC website. where the WPF app uses multihreading. You just dispose the db context in multithreading and keep it when not. Example a MVC website, the context will automaticly disposes after the view has been presented.
In the WPF application layer you use this:
ProductBLL productBLL = new ProductBLL(true);
In the MVC application layer you use this:
ProductBLL productBLL = new ProductBLL();
How your product business logic layer should look like:
public class ProductBLL : IProductBLL
{
private ProductDAO productDAO; //Your DB layer
public ProductBLL(): this(false)
{
}
public ProductBLL(bool multiThreaded)
{
productDAO = new ProductDAO(multiThreaded);
}
public IEnumerable<Product> GetAll()
{
return productDAO.GetAll();
}
public Product GetById(int id)
{
return productDAO.GetById(id);
}
public Product Create(Product entity)
{
return productDAO.Create(entity);
}
//etc...
}
How your database logic layer should look like:
public class ProductDAO : IProductDAO
{
private YOURDBCONTEXT db = new YOURDBCONTEXT ();
private bool _MultiThreaded = false;
public ProductDAO(bool multiThreaded)
{
_MultiThreaded = multiThreaded;
}
public IEnumerable<Product> GetAll()
{
if (_MultiThreaded)
{
using (YOURDBCONTEXT db = new YOURDBCONTEXT ())
{
return db.Product.ToList(); //USE .Include() For extra stuff
}
}
else
{
return db.Product.ToList();
}
}
public Product GetById(int id)
{
if (_MultiThreaded)
{
using (YOURDBCONTEXT db = new YOURDBCONTEXT ())
{
return db.Product.SingleOrDefault(x => x.ID == id); //USE .Include() For extra stuff
}
}
else
{
return db.Product.SingleOrDefault(x => x.ID == id);
}
}
public Product Create(Product entity)
{
if (_MultiThreaded)
{
using (YOURDBCONTEXT db = new YOURDBCONTEXT ())
{
db.Product.Add(entity);
db.SaveChanges();
return entity;
}
}
else
{
db.Product.Add(entity);
db.SaveChanges();
return entity;
}
}
//etc...
}
I've just had a project where trying to use EF with multi threading caused errors.
I tried
using (var context = new entFLP(entity_connection))
{
context.Product.Add(entity);
context.SaveChanges();
return entity;
}
but it just changed the type of the error from datareader error to multiple thread error.
The simple solution was to use stored procedures with EF function imports
using (var context = new entFLP(entity_connection))
{
context.fi_ProductAdd(params etc);
}
The key is to go to the source of the data and avoid the data model.
You do -not- want a long lived context. Ideally they should be for the life of a request/data operation.
When dealing with similar problems, I wound up implementing a repository that cached PK Entities for a given type, and allowed a 'LoadFromDetached' which would Find the entity in the database, and 'copy' all the scalar properties except PK over to the newly attached entity.
The performance will take a bit of a hit, but it provides a bullet proof way of making sure navigation properties don't get mangled by 'forgetting' about them.
First off, I'm assuming you have read the article "Multithreading and the Entity Framework" on MSDN.
Solution #1 is almost certainly the safest from a threading perspective, since you are guaranteeing that only one thread is interacting with the context at any give time. There is nothing inherently wrong with keeping the context around- it isn't keeping database connections open behind the scenes, so it's just memory overhead. This could, of course, present a performance problem if you end up bottlenecked on that thread and the whole application is written with single-db-thread assumptions.
Solution #2 seems unworkable to me- you'd end up with subtle bugs throughout the application where people forget to re-attach (or detach) entities.
One solution is to not use your entity objects in the UI layer of the application. I'd recommend this anyway- chances are, the structure/layout of the entity objects is not optimal for how you want to display things on your user interface (this is the reason for the family of MVC patterns). Your DAL should have methods which are business logic specific (UpdateCustomer
for instance), and it should decide internally whether to create a new Context or use a stored one. You can start with the single stored context approach, and then if you run into bottleneck issues you have a limited surface area where you need to make changes.
The downside is that you need to write a lot more code- you'd have your EF entities, but you'd also have business entities that have duplicate properties and potentially differing cardinality of many of the EF entities. To mitigate this, you can use frameworks like AutoMapper to simplify copying properties from the EF entities to the business entities and back again.