Fastest Way of Inserting in Entity Framework

Asked by 鱼传尺愫 on 2020-11-21 05:23

I'm looking for the fastest way of inserting into Entity Framework.

I'm asking this because of the scenario where you have an active TransactionScope and the insertion is huge.

30 Answers
  • 2020-11-21 05:54

    SqlBulkCopy is super quick

    This is my implementation:

    // at some point in my calling code, I will call:
    var myDataTable = CreateMyDataTable();
    myDataTable.Rows.Add(Guid.NewGuid(), tableHeaderId, theName, theValue); // call once per row to insert
    
    var efConnectionString = ConfigurationManager.ConnectionStrings["MyWebConfigEfConnection"].ConnectionString;
    var efConnectionStringBuilder = new EntityConnectionStringBuilder(efConnectionString);
    var connectionString = efConnectionStringBuilder.ProviderConnectionString;
    BulkInsert(connectionString, myDataTable);
    
    private DataTable CreateMyDataTable()
    {
        var myDataTable = new DataTable { TableName = "MyTable"};
        // this table has an identity column - don't need to specify that
        myDataTable.Columns.Add("MyTableRecordGuid", typeof(Guid));
        myDataTable.Columns.Add("MyTableHeaderId", typeof(int));
        myDataTable.Columns.Add("ColumnName", typeof(string));
        myDataTable.Columns.Add("ColumnValue", typeof(string));
        return myDataTable;
    }
    
    private void BulkInsert(string connectionString, DataTable dataTable)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            SqlTransaction transaction = null;
            try
            {
                transaction = connection.BeginTransaction();
    
                // TableLock takes a bulk update lock on the target table for the
                // duration of the copy, which is usually the fastest option
                using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction))
                {
                    sqlBulkCopy.DestinationTableName = dataTable.TableName;
                    foreach (DataColumn column in dataTable.Columns) {
                        sqlBulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                    }
    
                    sqlBulkCopy.WriteToServer(dataTable);
                }
                transaction.Commit();
            }
            catch (Exception)
            {
                transaction?.Rollback();
                throw;
            }
        }
    }
    
  • 2020-11-21 05:54

    Use a stored procedure that takes the input data as XML.

    From your C# code, pass the insert data as an XML string.

    For example, in C# the call would look like this:

    object id_application = db.ExecuteScalar("procSaveApplication", xml);
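
    If you aren't using a data-access helper like the db object above, here is a minimal plain ADO.NET sketch of the same idea. The procedure name procSaveApplication, its @xml parameter, and the items collection are all assumed names; adjust them to your schema:

    // requires System.Data, System.Data.SqlClient, System.Linq and System.Xml.Linq
    // build an XML document from the rows to insert ("items" is a placeholder collection)
    var xml = new XElement("applications",
        items.Select(i => new XElement("application",
            new XElement("Name", i.Name),
            new XElement("Value", i.Value)))).ToString();

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("procSaveApplication", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        // SqlDbType.Xml lets the procedure shred the document with @xml.nodes(...)
        cmd.Parameters.Add("@xml", SqlDbType.Xml).Value = xml;
        conn.Open();
        object id_application = cmd.ExecuteScalar();
    }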
    
  • 2020-11-21 05:55

    I know this is a very old question, but one answer here mentioned an extension method for using bulk insert with EF, and when I checked, I discovered that the library costs $599 today (for one developer). Maybe that makes sense for the entire library, but for just the bulk insert it's too much.

    Here is a very simple extension method I made. I use it paired with database-first (I haven't tested it with code-first, but I think it works the same). Change YourEntities to the name of your context:

    public partial class YourEntities : DbContext
    {
        public async Task BulkInsertAllAsync<T>(IEnumerable<T> entities)
        {
            using (var conn = new SqlConnection(Database.Connection.ConnectionString))
            {
                await conn.OpenAsync();
    
                Type t = typeof(T);
    
                var bulkCopy = new SqlBulkCopy(conn)
                {
                    DestinationTableName = GetTableName(t)
                };
    
                var table = new DataTable();
    
                // only scalar properties (value types and strings) map to columns;
                // navigation properties are skipped. ToArray avoids re-running
                // reflection for every row below.
                var properties = t.GetProperties()
                    .Where(p => p.PropertyType.IsValueType || p.PropertyType == typeof(string))
                    .ToArray();
    
                foreach (var property in properties)
                {
                    Type propertyType = property.PropertyType;

                    // DataTable columns can't be typed as Nullable<T>; unwrap to T
                    // and let DBNull represent missing values instead
                    if (propertyType.IsGenericType &&
                        propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                    {
                        propertyType = Nullable.GetUnderlyingType(propertyType);
                    }

                    table.Columns.Add(new DataColumn(property.Name, propertyType));

                    // map by name rather than relying on column ordinals, so property
                    // order doesn't have to match the table's column order (with
                    // database-first, property names match column names by default)
                    bulkCopy.ColumnMappings.Add(property.Name, property.Name);
                }
    
                foreach (var entity in entities)
                {
                    table.Rows.Add(
                        properties.Select(property => property.GetValue(entity, null) ?? DBNull.Value).ToArray());
                }
    
                bulkCopy.BulkCopyTimeout = 0; // 0 disables the timeout; large batches can exceed the 30s default
                await bulkCopy.WriteToServerAsync(table);
            }
        }
    
        public void BulkInsertAll<T>(IEnumerable<T> entities)
        {
            using (var conn = new SqlConnection(Database.Connection.ConnectionString))
            {
                conn.Open();
    
                Type t = typeof(T);
    
                var bulkCopy = new SqlBulkCopy(conn)
                {
                    DestinationTableName = GetTableName(t)
                };
    
                var table = new DataTable();
    
                // same scalar-property filter as the async version above
                var properties = t.GetProperties()
                    .Where(p => p.PropertyType.IsValueType || p.PropertyType == typeof(string))
                    .ToArray();
    
                foreach (var property in properties)
                {
                    // see BulkInsertAllAsync above for the reasoning behind each step
                    Type propertyType = property.PropertyType;
                    if (propertyType.IsGenericType &&
                        propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                    {
                        propertyType = Nullable.GetUnderlyingType(propertyType);
                    }

                    table.Columns.Add(new DataColumn(property.Name, propertyType));
                    bulkCopy.ColumnMappings.Add(property.Name, property.Name);
                }
    
                foreach (var entity in entities)
                {
                    table.Rows.Add(
                        properties.Select(property => property.GetValue(entity, null) ?? DBNull.Value).ToArray());
                }
    
                bulkCopy.BulkCopyTimeout = 0; // 0 disables the timeout
                bulkCopy.WriteToServer(table);
            }
        }
    
        // resolves the database table name an entity type is mapped to by walking
        // the EF metadata workspace from object space down to the storage model
        public string GetTableName(Type type)
        {
            var metadata = ((IObjectContextAdapter)this).ObjectContext.MetadataWorkspace;
            var objectItemCollection = ((ObjectItemCollection)metadata.GetItemCollection(DataSpace.OSpace));
    
            var entityType = metadata
                    .GetItems<EntityType>(DataSpace.OSpace)
                    .Single(e => objectItemCollection.GetClrType(e) == type);
    
            var entitySet = metadata
                .GetItems<EntityContainer>(DataSpace.CSpace)
                .Single()
                .EntitySets
                .Single(s => s.ElementType.Name == entityType.Name);
    
            var mapping = metadata.GetItems<EntityContainerMapping>(DataSpace.CSSpace)
                    .Single()
                    .EntitySetMappings
                    .Single(s => s.EntitySet == entitySet);
    
            var table = mapping
                .EntityTypeMappings.Single()
                .Fragments.Single()
                .StoreEntitySet;
    
            return (string)table.MetadataProperties["Table"].Value ?? table.Name;
        }
    }
    

    You can use it against any collection that implements IEnumerable<T>, like this:

    await context.BulkInsertAllAsync(items);
    
  • 2020-11-21 05:57

    This combination increases speed well enough:

    context.Configuration.AutoDetectChangesEnabled = false;
    context.Configuration.ValidateOnSaveEnabled = false;
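
    For context, a minimal sketch of how these flags are typically used (MyEntity and entities are placeholder names; the flags are restored in finally so the context behaves normally afterwards):

    context.Configuration.AutoDetectChangesEnabled = false;
    context.Configuration.ValidateOnSaveEnabled = false;
    try
    {
        foreach (var entity in entities)
        {
            context.Set<MyEntity>().Add(entity);
        }
        context.SaveChanges();
    }
    finally
    {
        // restore defaults so later operations on this context aren't affected
        context.Configuration.AutoDetectChangesEnabled = true;
        context.Configuration.ValidateOnSaveEnabled = true;
    }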
    
  • 2020-11-21 05:58

    You should look at using System.Data.SqlClient.SqlBulkCopy for this. It is well documented, and there are plenty of tutorials online.

    Sorry, I know you were looking for a simple answer to get EF to do what you want, but bulk operations are not really what ORMs are meant for.

  • 2020-11-21 05:58

    I've investigated Slauma's answer (which is awesome, thanks for the idea), and I reduced the batch size until I hit optimal speed. Looking at Slauma's results:

    • commitCount = 1, recreateContext = true: more than 10 minutes
    • commitCount = 10, recreateContext = true: 241 sec
    • commitCount = 100, recreateContext = true: 164 sec
    • commitCount = 1000, recreateContext = true: 191 sec

    You can see a speed increase when moving from 1 to 10 and from 10 to 100, but from 100 to 1000 insert speed falls again.

    So I've focused on what happens when you reduce the batch size to somewhere between 10 and 100. Here are my results (I'm using different row contents, so my times aren't directly comparable to his):

    Quantity | Batch size | Interval (s)
    ---------+------------+-------------
       1,000 |          1 |    3
      10,000 |          1 |   34
     100,000 |          1 |  368
       1,000 |          5 |    1
      10,000 |          5 |   12
     100,000 |          5 |  133
       1,000 |         10 |    1
      10,000 |         10 |   11
     100,000 |         10 |  101
       1,000 |         20 |    1
      10,000 |         20 |    9
     100,000 |         20 |   92
       1,000 |         27 |    0
      10,000 |         27 |    9
     100,000 |         27 |   92
       1,000 |         30 |    0
      10,000 |         30 |    9
     100,000 |         30 |   92
       1,000 |         35 |    1
      10,000 |         35 |    9
     100,000 |         35 |   94
       1,000 |         50 |    1
      10,000 |         50 |   10
     100,000 |         50 |  106
       1,000 |        100 |    1
      10,000 |        100 |   14
     100,000 |        100 |  141
    

    Based on my results, the actual optimum is around a batch size of 30: it takes less time than both 10 and 100. The problem is, I have no idea why 30 is optimal, nor have I found any logical explanation for it.
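
    For reference, here's a minimal sketch of the batched-commit pattern being tuned above (adapted from Slauma's approach; MyDbContext, MyEntity and entities are placeholder names):

    using (var scope = new TransactionScope())   // requires System.Transactions
    {
        MyDbContext context = null;
        try
        {
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;

            int count = 0;
            foreach (var entity in entities)
            {
                context.Set<MyEntity>().Add(entity);
                if (++count % 30 == 0)           // batch size of 30, per the results above
                {
                    context.SaveChanges();
                    context.Dispose();           // recreate the context so a fresh, empty
                    context = new MyDbContext(); // change tracker keeps Add() fast
                    context.Configuration.AutoDetectChangesEnabled = false;
                }
            }
            context.SaveChanges();               // flush the final partial batch
        }
        finally
        {
            if (context != null)
                context.Dispose();
        }
        scope.Complete();
    }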
