SqlBulkCopy slow as molasses

坚强是说给别人听的谎言 提交于 2019-12-21 17:06:34

问题


I'm looking for the fastest way to load bulk data via c#. I have this script that does the job but slow. I read testimonies that SqlBulkCopy is the fastest.
1000 records 2.5 seconds. files contain anywhere near 5000 records to 250k What are some of the things that can slow it down?

Table Def:

CREATE TABLE [dbo].[tempDispositions](
    [QuotaGroup] [varchar](100) NULL,
    [Country] [varchar](50) NULL,
    [ServiceGroup] [varchar](50) NULL,
    [Language] [varchar](50) NULL,
    [ContactChannel] [varchar](10) NULL,
    [TrackingID] [varchar](20) NULL,
    [CaseClosedDate] [varchar](25) NULL,
    [MSFTRep] [varchar](50) NULL,
    [CustEmail] [varchar](100) NULL,
    [CustPhone] [varchar](100) NULL,
    [CustomerName] [nvarchar](100) NULL,
    [ProductFamily] [varchar](35) NULL,
    [ProductSubType] [varchar](255) NULL,
    [CandidateReceivedDate] [varchar](25) NULL,
    [SurveyMode] [varchar](1) NULL,
    [SurveyWaveStartDate] [varchar](25) NULL,
    [SurveyInvitationDate] [varchar](25) NULL,
    [SurveyReminderDate] [varchar](25) NULL,
    [SurveyCompleteDate] [varchar](25) NULL,
    [OptOutDate] [varchar](25) NULL,
    [SurveyWaveEndDate] [varchar](25) NULL,
    [DispositionCode] [varchar](5) NULL,
    [SurveyName] [varchar](20) NULL,
    [SurveyVendor] [varchar](20) NULL,
    [BusinessUnitName] [varchar](25) NULL,
    [UploadId] [int] NULL,
    [LineNumber] [int] NULL,
    [BusinessUnitSubgroup] [varchar](25) NULL,
    [FileDate] [datetime] NULL
) ON [PRIMARY]

and here's the code

    private void BulkLoadContent(DataTable dt)
    {
        OnMessage("Bulk loading records to temp table");
        OnSubMessage("Bulk Load Started");
        using (SqlBulkCopy bcp = new SqlBulkCopy(conn))
        {
            bcp.DestinationTableName = "dbo.tempDispositions";
            bcp.BulkCopyTimeout = 0;
            foreach (DataColumn dc in dt.Columns)
            {
                bcp.ColumnMappings.Add(dc.ColumnName, dc.ColumnName);
            }
            bcp.NotifyAfter = 2000;
            bcp.SqlRowsCopied += new SqlRowsCopiedEventHandler(bcp_SqlRowsCopied);
            bcp.WriteToServer(dt);
            bcp.Close();
        }
    }

回答1:


Do you have any indexes, triggers or constraints on that table?

That will cause slowdowns on insert - especially a clustered index would hurt. When blasting the amounts of data you're doing, it's best to drop indexes first, and re-apply them afterwards.

A good post about it is here: What's the fastest way to bulk insert a lot of data in SQL Server (C# client)




回答2:


If you have lots of data, setting the batchsize to a reasonably large number might help:

bcp.BatchSize = 10000;



回答3:


Things that can slow down the bulk copy : -Full text indexes on the table -Triggers on Insert -Foreign-Key constraints




回答4:


I've noticed that trying to flush large datasets is initially much faster, but slows down substantially over time. I've found a modest increase in performance using a buffered approach, feeding bulkcopy just a few thousand records at a time under the same connection. It seems to keep the per-batch transaction time down over time, which (over time), improves performance. On my solution, I've noted that the same method un-buffered will save about 5,000,000 records in the time it takes this method to save about 7,500,000 records of the same type to the same DB. Hope this helps someone.

public void flush_DataTable(DataTable dt, string tableName)//my incoming DTs have a million or so each and slow down over time to nothing. This helps.
    {  int bufferSize = 10000;
        int bufferHigh = bufferSize;
        int lowBuffer = 0;
        if (dt.Rows.Count >= bufferSize)
        {  using (SqlConnection conn = getConn())
            {   conn.Open();
                while (bufferHigh < dt.Rows.Count)
                {
                    using (SqlBulkCopy s = new SqlBulkCopy(conn))
                    {   s.BulkCopyTimeout = 900;
                        s.DestinationTableName = tableName;
                        s.BatchSize = bufferSize;

                        s.EnableStreaming = true;
                        foreach (var column in dt.Columns)
                            s.ColumnMappings.Add(column.ToString(), column.ToString());
                        DataTable bufferedTable = dt.Clone();
                        for (int bu = lowBuffer; bu < bufferHigh; bu++)
                        {
                            bufferedTable.ImportRow(dt.Rows[bu]);
                        }
                        s.WriteToServer(bufferedTable);
                        if (bufferHigh == dt.Rows.Count)
                        {
                            break;
                        }
                        lowBuffer = bufferHigh;
                        bufferHigh += bufferSize;

                        if (bufferHigh > dt.Rows.Count)
                        {
                            bufferHigh = dt.Rows.Count;
                        }
                    }
                }
                conn.Close();
            }
        }
        else
        {
            flushDataTable(dt, tableName);//perofrm a non-buffered flush (could just as easily flush the buffer here bu I already had the other method 
        }
    }



回答5:


The IDataReader implementation I sugested here How to implement IDataReader? maybe helps you. I used it with SqlBulkCopy as follows:

using (MyFileDataReader reader = new MyFileDataReader(@"C:\myfile.txt"))
 {
      SqlBulkCopy bulkCopy = new SqlBulkCopy(connection);
      bulkCopy.DestinationTableName = "[my_table]";
      bulkCopy.BatchSize = 10000;

      bulkCopy.WriteToServer(reader);

      bulkCopy.Close();

 } 


来源:https://stackoverflow.com/questions/2692654/sqlbulkcopy-slow-as-molasses

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!