SQL Server insert performance

终归单人心 2020-12-29 06:01

I have an insert query that gets generated like this

INSERT INTO InvoiceDetail (LegacyId,InvoiceId,DetailTypeId,Fee,FeeTax,Investigatorid,SalespersonId,Creat         


        
8 Answers
  • 2020-12-29 06:29

    Sounds like the inserts are causing SQL Server to recalculate the indexes. One possible solution would be to drop the index, perform the insert, and re-add the index. With your attempted solution, even if you tell it to ignore constraints, it will still need to keep the index updated.
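    A minimal T-SQL sketch of that approach (the index name IX_InvoiceDetail_InvoiceId is hypothetical; substitute your real nonclustered index names):

    -- Hypothetical index name; DISABLE/REBUILD works for nonclustered indexes.
    -- Do not disable the clustered index: that makes the table inaccessible.
    ALTER INDEX IX_InvoiceDetail_InvoiceId ON InvoiceDetail DISABLE;

    -- ... perform the bulk of the INSERTs here ...

    -- REBUILD re-enables the index and recomputes it in one pass.
    ALTER INDEX IX_InvoiceDetail_InvoiceId ON InvoiceDetail REBUILD;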

  • 2020-12-29 06:29

    Some suggestions for increasing insert performance:

    • Increase ADO.NET BatchSize
    • Choose the target table's clustered index wisely, so that inserts won't lead to clustered index node splits (e.g. autoinc column)
    • Insert into a temporary heap table first, then issue one big "insert-by-select" statement to push all that staging table data into the actual target table
    • Apply SqlBulkCopy
    • Place a table lock before inserting (if your business scenario allows for it)

    Taken from Tips For Lightning-Fast Insert Performance On SqlServer
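    A rough sketch of the staging-table idea from the list above (the column list is taken from the question's INSERT; the #InvoiceDetailStaging name is illustrative):

    -- 1. Create an unindexed temp heap with the same columns and load it first
    --    (fast, no index maintenance during the load):
    SELECT * INTO #InvoiceDetailStaging FROM InvoiceDetail WHERE 1 = 0;
    -- ... bulk-load #InvoiceDetailStaging here (SqlBulkCopy, BULK INSERT, ...) ...

    -- 2. One set-based push into the real table, under a table lock:
    INSERT INTO InvoiceDetail WITH (TABLOCK)
           (LegacyId, InvoiceId, DetailTypeId, Fee, FeeTax, Investigatorid,
            SalespersonId, CreateDate, CreatedById, IsChargeBack, Expense,
            RepoAgentId, PayeeName, ExpensePaymentId, AdjustDetailId)
    SELECT  LegacyId, InvoiceId, DetailTypeId, Fee, FeeTax, Investigatorid,
            SalespersonId, CreateDate, CreatedById, IsChargeBack, Expense,
            RepoAgentId, PayeeName, ExpensePaymentId, AdjustDetailId
    FROM    #InvoiceDetailStaging;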

  • 2020-12-29 06:37

    Most likely this is commit flush wait. If you don't wrap sets of INSERTs in an explicitly managed transaction, then each INSERT is its own auto-committed transaction: each INSERT automatically issues a commit, and the commit has to wait until the log is durable (i.e. written to disk). Flushing the log after every single insert is extremely slow.

    For instance, inserting 100k rows like yours in single-row-commit style:

    set nocount on;
    declare @start datetime = getutcdate();

    declare @i int = 0;
    while @i < 100000
    begin
      INSERT INTO InvoiceDetail (
        LegacyId, InvoiceId, DetailTypeId, Fee,
        FeeTax, Investigatorid, SalespersonId,
        CreateDate, CreatedById, IsChargeBack,
        Expense, RepoAgentId, PayeeName, ExpensePaymentId,
        AdjustDetailId)
      VALUES (1, 1, 2, 1500.0000, 0.0000, 163, 1002,
        '11/30/2001 12:00:00 AM',
        1116, 0, 550.0000, 850, NULL, 1, NULL);
      set @i = @i + 1;
    end

    select datediff(ms, @start, getutcdate());
    

    This runs in about 12 seconds on my server. But with transaction management, committing every 1000 rows, the same 100k-row insert takes only about 4 seconds:

    set nocount on;
    declare @start datetime = getutcdate();

    declare @i int = 0;
    begin transaction
    while @i < 100000
    begin
      INSERT INTO InvoiceDetail (
        LegacyId, InvoiceId, DetailTypeId,
        Fee, FeeTax, Investigatorid,
        SalespersonId, CreateDate, CreatedById,
        IsChargeBack, Expense, RepoAgentId,
        PayeeName, ExpensePaymentId, AdjustDetailId)
      VALUES (1, 1, 2, 1500.0000, 0.0000, 163, 1002,
        '11/30/2001 12:00:00 AM',
        1116, 0, 550.0000, 850, NULL, 1, NULL);
      set @i = @i + 1;
      if (@i % 1000 = 0)
      begin
        commit;
        begin transaction;
      end
    end
    commit;
    select datediff(ms, @start, getutcdate());

    Also, given that I can insert 100k rows in 12 seconds even without the batched commit, while you need 30 minutes, it's worth investigating 1) the speed of your IO subsystem (e.g. what Avg. Sec per Transaction you see on the drives) and 2) what else the client code is doing between retrieving the @@identity from one call and invoking the next insert. It could be that the bulk of the time is spent on the client side of the stack. One simple solution would be to launch multiple inserts in parallel (BeginExecuteNonQuery) so you feed SQL Server inserts constantly.

  • 2020-12-29 06:40

    Are you executing these queries one at a time from a .Net client (i.e. sending 110,000 separate query requests to SQL Server)?

    In that case, the bottleneck is likely the network latency and per-call overhead of sending these INSERTs to SQL Server one at a time, not SQL Server itself.

    Check out BULK INSERT.
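    For example (the file path and delimiters here are hypothetical; adjust them to your export format and recovery model):

    BULK INSERT InvoiceDetail
    FROM 'C:\data\InvoiceDetail.csv'   -- hypothetical path
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        BATCHSIZE       = 10000,  -- commit every 10k rows
        TABLOCK                   -- enables minimal logging under simple/bulk-logged recovery
    );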

  • 2020-12-29 06:41

    Running individual INSERTs is always going to be the slowest option. Also, what's the deal with the @@IDENTITY? It doesn't look like you need to keep track of those in between.

    If you don't want to use BULK INSERT from file or SSIS, there is a SqlBulkCopy feature in ADO.NET which would probably be your best bet if you absolutely have to do this from within a .NET program.

    110k rows should take less time to import than me researching and writing this answer.

  • 2020-12-29 06:41

    Hm, let it run and check the performance counters. What do you see? What disk layout do you have? I can insert nearly a hundred million rows in 30 minutes (real-time financial information, linked to 3 other tables). I'd pretty much bet that your IO layout is bad (i.e. bad disk structure, bad file distribution).
