Best way to update 40 million rows in batch

佛祖请我去吃肉 2021-01-05 07:56

Basically I need to run this on a table with 40 million rows. Updating every row at once will crash, so I want to batch the query so that if it crashes, it can re-run the query

4 Answers
  • 2021-01-05 08:30
    Select 1;  -- this will set a rowcount
    WHILE (@@Rowcount > 0)   
    BEGIN
      UPDATE TOP (1000000) [table]   
        SET [New_ID] =  [Old_ID]
      WHERE [New_ID] <> [Old_ID] 
        or ([New_ID] is null and [Old_ID] is not null)
    END
    

    A batch size of 100000 may work better in the TOP clause.

    Since [New_ID] and [Old_ID] are NOT NULL, the IS NULL check is not necessary.

  • 2021-01-05 08:42

    The fastest way is to:

    1) Create a temp table and insert all the values from the old table into it using a create-from-select statement (SELECT ... INTO in SQL Server) with the required condition.

    2) Copy the constraints and refresh the indexes.

    3) Drop the old table.

    4) Rename temp table to original name.

    Complete discussion is available on this link
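
    The four steps above could be sketched in T-SQL roughly as follows. This is only an illustration under stated assumptions: dbo.MyTable, dbo.MyTable_new and PK_MyTable_new are hypothetical names, the table is assumed to have just the three columns used elsewhere in this thread, and a regular staging table is used instead of a #temp table because a #temp table lives in tempdb and cannot be renamed into place.

    ```sql
    -- Hypothetical names: dbo.MyTable, dbo.MyTable_new, PK_MyTable_new.
    -- Assumes no foreign keys, views or triggers reference dbo.MyTable.

    -- 1) Build the replacement table with the new values in a single pass.
    SELECT  [table_ID],
            [Old_ID],
            [Old_ID] AS [New_ID]
    INTO    dbo.MyTable_new
    FROM    dbo.MyTable;

    -- 2) Recreate constraints and indexes (SELECT ... INTO does not copy them).
    --    Assumes [table_ID] is NOT NULL and unique in the source table.
    ALTER TABLE dbo.MyTable_new
        ADD CONSTRAINT PK_MyTable_new PRIMARY KEY CLUSTERED ([table_ID]);

    -- 3) Drop the old table.
    DROP TABLE dbo.MyTable;

    -- 4) Rename the new table to the original name.
    EXEC sp_rename 'dbo.MyTable_new', 'MyTable';
    ```

    Any permissions, triggers and remaining indexes on the original table would also have to be recreated by hand, so it is worth scripting them out before the drop.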

  • 2021-01-05 08:53
    Declare @Rowcount INT = 1;
    
    WHILE (@Rowcount > 0)   
    BEGIN
            UPDATE TOP (100000) [table]   --<-- define Batch Size in TOP Clause
               SET [New_ID] = [Old_ID]
            WHERE [New_ID] <> [Old_ID]    --<-- does not match rows where either column is NULL
    
            SET @Rowcount = @@ROWCOUNT;
    
           CHECKPOINT;   --<-- each UPDATE already autocommits; CHECKPOINT lets the log truncate under SIMPLE recovery
    END
    
  • 2021-01-05 08:54

    M.Ali's suggestion will work, but performance will degrade as you work through the 40M records, because each pass has to search past the rows that were already updated to find the ones that still need changing. I would suggest a better filter to find the records to update in each pass. This assumes you have a primary key (or other index) on your identity column:

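    DECLARE @BatchSize INT = 100000
        ,   @StartingRecord BIGINT
        ,   @MaxRecord BIGINT;
    
    -- Read the key range once, so that a gap in [table_ID] wider than
    -- one batch cannot end the loop early (a zero-row batch is not the end)
    SELECT @StartingRecord = MIN([table_ID])
        ,  @MaxRecord = MAX([table_ID])
    FROM [table];
    
    WHILE (@StartingRecord <= @MaxRecord)
    BEGIN
        -- Each batch is a range seek on the key, so its cost stays constant
        UPDATE [table]
            SET [New_ID] = [Old_ID]
        WHERE [table_ID] BETWEEN @StartingRecord AND @StartingRecord + @BatchSize - 1;
    
        CHECKPOINT;
    
        SET @StartingRecord += @BatchSize;
    END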
    

    This approach will allow each iteration to be as fast as the first. And if you don't have a valid index you need to fix that first.
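
    Since the question asks for a job that can be re-run after a crash, one possible extension of the key-range loop above is to store the next starting key in a small progress table and advance it in the same transaction as each batch. The sketch below is an assumption-laden illustration, not part of the original answer: dbo.BatchProgress, its NextStart column and dbo.MyTable are hypothetical names.

    ```sql
    -- Hypothetical names: dbo.BatchProgress (NextStart), dbo.MyTable.
    SET XACT_ABORT ON;  -- any run-time error rolls back the open batch

    -- Create and seed the progress table on the first run only.
    IF OBJECT_ID(N'dbo.BatchProgress', N'U') IS NULL
    BEGIN
        CREATE TABLE dbo.BatchProgress (NextStart BIGINT NOT NULL);

        INSERT INTO dbo.BatchProgress (NextStart)
        SELECT ISNULL(MIN([table_ID]), 1) FROM dbo.MyTable;
    END

    DECLARE @BatchSize INT = 100000,
            @Start     BIGINT,
            @Max       BIGINT;

    -- Resume from wherever the previous run stopped.
    SELECT @Start = NextStart FROM dbo.BatchProgress;
    SELECT @Max   = MAX([table_ID]) FROM dbo.MyTable;

    WHILE (@Start <= @Max)
    BEGIN
        BEGIN TRANSACTION;

        UPDATE dbo.MyTable
           SET [New_ID] = [Old_ID]
         WHERE [table_ID] BETWEEN @Start AND @Start + @BatchSize - 1;

        SET @Start += @BatchSize;

        -- Record progress atomically with the batch it describes.
        UPDATE dbo.BatchProgress SET NextStart = @Start;

        COMMIT TRANSACTION;
    END
    ```

    Because the data change and the progress marker commit together, re-running the script after a failure should continue from the first batch that did not commit. Drop dbo.BatchProgress once the job has finished.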
