SQL Server Lock Timeout Exceeded Deleting Records in a Loop

前端 未结 2 790
悲哀的现实
悲哀的现实 2021-01-12 05:24

I am testing a process that deletes many, many records at once. It cannot TRUNCATE TABLE, because there are records in there that need to stay.

Because

相关标签:
2条回答
  • 2021-01-12 05:41

    You or somebody else using the connection is setting the lock timeout to something other than the default. See http://msdn.microsoft.com/en-US/library/ms189470(v=SQL.90).aspx for details.

    The default lock time it is -1 milliseconds, meaning "Wait Forever".

    The row hints are nice, but they are a code smell and should be avoided. Let SQL Server do its job. It's got more information than you do about the system as a whole.

    For starters, you can't control the lock size: Lock escalation occurs automatically, based on the number of outstanding locks. It starts with row locks. If you accumulate too many row locks, SQL Server escalates to page lock. Acquire too many page locks and it escalates to table locks. See http://msdn.microsoft.com/en-us/library/ms184286(v=SQL.90).aspx for lock escalation details. There are a couple of trace flags you can set, however, that will prevent lock escalation: however, that will degrade the SQL Server's performance.

    Another thing: you should wrap the DELETE statement in a transaction, especially in a stored procedure.

    DECLARE @Count INT
    SET @Count = 1
    WHILE @Count > 0
      BEGIN
        BEGIN TRANSACTION
        DELETE TOP (1000) FROM MyTable WITH (ROWLOCK, READPAST) WHERE MyField = SomeValue
        SET @Count = @@ROWCOUNT
        COMMIT TRANSACTION
      END
    

    This makes clear your intent and ensures that locks are release when they ought to be.

    0 讨论(0)
  • 2021-01-12 05:52

    I've found the answer: my looped delete is conflicting with the ghost cleanup proc.

    Using Nicholas's suggestion, I added a BEGIN TRANSACTION and a COMMIT. I wrapped the delete loop in a BEGIN TRY / BEGIN CATCH. In the BEGIN CATCH, right before a ROLLBACK, I ran sp_lock and sp_who2. (I added the code changes in the question above.)

    When my process blocked, I saw the following output:

    spid   dbid   ObjId       IndId  Type Resource                         Mode     Status
    ------ ------ ----------- ------ ---- -------------------------------- -------- ------
    20     2      1401108082  0      TAB                                   IX       GRANT
    20     2      1401108082  1      PAG  1:102368                         X        GRANT
    
    SPID  Status     Login HostName BlkBy DBName Command       CPUTime DiskIO
    ----  ---------- ----- -------- ----- ------ ------------- ------- ------
    20    BACKGROUND sa    .        .     tempdb GHOST CLEANUP 31      0
    

    For future reference, when SQL Server deletes records, it sets a bit on them to just mark them as "ghost records". Every few minutes, an internal process called ghost cleanup runs to reclaim pages of records that have been fully deleted (i.e. all records are ghost records).

    The ghost cleanup process was discussed on ServerFault in this question.

    Here is Paul S. Randal's explanation of the ghost cleanup process.

    It is possible to disable the ghost cleanup process with a trace flag. But I did not have to do so in this case.

    I ended up adding a lock wait timeout of 100 ms. This causes occasional lock wait timeouts in the ghost record cleanup process, but that is acceptable. I also added an our loop that retries lock timeouts up to 5 times. With these two changes, my process now usually completes. Now it only gets a timeout if there is a very long process pushing lots of data around that acquires table or page locks on the data that my process needs to clean up.

    EDIT 2016-07-20

    The final code looks like this:

    -- Do not block long if records are locked.
    SET LOCK_TIMEOUT 100
    
    -- This process volunteers to be a deadlock victim in the case of a deadlock.
    SET DEADLOCK_PRIORITY LOW
    
    DECLARE @Error BIT
    SET @Error = 0
    
    DECLARE @ErrMsg VARCHAR(1000)
    DECLARE @DeletedCount INT
    SELECT @DeletedCount = 0
    
    DECLARE @LockTimeoutCount INT
    SET @LockTimeoutCount = 0
    
    DECLARE @ContinueDeleting BIT,
        @LastDeleteSuccessful BIT
    
    SET @ContinueDeleting = 1
    SET @LastDeleteSuccessful = 1
    
    WHILE @ContinueDeleting = 1
    BEGIN
        DECLARE @RowCount INT
        SET @RowCount = 0
    
        BEGIN TRY
    
            BEGIN TRANSACTION
    
            -- The READPAST below attempts to skip over locked records.
            -- However, it might still cause a lock wait error (1222) if a page or index is locked, because the delete has to modify indexes.
            -- The threshold for row lock escalation to table locks is around 5,000 records,
            -- so keep the deleted number smaller than this limit in case we are deleting a large chunk of data.
            -- Table name, field, and value are all set dynamically in the actual script.
            SET @SQL = N'DELETE TOP (1000) MyTable WITH(ROWLOCK, READPAST) WHERE MyField = SomeValue' 
            EXEC sp_executesql @SQL, N'@ProcGuid uniqueidentifier', @ProcGUID
    
            SET @RowCount = @@ROWCOUNT
    
            COMMIT
    
            SET @LastDeleteSuccessful = 1
    
            SET @DeletedCount = @DeletedCount + @RowCount
            IF @RowCount = 0
            BEGIN
                SET @ContinueDeleting = 0
            END
    
        END TRY
        BEGIN CATCH
    
            IF @@TRANCOUNT > 0
                ROLLBACK
    
            IF Error_Number() = 1222 -- Lock timeout
            BEGIN
    
                IF @LastDeleteSuccessful = 1
                BEGIN
                    -- If we hit a lock timeout, and we had already deleted something successfully, try again.
                    SET @LastDeleteSuccessful = 0
                END
                ELSE
                BEGIN
                    -- The last delete failed, too.  Give up for now.  The job will run again shortly.
                    SET @ContinueDeleting = 0
                END
            END
            ELSE -- On anything other than a lock timeout, report an error.
            BEGIN       
                SET @ErrMsg = 'An error occurred cleaning up data.  Table: MyTable Column: MyColumn Value: SomeValue.  Message: ' + ERROR_MESSAGE() + ' Error Number: ' + CONVERT(VARCHAR(20), ERROR_NUMBER()) + ' Line: ' + CONVERT(VARCHAR(20), ERROR_LINE())
                PRINT @ErrMsg -- this error message will be included in the SQL Server job history
                SET @Error = 1
                SET @ContinueDeleting = 0
            END
    
        END CATCH
    
    END
    
    IF @Error <> 0
        RAISERROR('Not all data could be cleaned up.  See previous messages.', 16, 1)
    
    0 讨论(0)
提交回复
热议问题