Delete a large amount of data in SQL Server

误落风尘 2021-02-15 15:30

Suppose I have a table with 10,000,000 records. What is the difference between these two solutions?

  1. Delete data like this:

    DELETE FROM MyTable
    
    
            
7 Answers
  • 2021-02-15 15:49

    If you need to delete only some of the rows rather than all of them, or you can't use TRUNCATE TABLE (e.g. the table is referenced by a FK constraint, or is included in an indexed view), then you can do the delete in chunks:

    DECLARE @RowsDeleted INTEGER
    SET @RowsDeleted = 1
    
    WHILE (@RowsDeleted > 0)
        BEGIN
            -- delete 10,000 rows at a time
            DELETE TOP (10000) FROM MyTable -- optionally append a WHERE clause to restrict the rows
            SET @RowsDeleted = @@ROWCOUNT
        END
    

    Generally, TRUNCATE is the fastest option and I'd use it where possible, but it cannot be used in every scenario. Also note that TRUNCATE resets the table's IDENTITY value to its seed if there is one.
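
    The IDENTITY behaviour mentioned above can be seen with a small throwaway table. This is only a sketch: dbo.IdentityDemo and its columns are hypothetical names made up for the illustration, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

    ```sql
    -- Hypothetical demo table; the name and columns are assumptions for illustration.
    CREATE TABLE dbo.IdentityDemo (Id INT IDENTITY(1, 1) PRIMARY KEY, Payload VARCHAR(10));

    INSERT INTO dbo.IdentityDemo (Payload) VALUES ('a'), ('b'), ('c');

    -- DELETE removes the rows but leaves the identity counter where it was.
    DELETE FROM dbo.IdentityDemo;
    INSERT INTO dbo.IdentityDemo (Payload) VALUES ('d');
    SELECT Id FROM dbo.IdentityDemo;   -- the new row continues the old sequence (Id = 4)

    -- TRUNCATE removes the rows and resets the counter to the seed.
    TRUNCATE TABLE dbo.IdentityDemo;
    INSERT INTO dbo.IdentityDemo (Payload) VALUES ('e');
    SELECT Id FROM dbo.IdentityDemo;   -- numbering starts again at the seed (Id = 1)

    DROP TABLE dbo.IdentityDemo;
    ```

    If other tables or application code depend on identity values never being reused, that difference may matter more than the speed gain.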

    If you are using SQL Server 2000 or earlier, TOP is not available in DELETE statements, so you can use SET ROWCOUNT instead:

    DECLARE @RowsDeleted INTEGER
    SET @RowsDeleted = 1
    SET ROWCOUNT 10000 -- limit each DELETE to 10,000 rows

    WHILE (@RowsDeleted > 0)
        BEGIN
            DELETE FROM MyTable -- optionally append a WHERE clause to restrict the rows
            SET @RowsDeleted = @@ROWCOUNT
        END

    SET ROWCOUNT 0 -- restore the default so later statements are not limited
    
  • 2021-02-15 15:50

    The first clearly has better performance.

    When you specify DELETE [MyTable], it simply erases everything without checking IDs. The second one wastes time and disk operations locating each record before deleting it.

    It also gets worse because every time a record disappears from the middle of the table, the engine may want to condense data on disk, thus wasting time and work again.

    Maybe a better idea would be to delete data based on clustered index columns in descending order. Then the table will basically be truncated from the end at every delete operation.
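
    One way the descending-order idea might look is sketched below. It assumes MyTable has a clustered index on a column called Id, which is a hypothetical name since the question does not show the schema, and it reuses the 10,000-row batch size from the other answer. Whether this actually beats a plain chunked delete is not something this sketch demonstrates.

    ```sql
    -- Assumes a clustered index on a column named Id (hypothetical).
    DECLARE @RowsDeleted INT
    SET @RowsDeleted = 1

    WHILE (@RowsDeleted > 0)
        BEGIN
            -- remove the 10,000 rows with the highest key values in each pass
            DELETE FROM MyTable
            WHERE Id IN (SELECT TOP (10000) Id
                         FROM MyTable
                         ORDER BY Id DESC)
            SET @RowsDeleted = @@ROWCOUNT
        END
    ```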

  • 2021-02-15 15:52

    The first deletes all the data from the table and will perform better than your second, which deletes only the data for a specific key.

    Now, if you have to delete all the data from the table and you don't rely on being able to roll back, consider using TRUNCATE TABLE.
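
    On the rollback point, it may be worth knowing that in SQL Server a TRUNCATE TABLE issued inside an explicit transaction can still be rolled back. A minimal sketch, using a hypothetical throwaway table named dbo.TruncateDemo:

    ```sql
    -- Hypothetical demo table for illustration only.
    CREATE TABLE dbo.TruncateDemo (Id INT PRIMARY KEY);
    INSERT INTO dbo.TruncateDemo (Id) VALUES (1);
    INSERT INTO dbo.TruncateDemo (Id) VALUES (2);

    BEGIN TRANSACTION;
        TRUNCATE TABLE dbo.TruncateDemo;
        SELECT COUNT(*) AS RowsInsideTran FROM dbo.TruncateDemo;  -- 0
    ROLLBACK TRANSACTION;

    SELECT COUNT(*) AS RowsAfterRollback FROM dbo.TruncateDemo;   -- 2

    DROP TABLE dbo.TruncateDemo;
    ```

    Once the transaction is committed, or if the TRUNCATE runs outside any transaction, the rows can only be recovered from a backup.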

  • 2021-02-15 15:59

    TRUNCATE TABLE MyTable would give the best performance for clearing a table. See http://msdn.microsoft.com/en-us/library/ms177570.aspx for a more detailed explanation.

  • 2021-02-15 15:59

    Found this post on Microsoft TechNet.

    Basically, it recommends:

    1. Using SELECT INTO, copy the data that you want to KEEP to an intermediate table;
    2. Truncate the source table;
    3. Using INSERT INTO, copy the data back from the intermediate table to the source table.

    ..

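    BEGIN TRANSACTION

        -- 1. copy the rows to KEEP into an intermediate table
        SELECT  *
        INTO    dbo.bigtable_intermediate
        FROM    dbo.bigtable
        WHERE   Id % 2 = 0;

        -- 2. empty the source table
        TRUNCATE TABLE dbo.bigtable;

        -- 3. copy the kept rows back, preserving their identity values
        SET IDENTITY_INSERT dbo.bigtable ON;
        INSERT INTO dbo.bigtable WITH (TABLOCK) (Id, c1, c2, c3)
        SELECT Id, c1, c2, c3 FROM dbo.bigtable_intermediate ORDER BY Id;
        SET IDENTITY_INSERT dbo.bigtable OFF;

    ROLLBACK TRANSACTION -- undoes all of the above; use COMMIT TRANSACTION to keep the changes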
    
  • 2021-02-15 16:03

    If you have that many records in your table and you want to delete them all, you should consider TRUNCATE TABLE <table> instead of DELETE FROM <table>. It will be much faster, but be aware that it does not fire triggers.
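
    The trigger caveat can be checked with a small experiment. The sketch below uses hypothetical objects (dbo.TriggerDemo, dbo.TriggerDemoLog and the trigger name are made up for the illustration), and GO is the batch separator understood by tools such as SSMS and sqlcmd rather than a T-SQL statement.

    ```sql
    -- Hypothetical tables and trigger, named for illustration only.
    CREATE TABLE dbo.TriggerDemo (Id INT PRIMARY KEY);
    CREATE TABLE dbo.TriggerDemoLog (DeletedId INT);
    GO

    -- Log every row removed by a DELETE statement.
    CREATE TRIGGER dbo.trg_TriggerDemo_Delete ON dbo.TriggerDemo
    AFTER DELETE
    AS
        INSERT INTO dbo.TriggerDemoLog (DeletedId)
        SELECT Id FROM deleted;
    GO

    INSERT INTO dbo.TriggerDemo (Id) VALUES (1);
    INSERT INTO dbo.TriggerDemo (Id) VALUES (2);
    DELETE FROM dbo.TriggerDemo;                                     -- trigger fires
    SELECT COUNT(*) AS LoggedAfterDelete FROM dbo.TriggerDemoLog;    -- 2

    INSERT INTO dbo.TriggerDemo (Id) VALUES (3);
    TRUNCATE TABLE dbo.TriggerDemo;                                  -- trigger does not fire
    SELECT COUNT(*) AS LoggedAfterTruncate FROM dbo.TriggerDemoLog;  -- still 2

    DROP TABLE dbo.TriggerDemo;      -- dropping the table also drops its trigger
    DROP TABLE dbo.TriggerDemoLog;
    ```

    If auditing or cascading logic lives in DELETE triggers, a chunked DELETE is probably the safer choice despite being slower.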

    For more details (in this case for SQL Server 2000), see: http://msdn.microsoft.com/en-us/library/aa260621%28SQL.80%29.aspx

    Deleting the rows one by one from within the application will take a very long time, because your DBMS cannot optimize anything: it doesn't know in advance that you are going to delete everything.
