How to delete duplicates on a MySQL table?

遇见更好的自我 2020-11-22 01:35

I need to DELETE duplicated rows for a specified sid in a MySQL table.

How can I do this with an SQL query?

         


        
25 Answers
  • 2020-11-22 01:48

    Another easy way... using UPDATE IGNORE:

    You have to use an index on one or more columns (type index). Create a new temporary reference column (not part of the index). In this column, you mark the unique rows by updating it with the IGNORE clause. Step by step:

    Add a temporary reference column to mark the uniques:

    ALTER TABLE `yourtable` ADD `unique` VARCHAR(3) NOT NULL AFTER `lastcolname`;
    

    => this will add a column to your table.

    Update the table and try to mark everything as unique, but ignore possible errors due to duplicate keys (those records will be skipped):

    UPDATE IGNORE `yourtable` SET `unique` = 'Yes' WHERE 1;
    

    => you will find your duplicate records will not be marked as unique = 'Yes', in other words only one of each set of duplicate records will be marked as unique.
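
    Before running the delete in the next step, it may be worth checking how many rows were actually marked. A minimal sketch, reusing the placeholder names `yourtable` and `unique` from the steps above:

    ```sql
    -- Count rows per marker value. Rows whose marker is not 'Yes'
    -- are the ones the following DELETE would remove.
    SELECT `unique`, COUNT(*) AS row_count
    FROM `yourtable`
    GROUP BY `unique`;
    ```

    If every row shows up as 'Yes', no update was skipped and the delete would remove nothing.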

    Delete everything that's not unique:

    DELETE FROM `yourtable` WHERE `unique` <> 'Yes';
    

    => This will remove all duplicate records.

    Drop the column...

    ALTER TABLE `yourtable` DROP `unique`;
    
  • 2020-11-22 01:48

    Deleting duplicates from MySQL tables is a common issue that usually comes with specific needs. In case anyone is interested, here (Remove duplicate rows in MySQL) I explain how to use a temporary table to delete MySQL duplicates in a reliable and fast way that also handles big data sources (with examples for different use cases).

    Ali, in your case, you can run something like this:

    -- create a new temporary table
    CREATE TABLE tmp_table1 LIKE table1;
    
    -- add a unique constraint    
    ALTER TABLE tmp_table1 ADD UNIQUE(sid, title);
    
    -- scan over the table to insert entries
    INSERT IGNORE INTO tmp_table1 SELECT * FROM table1 ORDER BY sid;
    
    -- rename tables
    RENAME TABLE table1 TO backup_table1, tmp_table1 TO table1;
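
    Once the tables are swapped, comparing row counts shows how many duplicates were dropped. This is only a sketch that reuses the table names from the statements above; dropping the backup is left commented out on purpose.

    ```sql
    -- Row counts before and after deduplication
    SELECT
        (SELECT COUNT(*) FROM backup_table1) AS rows_before,
        (SELECT COUNT(*) FROM table1)        AS rows_after;

    -- When the result looks right, the backup can be removed:
    -- DROP TABLE backup_table1;
    ```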
    
  • 2020-11-22 01:52

    Suppose you have a table employee, with the following columns:

    employee (id, first_name, last_name, start_date)
    

    To delete the rows with a duplicate first_name, keeping the row with the lowest id in each group:

    delete
    from employee using employee,
        employee e1
    where employee.id > e1.id
        and employee.first_name = e1.first_name  
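
    To preview what the statement above would remove, the same self-join can be run as a SELECT first. A sketch, assuming the employee table has a unique id column as the DELETE itself does:

    ```sql
    -- Rows that would be deleted: every row for which another row
    -- with the same first_name and a smaller id exists.
    SELECT DISTINCT employee.*
    FROM employee
    JOIN employee e1
      ON employee.first_name = e1.first_name
     AND employee.id > e1.id
    ORDER BY employee.first_name, employee.id;
    ```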
    
  • 2020-11-22 01:53

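    This works for me to remove the oldest records (your_table stands for your table name; a table literally named table would need backticks because TABLE is a reserved word):

    delete from your_table where id in 
    (select min(e.id)
        from (select * from your_table) e 
        group by column1, column2
        having count(*) > 1
    ); 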
    

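    You can replace min(e.id) with max(e.id) to remove the newest records instead. Each run deletes only one row per duplicate group, so groups with more than two copies need several runs.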
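
    A query along these lines lists the groups that still contain duplicates, which makes it easy to tell when to stop rerunning the delete. It is a sketch: `your_table` is a placeholder for the real table name, and column1 and column2 are the grouping columns from the statement above.

    ```sql
    -- Column combinations that still occur more than once
    SELECT column1, column2, COUNT(*) AS copies
    FROM your_table
    GROUP BY column1, column2
    HAVING COUNT(*) > 1;
    ```

    An empty result means no duplicates are left for that pair of columns.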

  • 2020-11-22 01:56

    This procedure removes all duplicates (including multiples) in a table, keeping the last row of each group. It is an extension of "Retrieving last record in each group".

    Hope this is useful to someone.

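    -- YourTable is a placeholder; TABLE itself is a reserved word in MySQL
    DROP TEMPORARY TABLE IF EXISTS UniqueIDs;
    CREATE TEMPORARY TABLE UniqueIDs (id INT);
    
    -- Collect the highest ID of each (Field1, Field2) group
    INSERT INTO UniqueIDs
        (SELECT T1.ID FROM YourTable T1 LEFT JOIN YourTable T2 ON
        (T1.Field1 = T2.Field1 AND T1.Field2 = T2.Field2 -- comparison fields
        AND T1.ID < T2.ID)
        WHERE T2.ID IS NULL);
    
    -- Delete every row that is not the last one of its group
    DELETE FROM YourTable WHERE id NOT IN (SELECT ID FROM UniqueIDs);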
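
    The final delete can also be written as an anti-join against the temporary table, which some people find easier to read than NOT IN. This is a sketch of an equivalent form, not an additional step; `YourTable` is a placeholder for the real table name, and the cleanup statement at the end is an addition.

    ```sql
    -- Delete rows whose id is absent from UniqueIDs
    DELETE t
    FROM YourTable AS t
    LEFT JOIN UniqueIDs AS u ON u.id = t.id
    WHERE u.id IS NULL;

    -- The temporary table is no longer needed afterwards
    DROP TEMPORARY TABLE IF EXISTS UniqueIDs;
    ```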
    
  • 2020-11-22 01:56

    If you want to keep the row with the lowest id value:

     DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id > n2.id AND n1.email = n2.email;
    

    If you want to keep the row with the highest id value:

     DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id < n2.id AND n1.email = n2.email;
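
    The question asks about duplicates for one specified sid. The self-join above could be narrowed roughly as follows. This is a sketch under assumptions: the table name `table1` and the duplicate key (sid, title) are borrowed from another answer, the `id` column is assumed to be a unique key, and 42 is a made-up value.

    ```sql
    DELETE n1
    FROM table1 AS n1
    JOIN table1 AS n2
      ON n1.sid = n2.sid
     AND n1.title = n2.title
     AND n1.id > n2.id      -- keep the lowest id in each group
    WHERE n1.sid = 42;      -- hypothetical sid to clean up
    ```

    Rows with other sid values are left untouched because the WHERE clause restricts which rows of n1 are eligible for deletion.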
    