How to delete duplicates on a MySQL table?

遇见更好的自我 2020-11-22 01:35

I need to DELETE duplicate rows for a specified SID in a MySQL table.

How can I do this with an SQL query?

         


        
25 Answers
  • 2020-11-22 01:57

    Here is a simple answer:

    -- Keep the row with the highest id_field in each group; delete the rest.
    DELETE a FROM target_table a LEFT JOIN (SELECT MAX(id_field) AS id, field_being_repeated
        FROM target_table GROUP BY field_being_repeated) b
        ON a.field_being_repeated = b.field_being_repeated
          AND a.id_field = b.id
        WHERE b.id IS NULL;
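
    Before running a DELETE like this, it can help to preview the rows it would remove. A minimal sketch, reusing the placeholder names target_table, id_field and field_being_repeated from the answer above; it only turns the DELETE into a SELECT:

    ```sql
    -- Preview: rows that are NOT the highest id_field within their group.
    SELECT a.*
    FROM target_table a
    LEFT JOIN (
        SELECT MAX(id_field) AS id, field_being_repeated
        FROM target_table
        GROUP BY field_being_repeated
    ) b
      ON a.field_being_repeated = b.field_being_repeated
     AND a.id_field = b.id
    WHERE b.id IS NULL;
    ```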
    
  • 2020-11-22 01:57
    -- T1.ID < T2.ID keeps the lowest ID per title; with <> every copy would match as T2 and be deleted.
    DELETE T2
    FROM   table_name T1
    JOIN   table_name T2 ON (T1.title = T2.title AND T1.ID < T2.ID);
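
    Since the question asks about one specific SID, the same self-join can be narrowed with a WHERE clause. This is only a sketch: it assumes the table has title, SID and ID columns, and the SID value 42 is a made-up example.

    ```sql
    -- Delete duplicates of (title, SID) for a single SID, keeping the lowest ID.
    DELETE T2
    FROM table_name T1
    JOIN table_name T2
      ON T1.title = T2.title
     AND T1.SID   = T2.SID
     AND T1.ID    < T2.ID
    WHERE T1.SID = 42;  -- hypothetical SID value
    ```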
    
  • 2020-11-22 02:02

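    This removes duplicates in place, without creating a new table:

    ALTER IGNORE TABLE `table_name` ADD UNIQUE (title, SID);
    

    Notes: this only works well if the index fits in memory. The IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so the statement raises an error on that version and later.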
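
    On servers that reject the IGNORE clause, the unique index can only be added once no duplicates remain. A rough sketch of that check followed by the plain ALTER TABLE; the index name uniq_title_sid is an assumption chosen for illustration:

    ```sql
    -- List every (title, SID) pair that still occurs more than once.
    SELECT title, SID, COUNT(*) AS copies
    FROM `table_name`
    GROUP BY title, SID
    HAVING COUNT(*) > 1;

    -- Once the query above returns no rows, the non-IGNORE form succeeds.
    ALTER TABLE `table_name` ADD UNIQUE KEY uniq_title_sid (title, SID);
    ```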

  • 2020-11-22 02:02

    The following removes duplicates for all SIDs, not just a single one.

    With a temporary table:

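    -- `table` is a reserved word, so it must be quoted with backticks.
    -- Selecting * with GROUP BY is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5).
    CREATE TABLE table_temp AS
    SELECT * FROM `table` GROUP BY title, SID;
    
    DROP TABLE `table`;
    RENAME TABLE table_temp TO `table`;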
    

    Since table_temp is freshly created, it has no indexes, so you will need to recreate them after removing the duplicates. You can check which indexes the original table has with SHOW INDEXES IN `table` (run it before dropping the table).
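
    Recreating the indexes might look roughly like this. The column id and the index name idx_sid are assumptions for illustration; the real definitions should come from the SHOW INDEXES output of the original table.

    ```sql
    -- Run before DROP TABLE, and note down the key definitions.
    SHOW INDEXES IN `table`;

    -- After the rename, add the keys back (hypothetical definitions).
    ALTER TABLE `table`
        ADD PRIMARY KEY (id),
        ADD INDEX idx_sid (SID);
    ```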

    Without a temporary table:

    DELETE FROM `table` WHERE id IN (
      SELECT all_duplicates.id FROM (
        SELECT id FROM `table` WHERE (`title`, `SID`) IN (
          SELECT `title`, `SID` FROM `table` GROUP BY `title`, `SID` having count(*) > 1
        )
      ) AS all_duplicates 
      LEFT JOIN (
        SELECT MIN(id) AS id FROM `table` GROUP BY `title`, `SID` having count(*) > 1
      ) AS grouped_duplicates 
      ON all_duplicates.id = grouped_duplicates.id 
      WHERE grouped_duplicates.id IS NULL
    )
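
    A shorter formulation of the same idea, offered as a sketch rather than a drop-in replacement: compute the id to keep for each (title, SID) group and delete everything else. It assumes id is a non-NULL unique key. The extra derived table (aliased keepers here, a name chosen for illustration) is what lets MySQL read from the table it is deleting from.

    ```sql
    DELETE FROM `table`
    WHERE id NOT IN (
        SELECT keep_id FROM (
            -- One surviving row per (title, SID): the lowest id.
            SELECT MIN(id) AS keep_id
            FROM `table`
            GROUP BY `title`, `SID`
        ) AS keepers
    );
    ```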
    
  • 2020-11-22 02:03

    This always seems to work for me:

    CREATE TABLE NoDupeTable LIKE DupeTable; 
    INSERT NoDupeTable SELECT * FROM DupeTable group by CommonField1,CommonFieldN;
    

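    This keeps one row from each group of dupes, plus all of the non-dupe records. In my experience the surviving row is the one with the lowest ID, although MySQL does not guarantee which row of a group is picked.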

    I've also taken to doing the following so that the dupe issue no longer occurs after the removal:

    CREATE TABLE NoDupeTable LIKE DupeTable; 
    Alter table NoDupeTable Add Unique `Unique` (CommonField1,CommonField2);
    INSERT IGNORE NoDupeTable SELECT * FROM DupeTable;
    

    In other words, I create a copy of the first table's structure and add a unique index on the fields that must not contain duplicates. I then do an INSERT IGNORE, which, unlike a normal INSERT, does not fail the first time it tries to add a record that duplicates those two fields; it simply skips any such records.

    Going forward, it becomes impossible to create duplicate records based on those two fields.
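
    The answer stops once NoDupeTable is filled. If the deduplicated copy should take over the original name, one possible follow-up is to swap the two tables in a single RENAME TABLE statement. The name DupeTable_old is an assumption used only for this sketch.

    ```sql
    -- Swap the tables; both renames happen in one statement.
    RENAME TABLE DupeTable   TO DupeTable_old,
                 NoDupeTable TO DupeTable;

    -- After verifying the new contents:
    -- DROP TABLE DupeTable_old;
    ```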

  • 2020-11-22 02:04

    Here is how I usually eliminate duplicates:

    1. Add a temporary column and name it whatever you want (I'll call it active).
    2. Group by the fields that shouldn't contain duplicates and set active to 1 for one row per group; grouping selects only one row for each combination of those columns.
    3. Delete the rows where active is zero.
    4. Drop the active column.
    5. Optionally (if it fits your purposes), add a unique index on those columns so duplicates cannot occur again.
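
    The steps above could be sketched as follows. This assumes a table named table_name with an id column and duplicates defined by (title, SID); the index name uniq_title_sid is made up for the example.

    ```sql
    -- 1. Temporary marker column, 0 for every existing row.
    ALTER TABLE table_name ADD COLUMN active TINYINT NOT NULL DEFAULT 0;

    -- 2. Mark one row per (title, SID) group: the one with the lowest id.
    UPDATE table_name t
    JOIN (
        SELECT MIN(id) AS id
        FROM table_name
        GROUP BY title, SID
    ) keepers ON t.id = keepers.id
    SET t.active = 1;

    -- 3. Remove the unmarked rows.
    DELETE FROM table_name WHERE active = 0;

    -- 4. Drop the marker column.
    ALTER TABLE table_name DROP COLUMN active;

    -- 5. Optional: prevent new duplicates.
    ALTER TABLE table_name ADD UNIQUE KEY uniq_title_sid (title, SID);
    ```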