Remove duplicate rows based on one column value

清歌不尽 2021-01-04 08:42

I have the table below, and I need to delete the rows that have duplicate "refIDs" while keeping at least one row for each refID, i.e. I need to remove rows 4 and 5. plea

4 answers
  • 2021-01-04 08:56

    This is similar to Gordon Linoff's query, but without the subquery:

    DELETE t1 FROM table t1
      JOIN table t2
      ON t2.refID = t1.refID
      AND t2.ID < t1.ID
    

    This uses an inner join to only delete rows where there is another row with the same refID but lower ID.

    The benefit of avoiding a subquery is being able to utilize an index for the search. This query should perform well with a multi-column index on refID + ID.
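
To make the join condition concrete, here is a minimal sketch that lists the rows the self-join would match. It runs on SQLite through Python's `sqlite3` module purely as a stand-in, because SQLite does not accept the multi-table `DELETE t1 FROM ... JOIN` form, so the same join is written as a `SELECT`. The table name `items`, the index name, and the sample rows are assumptions, since the table from the question is not shown.

```python
import sqlite3

# Hypothetical sample data: rows 4 and 5 repeat the refIDs of rows 1 and 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ID INTEGER PRIMARY KEY, refID INTEGER)")
conn.executemany(
    "INSERT INTO items (ID, refID) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 100), (5, 200)],
)

# Composite index on refID + ID, as the answer suggests (name is made up).
conn.execute("CREATE INDEX idx_items_ref_id ON items (refID, ID)")

# Same join condition as the DELETE: a row matches when another row
# shares its refID and has a lower ID.
doomed = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT t1.ID
        FROM items t1
        JOIN items t2
          ON t2.refID = t1.refID
         AND t2.ID < t1.ID
        ORDER BY t1.ID
        """
    )
]
print(doomed)  # [4, 5]
```

With this sample data, only the later copies match, so the lowest ID of each refID survives.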

  • 2021-01-04 09:04

    I would do:

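    -- MySQL rejects a subquery that reads the table being deleted from
    -- (error 1093), so each subquery is wrapped in a derived table.
    DELETE FROM t
    WHERE ID NOT IN (SELECT min_id FROM
            (SELECT MIN(ID) AS min_id FROM t GROUP BY refID HAVING COUNT(*) > 1) AS keep)
      AND refID IN (SELECT refID FROM
            (SELECT refID FROM t GROUP BY refID HAVING COUNT(*) > 1) AS dup);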
    

    The criteria are that the refID is among the duplicated values and that the ID differs from the minimum ID of its duplicate group. It works better if refID is indexed.

    Alternatively, if you can run a query several times, issue the following one repeatedly until it no longer deletes anything:

    DELETE FROM t
    WHERE ID IN (SELECT max_id FROM
            (SELECT MAX(ID) AS max_id FROM t GROUP BY refID HAVING COUNT(*) > 1) AS dup);
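
The "run it until nothing is deleted" approach can be sketched as a small loop. This is only an illustration under assumptions: it uses SQLite through Python's `sqlite3` module instead of MySQL, and the sample rows are made up (refID 100 appears three times, so one pass is not enough).

```python
import sqlite3

# Hypothetical sample data: refID 100 has three rows, refID 200 has two.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, refID INTEGER)")
conn.executemany(
    "INSERT INTO t (ID, refID) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 100), (4, 100), (5, 200)],
)

passes = 0
while True:
    # Each pass removes the highest ID of every refID that is still duplicated.
    cur = conn.execute(
        "DELETE FROM t WHERE ID IN ("
        "  SELECT MAX(ID) FROM t GROUP BY refID HAVING COUNT(*) > 1)"
    )
    if cur.rowcount == 0:  # nothing deleted: no duplicates remain
        break
    passes += 1

remaining = conn.execute("SELECT ID, refID FROM t ORDER BY ID").fetchall()
print(passes, remaining)  # 2 [(1, 100), (2, 200)]
```

Each pass removes one row per duplicated refID, so the number of passes equals the size of the largest duplicate group minus one.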
    
  • 2021-01-04 09:08

    In MySQL, you can do this with a join in delete:

    delete t
        from table t left join
             (select min(id) as id
              from table t
              group by refId
             ) tokeep
             on t.id = tokeep.id
        where tokeep.id is null;
    

    For each RefId, the subquery calculates the minimum of the id column (presumed to be unique over the whole table). It uses a left join for the match, so anything that doesn't match has a NULL value for tokeep.id. These are the ones that are deleted.
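
As a worked illustration of the left-join explanation, the sketch below runs the same anti-join on SQLite via Python's `sqlite3` module. SQLite is only a stand-in here: it has no `DELETE ... JOIN`, so the anti-join is used as a subquery that feeds a plain `DELETE`. The table name `items` and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical sample data: rows 4 and 5 repeat the refIds of rows 1 and 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, refId INTEGER)")
conn.executemany(
    "INSERT INTO items (id, refId) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 100), (5, 200)],
)

# Rows whose id is not the minimum id of their refId get NULL from the
# left join, which marks them for deletion.
ANTI_JOIN = """
    SELECT t.id
    FROM items t
    LEFT JOIN (SELECT MIN(id) AS id FROM items GROUP BY refId) tokeep
      ON t.id = tokeep.id
    WHERE tokeep.id IS NULL
"""

to_delete = sorted(row[0] for row in conn.execute(ANTI_JOIN))
conn.execute(f"DELETE FROM items WHERE id IN ({ANTI_JOIN})")
kept = conn.execute("SELECT id, refId FROM items ORDER BY id").fetchall()
print(to_delete, kept)  # [4, 5] [(1, 100), (2, 200), (3, 300)]
```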

  • 2021-01-04 09:12

    Another variant, in some cases a bit faster than Marcus's and NJ73's answers:

    DELETE ourTable
    FROM ourTable JOIN
     (SELECT MIN(ID) AS ID, targetField  -- keep the lowest ID of each duplicate group
      FROM ourTable
      GROUP BY targetField HAVING COUNT(*) > 1) t2
    ON ourTable.targetField = t2.targetField AND ourTable.ID != t2.ID;
    

    Hope that helps someone. On big tables, Marcus's answer stalls.
