Find most recent duplicates ID with MySQL

蓝咒 提交于 2019-12-19 20:00:09

问题


I use to do

SELECT email, COUNT(email) AS occurences
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);

to find duplicates based on their email.

But now I'd need their ID to be able to define which one to remove exactly.

The second constraint is: I want only the LAST INSERTED duplicates.

So if there's 2 entries with test@test.com as an email and their IDs are respectively 40 and 12782 it would delete only the 12782 entry and keep the 40 one.

Any ideas on how I could do this? I've been mashing SQL for about a hour and can't seem to find exactly how to do this.

Thanks and have a nice day!


回答1:


Well, you sort of answer your question. You seem to want max(id):

SELECT email, COUNT(email) AS occurences, max(id)
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);

You can delete the others using the statement. Delete with join has a tricky syntax where you have to list the table name first and then specify the from clause with the join:

delete wineries
            from wineries join
            (select email, max(id) as maxid
             from wineries
             group by email
             having count(*) > 1
            ) we
            on we.email = wineries.email and
               wineries.id < we.maxid;

Or writing this as an exists clause:

delete from wineries
    where exists (select 1
                  from (select email, max(id) as maxid
                        from wineries
                        group by email
                       ) we
                  where we.email = wineries.email and wineries.id < we.maxid
                 )



回答2:


select email, max(id), COUNT(email) AS occurences
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);



回答3:


delete from wineries
where id not in
(
  select * from 
  ( 
     select min(id)
     from wineries
     group by email
  ) x
)

You need a subquery to trick MySQL to delete from a table it is selecting from at the same time.




回答4:


DELETE duplicates.*
FROM wineries
JOIN wineries AS duplicates USING (email)
WHERE duplicates.id < wineries.id;

play with it on sqlfiddle.com




回答5:


This is the simplest option:

DELETE FROM wineries
 WHERE id NOT IN
(
  SELECT MIN(id) id
    FROM wineries
GROUP BY email 
);

This will only keep the first inserted record for each email address, all other records will be deleted. Credit for this answer should go to @juergen d since this is just a revised version of his answer.



来源:https://stackoverflow.com/questions/17618415/find-most-recent-duplicates-id-with-mysql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!