mysql duplicate data deletion

问题

This shows me all the first names and last names that have exactly two entries that are identical

SELECT `firstname`,`lastname`,COUNT(*) AS Count 
FROM `people` 
GROUP BY `firstname`,`lastname`
HAVING Count = 2

How do I turn this into a DELETE FROM WHERE statement with a LIMIT to only remove one of each of the entries and leave the other one.

okay this appears to be way to technical i'm just going to do it in a php while loop

回答1:

You can create a table with 1 record of each of the duplicates: Then delete all the dup records from the people table and then re-insert the dup records.

-- Setup for example
create table people (fname varchar(10), lname varchar(10));

insert into people values ('Bob', 'Newhart');
insert into people values ('Bob', 'Newhart');
insert into people values ('Bill', 'Cosby');
insert into people values ('Jim', 'Gaffigan');
insert into people values ('Jim', 'Gaffigan');
insert into people values ('Adam', 'Sandler');

-- Show table with duplicates
select * from people;

-- Create table with one version of each duplicate record
create table dups as 
    select distinct fname, lname, count(*) 
    from people group by fname, lname 
    having count(*) > 1;

-- Delete all matching duplicate records
delete people from people inner join dups 
on people.fname = dups.fname AND 
   people.lname = dups.lname;

-- Insert single record of each dup back into table
insert into people select fname, lname from dups;

-- Show Fixed table
select * from people;

回答2:

if you have a primary key, such as id, you can do:

delete from people 
where id not in
(
      select minid from 
      (select min(id) as minid from people 
      group by firstname, lastname) as newtable
)

The subquery select min(id)... bit is getting you the unique (based on id) rows for a given firstname, lastname combination; and then you're deleting all other rows, i.e. your duplicates. You need to wrap your subquery due to a bug in mysql, otherwise we could do:

delete from people 
where id not in
(
      select min(id) as minid from people 
      group by firstname, lastname
)

better would be:

delete people from 
people left outer join
(
  select min(id) as minid from people 
  group by firstname, lastname
) people_grouped
on people.first_name = people_grouped.first_name
and people.last_name = people_grouped.last_name
and people_grouped.id is null

to avoid the subquery.

回答3:

Create a new table and add a unique key on (firstname,lastname). Then insert the rows in old table into the new table. Then rename the tables.

mysql> select * from t;
+-----------+----------+
| firstname | lastname |
+-----------+----------+
| A         | B        | 
| A         | B        | 
| X         | Y        | 
+-----------+----------+
3 rows in set (0.00 sec)

mysql> create table t2 like t;
Query OK, 0 rows affected (0.00 sec)

mysql> alter table t2 add unique key name(firstname,lastname);
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert ignore into t2 select * from t;
Query OK, 2 rows affected (0.00 sec)
Records: 3  Duplicates: 1  Warnings: 0


mysql> select * from t2;
+-----------+----------+
| firstname | lastname |
+-----------+----------+
| A         | B        | 
| X         | Y        | 
+-----------+----------+
2 rows in set (0.01 sec)

来源：https://stackoverflow.com/questions/2146938/mysql-duplicate-data-deletion

标签

mysql

select

duplicate-data