I have a table with the following fields:
id (Unique)
url (Unique)
title
company
site_id
Now, I need to remove rows having same titl
I keep visiting this page anytime I google "remove duplicates form mysql" but for my theIGNORE solutions don't work because I have an InnoDB mysql tables
this code works better anytime
CREATE TABLE tableToclean_temp LIKE tableToclean;
ALTER TABLE tableToclean_temp ADD UNIQUE INDEX (fontsinuse_id);
INSERT IGNORE INTO tableToclean_temp SELECT * FROM tableToclean;
DROP TABLE tableToclean;
RENAME TABLE tableToclean_temp TO tableToclean;
tableToclean = the name of the table you need to clean
tableToclean_temp = a temporary table created and deleted
If the IGNORE
statement won't work like in my case, you can use the below statement:
CREATE TABLE your_table_deduped LIKE your_table;
INSERT your_table_deduped
SELECT *
FROM your_table
GROUP BY index1_id,
index2_id;
RENAME TABLE your_table TO your_table_with_dupes;
RENAME TABLE your_table_deduped TO your_table;
#OPTIONAL
ALTER TABLE `your_table` ADD UNIQUE `unique_index` (`index1_id`, `index2_id`);
#OPTIONAL
DROP TABLE your_table_with_dupes;
A really easy way to do this is to add a UNIQUE
index on the 3 columns. When you write the ALTER
statement, include the IGNORE
keyword. Like so:
ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);
This will drop all the duplicate rows. As an added benefit, future INSERTs
that are duplicates will error out. As always, you may want to take a backup before running something like this...
I have this query snipet for SQLServer but I think It can be used in others DBMS with little changes:
DELETE
FROM Table
WHERE Table.idTable IN (
SELECT MAX(idTable)
FROM idTable
GROUP BY field1, field2, field3
HAVING COUNT(*) > 1)
I forgot to tell you that this query doesn't remove the row with the lowest id of the duplicated rows. If this works for you try this query:
DELETE
FROM jobs
WHERE jobs.id IN (
SELECT MAX(id)
FROM jobs
GROUP BY site_id, company, title, location
HAVING COUNT(*) > 1)
This solution will move the duplicates into one table and the uniques into another.
-- speed up creating uniques table if dealing with many rows
CREATE INDEX temp_idx ON jobs(site_id, company, title, location);
-- create the table with unique rows
INSERT jobs_uniques SELECT * FROM
(
SELECT *
FROM jobs
GROUP BY site_id, company, title, location
HAVING count(1) > 1
UNION
SELECT *
FROM jobs
GROUP BY site_id, company, title, location
HAVING count(1) = 1
) x
-- create the table with duplicate rows
INSERT jobs_dupes
SELECT *
FROM jobs
WHERE id NOT IN
(SELECT id FROM jobs_uniques)
-- confirm the difference between uniques and dupes tables
SELECT COUNT(1)
AS jobs,
(SELECT COUNT(1) FROM jobs_dupes) + (SELECT COUNT(1) FROM jobs_uniques)
AS sum
FROM jobs
I have a table which forget to add a primary key in the id row. Though is has auto_increment on the id. But one day, one stuff replay the mysql bin log on the database which insert some duplicate rows.
I remove the duplicate row by
select T1.* from table_name T1 inner join (select count(*) as c,id from table_name group by id) T2 on T1.id = T2.id where T2.c > 1 group by T1.id;
delete the duplicate rows by id
insert the row from the exported data.
Then add the primary key on id