I have a table with the following fields:
id (Unique)
url (Unique)
title
company
site_id
Now, I need to remove rows having the same title
There is another solution:
DELETE t1 FROM my_table t1, my_table t2 WHERE t1.id < t2.id AND t1.my_field = t2.my_field AND t1.my_field_2 = t2.my_field_2 AND ...
As of version 8.0 (2018), MySQL finally supports window functions.
Window functions are both handy and efficient. Here is a solution that uses them to solve this problem.
In a subquery, we can use ROW_NUMBER() to assign a position to each record in the table within column1/column2 groups, ordered by id. If there are no duplicates, the record gets row number 1. If duplicates exist, they are numbered by ascending id (starting at 1).
Once records are properly numbered in the subquery, the outer query just deletes all records whose row number is not 1.
Query:
DELETE FROM tablename
WHERE id IN (
SELECT id
FROM (
SELECT
id,
ROW_NUMBER() OVER(PARTITION BY column1, column2 ORDER BY id) rn
FROM tablename
) t
WHERE rn > 1
)
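Before running the DELETE, it can help to look at which rows would be removed. The following is only a sketch under the same assumptions as the query above (a table called tablename with columns id, column1, column2, on MySQL 8.0 or later); it selects the rows numbered 2 and higher without changing anything.

```sql
-- Preview: list the rows the DELETE above would remove (rn > 1).
-- Read-only; nothing is modified. ROW_NUMBER() needs MySQL 8.0+.
SELECT id, column1, column2, rn
FROM (
    SELECT
        id,
        column1,
        column2,
        ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS rn
    FROM tablename
) t
WHERE rn > 1
ORDER BY column1, column2, id;
```

If the list looks right, the same subquery can be fed to the DELETE. On a transactional engine such as InnoDB, wrapping the DELETE in START TRANSACTION ... ROLLBACK is another way to do a dry run.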
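A simple approach using a temporary table. Each pass removes one row (the highest id) per duplicate group, so if a group can hold more than two copies, drop _temp_duplicates and repeat until the DELETE affects no rows:
CREATE TEMPORARY TABLE IF NOT EXISTS _temp_duplicates AS (SELECT MAX(dub.id) AS id FROM table_with_duplications dub GROUP BY dub.field_must_be_uniq_1, dub.field_must_be_uniq_2 HAVING COUNT(*) > 1);
DELETE FROM table_with_duplications WHERE id IN (SELECT id FROM _temp_duplicates);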
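To check whether any duplicate groups remain after the delete, a query along these lines could be run. It reuses the table and column names from the statements above and is meant as a quick sanity check, not as part of the original answer.

```sql
-- Lists every (field_must_be_uniq_1, field_must_be_uniq_2) combination
-- that still occurs more than once, with the number of copies.
-- An empty result means no duplicates are left.
SELECT
    field_must_be_uniq_1,
    field_must_be_uniq_2,
    COUNT(*) AS copies
FROM table_with_duplications
GROUP BY field_must_be_uniq_1, field_must_be_uniq_2
HAVING COUNT(*) > 1;
```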
Delete duplicate rows with the DELETE JOIN statement:
DELETE t1 FROM table_name t1
JOIN table_name t2
WHERE
t1.id < t2.id AND
t1.title = t2.title AND t1.company = t2.company AND t1.site_id = t2.site_id;
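Because the statement above deletes every row for which a row with a higher id and the same values exists, it keeps the highest id in each group. If the oldest row (lowest id) should survive instead, one possible variant is to flip the comparison. This is a sketch using the same table and column names, with the matching columns moved into an ON clause for readability:

```sql
-- Variant: keep the LOWEST id per (title, company, site_id) group.
-- A row is deleted whenever another row with the same values
-- and a smaller id exists.
DELETE t1 FROM table_name t1
JOIN table_name t2
    ON  t1.title   = t2.title
    AND t1.company = t2.company
    AND t1.site_id = t2.site_id
WHERE t1.id > t2.id;
```

Note that plain = never matches NULL values, so rows with NULL in any of the compared columns are left alone by both versions.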
If you don't want to alter the column properties, then you can use the query below.
Since you have a column which has unique IDs (e.g., auto_increment columns), you can use it to remove the duplicates:
DELETE `a`
FROM
`jobs` AS `a`,
`jobs` AS `b`
WHERE
-- IMPORTANT: Ensures one version remains
-- Change "ID" to your unique column's name
`a`.`ID` < `b`.`ID`
-- Any duplicates you want to check for
AND (`a`.`title` = `b`.`title` OR `a`.`title` IS NULL AND `b`.`title` IS NULL)
AND (`a`.`company` = `b`.`company` OR `a`.`company` IS NULL AND `b`.`company` IS NULL)
AND (`a`.`site_id` = `b`.`site_id` OR `a`.`site_id` IS NULL AND `b`.`site_id` IS NULL);
In MySQL, you can simplify it even more with the NULL-safe equal operator (aka "spaceship operator"):
DELETE `a`
FROM
`jobs` AS `a`,
`jobs` AS `b`
WHERE
-- IMPORTANT: Ensures one version remains
-- Change "ID" to your unique column's name
`a`.`ID` < `b`.`ID`
-- Any duplicates you want to check for
AND `a`.`title` <=> `b`.`title`
AND `a`.`company` <=> `b`.`company`
AND `a`.`site_id` <=> `b`.`site_id`;
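The difference between = and <=> only shows up when NULL is involved. A small standalone comparison, not tied to any table, illustrates why the second query treats two NULL values as duplicates:

```sql
-- Plain equality yields NULL (unknown) when either side is NULL,
-- while the NULL-safe operator always yields 1 or 0.
SELECT
    NULL =   NULL AS plain_equal,      -- NULL
    NULL <=> NULL AS null_safe_equal,  -- 1
    1    <=> NULL AS one_vs_null,      -- 0
    1    <=> 1    AS one_vs_one;       -- 1
```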
Delete duplicate rows using the DELETE JOIN statement
MySQL provides the DELETE JOIN statement, which you can use to remove duplicate rows quickly.
The following statement deletes duplicate rows and keeps the highest id:
DELETE t1 FROM contacts t1
INNER JOIN
contacts t2 WHERE
t1.id < t2.id AND t1.email = t2.email;
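To see the effect on a small data set, the statement can be tried on a throwaway table first. The table name contacts_demo and the sample addresses below are made up for illustration; only the DELETE itself comes from the answer above.

```sql
-- Hypothetical demo table; dropped first so the script can be rerun.
DROP TABLE IF EXISTS contacts_demo;
CREATE TABLE contacts_demo (
    id    INT PRIMARY KEY,
    email VARCHAR(255) NOT NULL
);

INSERT INTO contacts_demo (id, email) VALUES
    (1, 'a@example.com'),
    (2, 'b@example.com'),
    (3, 'a@example.com'),
    (4, 'a@example.com');

-- Same statement as above, pointed at the demo table.
-- Rows 1 and 3 are deleted because a row with the same email
-- and a higher id exists for each of them.
DELETE t1 FROM contacts_demo t1
INNER JOIN contacts_demo t2
WHERE t1.id < t2.id AND t1.email = t2.email;

-- Remaining rows: id 2 (b@example.com) and id 4 (a@example.com).
SELECT id, email FROM contacts_demo ORDER BY id;
```

A regular table is used here rather than a TEMPORARY one because MySQL does not allow a temporary table to be referenced twice in the same statement, which the self-join requires.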