Remove duplicate rows in MySQL

囚心锁ツ 2020-11-21 04:33

I have a table with the following fields:

id (Unique)
url (Unique)
title
company
site_id

Now, I need to remove rows having same titl

25 answers
  • 2020-11-21 05:06

    Another solution is a self-join delete, which keeps the row with the highest id in each group of duplicates:

    DELETE t1 FROM my_table t1, my_table t2 WHERE t1.id < t2.id AND t1.my_field = t2.my_field AND t1.my_field_2 = t2.my_field_2 AND ...
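
    Before running a delete like this, it can help to list the rows it would remove. The following is a minimal sketch of such a dry run, assuming only the two columns named above are compared; any further columns would be added to the predicate in the same way.

    ```sql
    -- Dry run: show the rows the DELETE would remove, without changing anything.
    -- Assumption: my_field and my_field_2 are the only columns that define a duplicate.
    SELECT DISTINCT t1.*
    FROM my_table t1
    JOIN my_table t2
      ON  t1.id < t2.id
      AND t1.my_field   = t2.my_field
      AND t1.my_field_2 = t2.my_field_2;
    ```

    DISTINCT is there because a row with several higher-id duplicates would otherwise appear once per match.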
    
  • 2020-11-21 05:07

    As of version 8.0 (2018), MySQL finally supports window functions.

    Window functions are both handy and efficient. Here is a solution that uses them to solve this problem.

    In a subquery, ROW_NUMBER() assigns a position to each record within its column1/column2 group, ordered by id. A record with no duplicates gets row number 1. Where duplicates exist, they are numbered by ascending id, starting at 1, so the lowest id in each group gets row number 1.

    Once the records are numbered in the subquery, the outer query deletes every record whose row number is greater than 1, which keeps the lowest id in each group.

    Query:

    DELETE FROM tablename
    WHERE id IN (
        SELECT id
        FROM (
            SELECT 
                id, 
                ROW_NUMBER() OVER(PARTITION BY column1, column2 ORDER BY id) AS rn
            FROM tablename
        ) t
        WHERE rn > 1
    );
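
    To see how the numbering behaves, here is a small sketch on a throwaway table. The table name demo and its sample rows are made up for illustration; it needs MySQL 8.0 or later.

    ```sql
    -- Hypothetical sample data: ids 1, 2 and 4 share the same column1/column2 values.
    CREATE TEMPORARY TABLE demo (
        id      INT PRIMARY KEY,
        column1 VARCHAR(20),
        column2 VARCHAR(20)
    );
    INSERT INTO demo (id, column1, column2) VALUES
        (1, 'a', 'x'), (2, 'a', 'x'), (3, 'b', 'x'), (4, 'a', 'x');

    -- Number the rows inside each column1/column2 group by ascending id.
    SELECT id, column1, column2,
           ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS rn
    FROM demo
    ORDER BY id;
    -- id 1 -> rn 1, id 2 -> rn 2, id 3 -> rn 1, id 4 -> rn 3
    ```

    Only ids 2 and 4 have rn > 1, so those are the rows a delete on rn > 1 would remove.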
    
  • 2020-11-21 05:08

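    A simple approach that also handles groups with more than two copies: store the id to keep for each group in a temporary table, then delete everything else.

    -- Collect the id to keep (the lowest) for each group of duplicates.
    CREATE TEMPORARY TABLE IF NOT EXISTS _temp_keep_ids AS (SELECT MIN(dub.id) AS id FROM table_with_duplications dub GROUP BY dub.field_must_be_uniq_1, dub.field_must_be_uniq_2);
    
    -- Delete every row whose id is not in the keep list.
    DELETE FROM table_with_duplications WHERE id NOT IN (SELECT id FROM _temp_keep_ids);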
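
    Once the table no longer contains duplicates, a unique index can stop new ones from being inserted. This is only a sketch: the index name uq_must_be_uniq is hypothetical, and whether an index fits depends on the column types and lengths involved.

    ```sql
    -- Assumption: all duplicates have already been removed. If any remain,
    -- this statement fails with a duplicate-entry error.
    ALTER TABLE table_with_duplications
        ADD UNIQUE INDEX uq_must_be_uniq (field_must_be_uniq_1, field_must_be_uniq_2);
    ```

    Note that a unique index in MySQL still allows several rows in which one of the indexed columns is NULL.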
    
  • 2020-11-21 05:08

    Delete duplicate rows with the DELETE JOIN statement:

    DELETE t1 FROM table_name t1
    JOIN table_name t2
    WHERE
        t1.id < t2.id AND
        t1.title = t2.title AND t1.company = t2.company AND t1.site_id = t2.site_id;
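
    A quick way to check the result is to count the groups that still have more than one row, before and after the delete. This sketch reuses the table_name placeholder and the three columns from the statement above.

    ```sql
    -- Each returned row is a (title, company, site_id) combination that occurs more than once.
    SELECT title, company, site_id, COUNT(*) AS copies
    FROM table_name
    GROUP BY title, company, site_id
    HAVING COUNT(*) > 1;
    ```

    GROUP BY puts NULL values into the same group, while the = comparisons in the DELETE never match NULL, so groups containing NULLs may still show up here after the delete.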
    
  • 2020-11-21 05:10

    If you don't want to alter the column properties, then you can use the query below.

    Since you have a column which has unique IDs (e.g., auto_increment columns), you can use it to remove the duplicates:

    DELETE `a`
    FROM
        `jobs` AS `a`,
        `jobs` AS `b`
    WHERE
        -- IMPORTANT: Ensures one version remains
        -- Change "ID" to your unique column's name
        `a`.`ID` < `b`.`ID`
    
        -- Any duplicates you want to check for
        AND (`a`.`title` = `b`.`title` OR `a`.`title` IS NULL AND `b`.`title` IS NULL)
        AND (`a`.`company` = `b`.`company` OR `a`.`company` IS NULL AND `b`.`company` IS NULL)
        AND (`a`.`site_id` = `b`.`site_id` OR `a`.`site_id` IS NULL AND `b`.`site_id` IS NULL);
    

    In MySQL, you can simplify it even more with the NULL-safe equal operator (aka "spaceship operator"):

    DELETE `a`
    FROM
        `jobs` AS `a`,
        `jobs` AS `b`
    WHERE
        -- IMPORTANT: Ensures one version remains
        -- Change "ID" to your unique column's name
        `a`.`ID` < `b`.`ID`
    
        -- Any duplicates you want to check for
        AND `a`.`title` <=> `b`.`title`
        AND `a`.`company` <=> `b`.`company`
        AND `a`.`site_id` <=> `b`.`site_id`;
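
    The difference between = and <=> is easiest to see on literal values. The column aliases below are only labels for this illustration.

    ```sql
    SELECT NULL = NULL   AS plain_equal,   -- NULL (unknown)
           NULL <=> NULL AS null_safe,     -- 1
           1 <=> NULL    AS one_vs_null,   -- 0
           1 <=> 1       AS one_vs_one;    -- 1
    ```

    Because <=> always returns 0 or 1, two rows that both have NULL in a column are treated as duplicates without the extra IS NULL checks.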
    
  • 2020-11-21 05:12

    MySQL provides the DELETE JOIN statement, which you can use to remove duplicate rows quickly.

    The following statement deletes duplicate rows and keeps the highest id:

    DELETE t1 FROM contacts t1
    INNER JOIN contacts t2
    WHERE t1.id < t2.id
      AND t1.email = t2.email;
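
    If you would rather keep the lowest id in each group, the same statement should work with the comparison flipped. A sketch, assuming the same contacts table:

    ```sql
    -- Delete t1 whenever another row with the same email has a smaller id,
    -- so only the lowest id per email survives.
    DELETE t1 FROM contacts t1
    INNER JOIN contacts t2
        ON  t1.email = t2.email
        AND t1.id > t2.id;
    ```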
    