How to remove duplicate items in MySQL with a dataset of 20 million rows?

旧时难觅i 2021-01-15 21:51

I've got a big MySQL database and need to delete the duplicate rows quickly. Here's how it looks:

id | text1 | text2
1  | 23    | 43
2  | 23    |


        
3 answers
  • 2021-01-15 22:17
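    -- The extra derived table avoids MySQL error 1093 (target table read in its own subquery)
    DELETE FROM t WHERE id NOT IN
    (SELECT keep_id FROM
      (SELECT MIN(id) AS keep_id FROM t GROUP BY text1, text2) AS k)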
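
    The same "keep the lowest id per (text1, text2)" rule can also be written as a self-join delete. This is only a sketch, assuming the table is named t as above and id is unique; the aliases t1 and t2 are illustrative, and it has not been timed on a 20-million-row table:

    ```sql
    -- Sketch only: assumes table t with a unique id column.
    -- Deletes every row for which a matching row with a smaller id exists.
    -- <=> is MySQL's NULL-safe equality, so rows with NULL in text1/text2 also match.
    -- An index on (text1, text2) is likely needed for this to be usable on millions of rows.
    DELETE t1
    FROM t AS t1
    JOIN t AS t2
      ON  t1.text1 <=> t2.text1
      AND t1.text2 <=> t2.text2
      AND t1.id > t2.id;
    ```

    It is probably worth trying this on a copy of the table first and comparing the row counts before and after.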
    
  • 2021-01-15 22:26

    You may try this:

    ALTER IGNORE TABLE my_tablename ADD UNIQUE INDEX idx_name (text1 , text2);
    

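    That is, add a UNIQUE index on the two columns; with IGNORE, MySQL drops the rows that would violate it while altering the table. Note that the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4, so this only works on older versions.

    This has the added advantage that duplicate rows can no longer be inserted into the table in the future.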
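
    On MySQL versions where ALTER IGNORE TABLE is not available, a similar effect can probably be had by copying the rows into a new table that already carries the unique index. A minimal sketch, assuming the table is my_tablename as above and that text1 and text2 are of a type that can be indexed directly; t_dedup, uq_text and my_tablename_old are hypothetical names:

    ```sql
    -- Sketch only: t_dedup, uq_text and my_tablename_old are made-up names.
    CREATE TABLE t_dedup LIKE my_tablename;
    ALTER TABLE t_dedup ADD UNIQUE INDEX uq_text (text1, text2);

    -- INSERT IGNORE skips rows that would violate the unique index,
    -- so the first row seen for each (text1, text2) pair is kept.
    INSERT IGNORE INTO t_dedup
    SELECT * FROM my_tablename ORDER BY id;

    -- Swap the tables in one statement; keep the old one until the result is checked.
    RENAME TABLE my_tablename TO my_tablename_old,
                 t_dedup TO my_tablename;
    ```

    One caveat: a UNIQUE index treats NULL values as distinct, so rows with NULL in text1 or text2 would not be deduplicated this way.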

  • 2021-01-15 22:33

    Run this:

    SELECT COUNT(*), text1, text2
    FROM t  -- replace t with your table name
    GROUP BY text1, text2
    HAVING COUNT(*) > 1;
    

    When you find rows here, delete one row for each match, and then run it again.

    I'm not sure what it will be like in terms of performance - perhaps it doesn't matter, if you do this offline?
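
    The "delete one row for each match, then run it again" step could be sketched as a single statement that is repeated until it affects no rows. This assumes a table named t with a unique id column; the alias d and the column drop_id are illustrative:

    ```sql
    -- Sketch only: assumes table t with a unique id column.
    -- Each run removes the highest-id row from every (text1, text2) group
    -- that still has more than one row. Repeat until 0 rows are affected.
    DELETE t
    FROM t
    JOIN (
        SELECT MAX(id) AS drop_id
        FROM t
        GROUP BY text1, text2
        HAVING COUNT(*) > 1
    ) AS d ON t.id = d.drop_id;
    ```

    Each pass removes at most one row per duplicate group, so a group with n copies needs n - 1 passes.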
