Can I delete database duplicates based on multiple columns?

死守一世寂寞 2021-02-13 18:54

A while back I asked how to delete duplicate records based on a single column. The answer worked great:

delete from tbl
where id NOT in
(
select  min(id)
fro         


        
2 answers
  • 2021-02-13 19:36

    Try this one. I created a table tblA with three columns.

    CREATE TABLE tblA
    (
    id int IDENTITY(1, 1),
    colA int, 
    colB int, 
    colC int
    )
    

    And added some duplicate values.

    INSERT INTO tblA VALUES (1, 2, 3)
    INSERT INTO tblA VALUES (1, 2, 3)
    INSERT INTO tblA VALUES (4, 5, 6)
    INSERT INTO tblA VALUES (7, 8, 9)
    INSERT INTO tblA VALUES (7, 8, 9)
    

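    The statement below returns the lowest id in each group of rows that share the same colA, colB and colC values, which is the row to keep in each duplicated group. Deleting the other rows in those groups gives you the multi-column delete.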

    SELECT MIN(d.id) AS id
    FROM
    (
    SELECT COUNT(*) as aantal, a.colA, a.colB, a.colC
    FROM tblA       a
    INNER JOIN tblA b   ON b.ColA = a.ColA
                        AND b.ColB = a.ColB
                        AND b.ColC = a.ColC
    GROUP BY a.id, a.colA, a.colB, a.colC
    HAVING COUNT(*) > 1
    ) c
    INNER JOIN tblA d ON d.ColA = c.ColA
                        AND d.ColB = c.ColB
                        AND d.ColC = c.ColC
    GROUP BY d.colA, d.colB, d.colC
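
    For reference, a minimal sketch of the delete itself, assuming the tblA table above and that the row with the lowest id in each group is the one to keep. It extends the MIN(id) approach from the question by grouping on all three columns; it is one possible way to do this, not necessarily what was intended above.

    ```sql
    -- Sketch only: keeps the lowest id per (colA, colB, colC) combination
    -- and deletes every other row. Assumes id is unique and never NULL.
    DELETE FROM tblA
    WHERE id NOT IN
    (
        SELECT MIN(id)
        FROM tblA
        GROUP BY colA, colB, colC
    );
    ```

    With the five sample rows inserted above, this removes ids 2 and 5 and leaves ids 1, 3 and 4.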
    
  • 2021-02-13 19:40

    This shows the rows you want to keep:

    ;WITH x AS 
    (
      SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
          (PARTITION BY col1, col2, col3 ORDER BY id)
      FROM dbo.tbl
    )
    SELECT col1, col2, col3 FROM x WHERE rn = 1;
    

    This shows the rows you want to delete:

    ;WITH x AS 
    (
      SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
          (PARTITION BY col1, col2, col3 ORDER BY id)
      FROM dbo.tbl
    )
    SELECT col1, col2, col3 FROM x WHERE rn > 1;
    

    And once you're happy that the above two sets are correct, the following will actually delete them:

    ;WITH x AS 
    (
      SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
          (PARTITION BY col1, col2, col3 ORDER BY id)
      FROM dbo.tbl
    )
    DELETE x WHERE rn > 1;
    

    Note that the first 6 lines (the CTE) are identical in all three queries; only the statement that follows the CTE changes.
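
    If you would like a safety net while checking the result, one option is to run the delete inside a transaction. This is a sketch that assumes SQL Server and the same dbo.tbl, col1, col2, col3 and id names used above; the duplicate check in the middle is an added illustration, not part of the queries above.

    ```sql
    BEGIN TRANSACTION;

    -- Same CTE as above: number the rows within each group of duplicates.
    WITH x AS
    (
      SELECT col1, col2, col3, rn = ROW_NUMBER() OVER
          (PARTITION BY col1, col2, col3 ORDER BY id)
      FROM dbo.tbl
    )
    DELETE x WHERE rn > 1;

    -- Should return no rows if every duplicate group is down to one row.
    SELECT col1, col2, col3, COUNT(*) AS cnt
    FROM dbo.tbl
    GROUP BY col1, col2, col3
    HAVING COUNT(*) > 1;

    -- Undo the delete while testing; switch to COMMIT TRANSACTION once verified.
    ROLLBACK TRANSACTION;
    ```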
