How can I delete duplicate rows in a table?

情歌与酒 2020-12-08 22:30

I have a table with, say, 3 columns. There's no primary key, so there can be duplicate rows. I need to keep just one copy of each row and delete the others. Any idea how to do this in SQL Server?

13 answers
  • 2020-12-08 23:19

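    I'd SELECT DISTINCT the rows and throw them into a temporary table, then empty the source table and copy the data back from the temp table. EDIT: now with code snippet!

    -- Copy one instance of each row into a temp table (SELECT ... INTO creates it)
    SELECT DISTINCT * INTO #TABLE_2 FROM TABLE_1
    GO
    -- Empty the source table
    DELETE FROM TABLE_1
    GO
    -- Copy the de-duplicated rows back
    INSERT INTO TABLE_1
    SELECT * FROM #TABLE_2
    GO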
    
  • 2020-12-08 23:20

    After you clean up the current mess, you could add a primary key that includes all the fields in the table. That will keep you from getting into the same mess again. Of course, this could very well break existing code that relies on inserting duplicates, and that will have to be handled as well.
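
    A minimal sketch of such a key, assuming a hypothetical table dbo.MyTable with columns col1, col2 and col3 (none of these names come from the question). It only succeeds once the duplicates are gone and every column in the key is declared NOT NULL:

    ```sql
    -- Hypothetical names: dbo.MyTable, col1, col2, col3, PK_MyTable.
    -- Fails if duplicate rows still exist or if any key column allows NULL.
    ALTER TABLE dbo.MyTable
    ADD CONSTRAINT PK_MyTable PRIMARY KEY (col1, col2, col3);
    ```

    After this, an INSERT that would create a duplicate row raises a key violation instead of silently succeeding, which is the "break existing code" risk mentioned above.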

  • 2020-12-08 23:20

    I'm not sure if this works with DELETE statements, but this is a way to find duplicate rows:

     SELECT *
     FROM myTable t1, myTable t2
     WHERE t1.field = t2.field AND t1.id > t2.id
    

    I'm not sure if you can just change the "SELECT" to a "DELETE" (someone wanna let me know?), but even if you can't, you could just make it into a subquery.
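
    For what it's worth, T-SQL does accept a join in a DELETE if you name the alias to delete from. A sketch of the query above turned into a delete, assuming (as that query does) that the table has an id column that tells the copies apart, which the table in the question does not have yet:

    ```sql
    -- Deletes every row that has a matching row with a smaller id,
    -- so only the lowest-id row per "field" value survives.
    -- myTable, field and id are the names used in the SELECT above.
    DELETE t1
    FROM myTable AS t1
    JOIN myTable AS t2
      ON t1.field = t2.field
     AND t1.id > t2.id;
    ```

    Running the SELECT version first is a cheap way to check which rows would go.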

  • 2020-12-08 23:21

    What about this solution:

    First, execute the following query:

      select 'set rowcount ' + convert(varchar, COUNT(*) - 1)
           + ' delete from MyTable where field=''' + field + ''''
           + ' set rowcount 0'
      from mytable
      group by field
      having COUNT(*) > 1
    

    Then you just have to execute the statements it returns:

    set rowcount 3 delete from Mytable where field='foo' set rowcount 0
    ....
    ....
    set rowcount 5 delete from Mytable where field='bar' set rowcount 0
    

    I've handled the case where you've got only one column, but it's pretty easy to adapt the same approach to more than one column. Let me know if you want me to post the code.
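
    A variant sketch of the same generator that emits DELETE TOP (n) instead of SET ROWCOUNT, since Microsoft's documentation advises against relying on SET ROWCOUNT for DELETE, INSERT and UPDATE and points to TOP instead. It assumes field is a non-NULL character column, as the query above already does; the REPLACE call is an addition that escapes single quotes inside the values:

    ```sql
    -- Emits one statement per duplicated value, e.g.:
    --   DELETE TOP (2) FROM MyTable WHERE field = 'foo';
    -- COUNT(*) - 1 leaves exactly one row per value.
    SELECT 'DELETE TOP (' + CONVERT(varchar(10), COUNT(*) - 1)
         + ') FROM MyTable WHERE field = '''
         + REPLACE(field, '''', '''''') + ''';'
    FROM MyTable
    GROUP BY field
    HAVING COUNT(*) > 1;
    ```

    As with the original, you run the generated statements as a second step.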

  • 2020-12-08 23:23

    Here's another way, with test data

    create table #table1 (colWithDupes1 int, colWithDupes2 int)
    insert into #table1
    (colWithDupes1, colWithDupes2)
    Select 1, 2 union all
    Select 1, 2 union all
    Select 2, 2 union all
    Select 3, 4 union all
    Select 3, 4 union all
    Select 3, 4 union all
    Select 4, 2 union all
    Select 4, 2 
    
    
    select * from #table1
    
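    -- Limit each statement to one row; "select 1" primes @@rowcount to 1
    set rowcount 1
    select 1

    -- Delete one duplicate per pass until a pass deletes nothing
    while @@rowcount > 0
        delete #table1
        where 1 < (select count(*)
                   from #table1 a2
                   where #table1.colWithDupes1 = a2.colWithDupes1
                     and #table1.colWithDupes2 = a2.colWithDupes2)

    -- Restore the default (no row limit)
    set rowcount 0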
    
    select * from #table1
    
  • 2020-12-08 23:27

    Add an identity column to act as a surrogate primary key, and use it to pick out the extra copies of each row to delete (for example, two of three identical rows).

    I would consider leaving the identity column in place afterwards, or if this is some kind of link table, create a compound primary key on the other columns.
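
    A rough sketch of that idea, assuming a hypothetical table dbo.MyTable with columns col1, col2 and col3 and a new column called row_id (all of these names are made up for illustration). The GO separators matter: the DELETE refers to row_id, so it has to be in a later batch than the ALTER TABLE that creates the column.

    ```sql
    -- Hypothetical names: dbo.MyTable, col1, col2, col3, row_id.
    -- Step 1: give every existing row a unique number.
    ALTER TABLE dbo.MyTable ADD row_id INT IDENTITY(1,1) NOT NULL;
    GO
    -- Step 2: keep the lowest row_id in each group of identical rows
    -- and delete the rest.
    DELETE FROM dbo.MyTable
    WHERE row_id NOT IN (
        SELECT MIN(row_id)
        FROM dbo.MyTable
        GROUP BY col1, col2, col3
    );
    GO
    -- Optional: remove the helper column again if you don't want to keep it.
    -- ALTER TABLE dbo.MyTable DROP COLUMN row_id;
    ```

    GROUP BY treats NULLs as equal, so rows that match on NULL values are collapsed too.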
