How to detect duplicate rows in a SQL Server table?

前端 未结 5 600
渐次进展
渐次进展 2021-02-02 07:03

What is the most efficient way to detect duplicates in a 10 column / 50K row table? I\'m using MSSQL 8.0

相关标签:
5条回答
  • 2021-02-02 07:14

    Try this

    Select * From Table
    Group By [List all fields in the Table here]
    Having Count(*) > 1
    
    0 讨论(0)
  • 2021-02-02 07:22

    To show an example of what others have been describing:

    SELECT
        Col1, -- All of the columns you want to dedupe on
        Col2, -- which is not neccesarily all of the columns
        Col3, -- in the table
        Col4,
        Col5,
        Col6,
        Col7,
        Col8,
        Col9,
        Col10
    FROM
        MyTable
    GROUP BY
        Col1,
        Col2,
        Col3,
        Col4,
        Col5,
        Col6,
        Col7,
        Col8,
        Col9,
        Col10
    HAVING
        COUNT(*) > 1
    
    0 讨论(0)
  • 2021-02-02 07:29

    To detect, just group by as Guge said.

    select fieldA, fieldB, count(*) from table
    group by fieldA, fieldB
    having count(*) > 1
    

    If you want to delete dupes... pseudo....

    select distinct into a temp table
    truncate original table
    select temp table back into original table
    

    With truncate you may run into problems if you have FK constraints, so be smart about dropping constraints and making sure you don't orphan records.

    0 讨论(0)
  • 2021-02-02 07:30

    You can use group by on all columns and then count(*)>1

    0 讨论(0)
  • 2021-02-02 07:30

    In addition to the suggestions provided, I would then go to the effort of preventing duplicates in the future, rather than trying to locate them later.

    This is done using unique indexes on columns (or groups of columns) that are supposed to be unique. Remember that data in the database can be modified from other locations other than through the specific app that you are working on, so it's best to define what is and isn't allowed in a table at the DB level.

    0 讨论(0)
提交回复
热议问题