How to delete completely duplicate rows

孤街浪徒 2021-01-18 05:31

Say I have duplicate rows in my table, and my database design is admittedly third-rate:

Insert Into tblProduct (ProductId,ProductName,Description,Category) Valu         


        
4 Answers
  • 2021-01-18 05:39

    First, use SELECT ... INTO to copy the distinct rows into a new table:

    SELECT DISTINCT ProductID, ProductName, Description, Category
        INTO tblProductClean
        FROM tblProduct
    

    Then drop the original table and, if you want to keep the old name, rename tblProductClean to tblProduct.
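A minimal sketch of the whole copy-and-swap sequence, run against an in-memory SQLite database through Python's `sqlite3` module rather than SQL Server (an assumption, made only so the example is self-contained). SQLite has no `SELECT ... INTO`, so `CREATE TABLE ... AS SELECT DISTINCT` stands in for it, and on SQL Server the rename would go through `sp_rename` instead of `ALTER TABLE ... RENAME TO`. The sample rows and the helper name `dedupe_by_copy` are illustrative, not the asker's data or code.

```python
import sqlite3


def dedupe_by_copy(conn):
    """Copy the distinct rows to a new table, drop the original, rename the copy."""
    conn.executescript("""
        CREATE TABLE tblProductClean AS
            SELECT DISTINCT ProductId, ProductName, Description, Category
            FROM tblProduct;
        DROP TABLE tblProduct;
        ALTER TABLE tblProductClean RENAME TO tblProduct;
    """)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblProduct ("
    "ProductId INTEGER, ProductName TEXT, Description TEXT, Category TEXT)"
)
# Assumed sample data: two products entered twice, one entered once.
rows = [
    (1, "Cinthol", "cosmetic soap", "soap"),
    (1, "Cinthol", "cosmetic soap", "soap"),
    (1, "Lux", "cosmetic soap", "soap"),
    (1, "Lux", "cosmetic soap", "soap"),
    (2, "Cinthol", "nice soap", "soap"),
]
conn.executemany("INSERT INTO tblProduct VALUES (?, ?, ?, ?)", rows)

dedupe_by_copy(conn)

remaining = conn.execute(
    "SELECT ProductId, ProductName, Description, Category "
    "FROM tblProduct ORDER BY ProductId, ProductName"
).fetchall()
for row in remaining:
    print(row)
```

Keep in mind that a table created this way starts without the original's indexes, constraints, or triggers (the same is true of `SELECT ... INTO` on SQL Server), so anything the old table had needs to be recreated after the swap.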

  • 2021-01-18 05:46
    DELETE tblProduct 
    FROM tblProduct 
    LEFT OUTER JOIN (
       SELECT MIN(ProductId) as ProductId, ProductName, Description, Category
       FROM tblProduct 
       GROUP BY ProductName, Description, Category
    ) as KeepRows ON
       tblProduct.ProductId= KeepRows.ProductId
    WHERE
       KeepRows.ProductId IS NULL
    

    Taken from the question "How can I remove duplicate rows?"

    UPDATE:

    This only works if ProductId is a primary key (which it is not here). You are better off using @marc_s's method, but I'll leave this up in case someone with a primary key comes across this post.
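To see why the primary-key caveat matters, here is a small sketch that runs the same keep-the-minimum-id logic against rows whose ProductId is not unique. It uses an in-memory SQLite database via Python's `sqlite3` (an assumption for the sake of a runnable example). SQLite does not accept the `DELETE ... FROM ... JOIN` form, so the join is rewritten as an equivalent `NOT IN` subquery; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblProduct ("
    "ProductId INTEGER, ProductName TEXT, Description TEXT, Category TEXT)"
)
# Assumed sample data: exact duplicates that also share the same ProductId.
rows = [
    (1, "Cinthol", "cosmetic soap", "soap"),
    (1, "Cinthol", "cosmetic soap", "soap"),
    (1, "Lux", "cosmetic soap", "soap"),
    (1, "Lux", "cosmetic soap", "soap"),
    (2, "Cinthol", "nice soap", "soap"),
]
conn.executemany("INSERT INTO tblProduct VALUES (?, ?, ?, ?)", rows)

# Delete every row whose ProductId is not the minimum id of some group.
# Every copy of a duplicated row carries that minimum id, so all copies survive.
deleted = conn.execute("""
    DELETE FROM tblProduct
    WHERE ProductId NOT IN (
        SELECT MIN(ProductId)
        FROM tblProduct
        GROUP BY ProductName, Description, Category
    )
""").rowcount

remaining_count = conn.execute("SELECT COUNT(*) FROM tblProduct").fetchone()[0]
print("deleted:", deleted, "remaining:", remaining_count)
```

Because the filter compares only ProductId, it cannot tell one copy of a row from another when the copies share an id; with a unique key, each extra copy would have a larger id than the minimum and would be removed.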

  • 2021-01-18 05:51

    Try this - it will delete all duplicates from your table:

    ;WITH duplicates AS
    (
        SELECT 
           ProductID, ProductName, Description, Category,
           ROW_NUMBER() OVER (PARTITION BY ProductID, ProductName, Description, Category
                              ORDER BY ProductID) AS RowNum
        FROM dbo.tblProduct
    )
    DELETE FROM duplicates
    WHERE RowNum > 1
    GO
    
    SELECT * FROM dbo.tblProduct
    GO
    

    Your duplicates should be gone now. The output is:

    ProductID   ProductName     Description        Category
       1          Cinthol         cosmetic soap      soap
       1          Lux             cosmetic soap      soap
       1          Crowning Glory  cosmetic soap      soap
       2          Cinthol         nice soap          soap
       3          Lux             nice soap          soap
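One way to check a result like the one above is to count the groups of fully identical rows before and after the delete. The sketch below does that on an in-memory SQLite database through Python's `sqlite3` (an assumption; it is not SQL Server). SQLite cannot delete through a CTE, so the `ROW_NUMBER()` filter is applied via the built-in `rowid` instead, which needs SQLite 3.25 or later for window functions. The helper names and the duplicated sample rows are hypothetical; only the five distinct rows come from the output shown.

```python
import sqlite3

COLUMNS = "ProductId, ProductName, Description, Category"


def duplicate_groups(conn):
    """Return one row per fully identical record that occurs more than once."""
    return conn.execute(
        f"SELECT {COLUMNS}, COUNT(*) FROM tblProduct "
        f"GROUP BY {COLUMNS} HAVING COUNT(*) > 1"
    ).fetchall()


def delete_duplicates(conn):
    """Number the copies of each record and delete all but the first one."""
    return conn.execute(f"""
        DELETE FROM tblProduct
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (PARTITION BY {COLUMNS}
                                          ORDER BY rowid) AS RowNum
                FROM tblProduct
            )
            WHERE RowNum > 1
        )
    """).rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblProduct ("
    "ProductId INTEGER, ProductName TEXT, Description TEXT, Category TEXT)"
)
# Assumed input: the five distinct rows plus four extra copies.
rows = (
    [(1, "Cinthol", "cosmetic soap", "soap")] * 3
    + [(1, "Lux", "cosmetic soap", "soap")] * 2
    + [(1, "Crowning Glory", "cosmetic soap", "soap")]
    + [(2, "Cinthol", "nice soap", "soap")] * 2
    + [(3, "Lux", "nice soap", "soap")]
)
conn.executemany("INSERT INTO tblProduct VALUES (?, ?, ?, ?)", rows)

groups_before = duplicate_groups(conn)
deleted = delete_duplicates(conn)
groups_after = duplicate_groups(conn)
total_after = conn.execute("SELECT COUNT(*) FROM tblProduct").fetchone()[0]
print(len(groups_before), deleted, len(groups_after), total_after)
```

The `GROUP BY ... HAVING COUNT(*) > 1` check is plain SQL and should carry over to SQL Server unchanged; an empty result after the delete means no fully identical rows remain.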
    
  • 2021-01-18 05:53

    I had to do this a few weeks ago. Which version of SQL Server are you using? In SQL Server 2005 and later, you can use ROW_NUMBER() in your SELECT and keep only the rows where the row number is 1. I forget the exact syntax, but it's well documented; something along the lines of:

    Select t0.ProductID, 
           t0.ProductName, 
           t0.Description, 
           t0.Category
    Into   tblCleanData
    From   (
        Select ProductID, 
               ProductName, 
               Description, 
               Category, 
               Row_Number() Over (
                   Partition By ProductID, 
                                ProductName, 
                                Description, 
                                Category
                   Order By     ProductID,
                                ProductName,
                                Description,
                                Category
               ) As RowNumber
        From   tblProduct
    ) As t0
    Where t0.RowNumber = 1
    

    Check out http://msdn.microsoft.com/en-us/library/ms186734.aspx; that should get you going in the right direction.
