I have a SQL Server database that I pre-loaded with a ton of rows of data.
Unfortunately, there is no primary key in the database, and there is now duplicate informati
Well, this is one reason why you should have a primary key on the table. What version of SQL Server? For SQL Server 2005 and above:
;WITH r AS
(
  SELECT col1, col2, col3, -- whatever columns make a "unique" row
    rn = ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY col1)
  FROM dbo.SomeTable
)
DELETE r WHERE rn > 1; -- keeps one row per group and deletes the extra copies
Then, so you don't have to do this again tomorrow, and the next day, and the day after that, declare a primary key on the table.
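As a rough sketch of that last step, assuming the same placeholder table and columns as above (dbo.SomeTable, col1, col2, col3) and a hypothetical constraint name PK_SomeTable, the key could be declared once the duplicates are gone:

```sql
-- Sketch only: dbo.SomeTable and col1..col3 are the placeholders used above,
-- and PK_SomeTable is a made-up constraint name.
-- Assumes all three columns are already declared NOT NULL; SQL Server
-- rejects a PRIMARY KEY on nullable columns, and the statement also fails
-- if any duplicate rows are still present.
ALTER TABLE dbo.SomeTable
  ADD CONSTRAINT PK_SomeTable PRIMARY KEY (col1, col2, col3);
```

If the columns have to stay nullable, a UNIQUE constraint on the same columns is one alternative, though it allows only a single row per combination that includes NULL.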
Let's say each row in your table is supposed to be unique by COL1 and COL2.
Here is a way to do it:
SELECT *
FROM (SELECT COL1, COL2,
             ROW_NUMBER() OVER (PARTITION BY COL1, COL2 ORDER BY COL1, COL2) AS ROWID
      FROM TABLE_NAME) T
WHERE T.ROWID > 1;
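The ROWID > 1 filter returns only the extra copies: within each group of rows that share the same COL1 and COL2, one row gets ROWID 1 and is skipped, and every additional row is returned.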
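If you would rather see which key combinations are duplicated and how many times, a GROUP BY works too. This is a sketch using the same placeholder names (TABLE_NAME, COL1, COL2); DuplicateCount is just an illustrative alias:

```sql
-- Sketch only: TABLE_NAME, COL1 and COL2 are the placeholders from above.
-- Returns one row per duplicated (COL1, COL2) combination, with the number
-- of rows that share it. Combinations that appear once are filtered out.
SELECT COL1, COL2, COUNT(*) AS DuplicateCount
FROM TABLE_NAME
GROUP BY COL1, COL2
HAVING COUNT(*) > 1;
```

Unlike the ROW_NUMBER() query, this does not return the individual duplicate rows, so it is better suited to inspecting the problem than to deleting anything.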