sql select 3 columns and dedupe on two columns

问题

I have a job setup that currently selects records from a table that does not contain a unique index. I realize this could be solved by just putting an index on the table and the relevant columns but, in this scenario for testing purposes, I need to remove the index and then do a select which will also remove duplicates based on 2 columns:

SELECT DISTINCT [author], [pubDate], [dateadded]
FROM [Feeds].[dbo].[socialPosts]
WHERE CAST(FLOOR(CAST(dateadded AS float)) AS datetime) > 
                               DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE() - 2), 0)  
AND CAST(FLOOR(CAST(dateadded AS float)) AS datetime) < 
                               DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)

This selects all records from the day before and I want to dedupe the records based on author and pubdate. This could be a post select or done prior but the idea is to find out if it can be done within a select.

回答1:

You can use a GROUP BY and any aggregate function on the dateadded column to get unique author, pubdate results.

SELECT  [author]
        ,[pubDate]
        ,MAX([dateadded])
 FROM   [Feeds].[dbo].[socialPosts]
 WHERE  CAST(FLOOR(CAST(dateadded AS float)) AS datetime) >  dateadd(day,datediff(day, 0, getdate()-2), 0)  
        AND CAST(FLOOR(CAST(dateadded AS float)) AS datetime) < dateadd(day,datediff(day, 0, getDate()), 0)
 GROUP BY 
        [author]
        , [pubdate]

来源：https://stackoverflow.com/questions/11402025/sql-select-3-columns-and-dedupe-on-two-columns

标签

sql

sql-server

sql-server-2008

duplicate-removal

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!