Remove duplicates in SQL Result set of ONE table

Submitted by 狂风中的少年 on 2019-12-11 09:57:15

Question


Afternoon/Evening all,

I'm looking for the final touches to the query below. I need to remove rows that repeat the same values in particular columns. I'm currently using this SQL:

SELECT CBNEW.*
FROM CallbackNewID CBNEW 
INNER JOIN (SELECT IDNEW, MAX(CallbackDate) AS MaxDate 
FROM CallbackNewID 
GROUP BY IDNEW) AS groupedCBNEW 
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) AND (CBNEW.IDNEW = groupedCBNEW.IDNEW);

My result set looks like the below

ID      RecID  Comp  Rem  Date_                IDNEW  IDOLD  CB?  CallbackDate
138618  83209  1     0    2012-03-16 12:40:00  83209  83209  2    16-Mar-12
138619  83209  1     0    2012-03-16 12:40:00  83209  83209  2    16-Mar-12
110470  83799  1     0    2011-07-27 11:46:00  83799  83799  10   27-Jul-11
110471  83799  1     0    2011-07-27 11:46:00  83799  83799  10   27-Jul-11

This, however, gives me duplicate values in the CallbackDate and IDNEW columns, because the table contains rows with different primary keys but the same IDNEW and CallbackDate values.

If I dump this result into Excel, I can just use remove duplicates on the first ID column, and the problem's solved.

But what I want to do is make sure my result only includes the FIRST instance of the ID column, where IDNEW and CallbackDate are duplicated.

I'm sure I just need to append a tiny piece of SQL, but I haven't been able to find the answer so far.

Your help is very much appreciated.


Answer 1:


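Try adding MIN(ID) to the inner query and then adding it to the ON clause as well. Note that MIN(ID) is computed over all rows for an IDNEW, so a group is returned only when its lowest ID is also a row with the latest CallbackDate: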

SELECT CBNEW.*
FROM CallbackNewID CBNEW 
INNER JOIN (SELECT IDNEW, MIN(ID) AS MinId, MAX(CallbackDate) AS MaxDate 
FROM CallbackNewID 
GROUP BY IDNEW) AS groupedCBNEW 
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) 
   AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
   AND (CBNEW.ID = groupedCBNEW.MinId) ;
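
To see how the MinId condition behaves when a group's lowest ID is not on its latest date, here is a small sketch that runs the query above through Python's built-in sqlite3 module (not the asker's database). The sample rows, in particular the older 100001 row, the ISO-formatted text dates, and the FirstID alias are assumptions made for illustration. It also tries a variant that takes MIN(ID) after joining on the latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CallbackNewID ("
    "ID INTEGER PRIMARY KEY, IDNEW INTEGER, CallbackDate TEXT)"
)
conn.executemany(
    "INSERT INTO CallbackNewID VALUES (?, ?, ?)",
    [
        (138618, 83209, "2012-03-16"),
        (138619, 83209, "2012-03-16"),
        (110470, 83799, "2011-07-27"),
        (110471, 83799, "2011-07-27"),
        # Hypothetical older callback: lowest ID for 83799, but not the latest date.
        (100001, 83799, "2011-01-05"),
    ],
)

# The query from the answer, selecting only ID to keep the output short.
min_id_rows = conn.execute(
    """
    SELECT CBNEW.ID
    FROM CallbackNewID CBNEW
    INNER JOIN (SELECT IDNEW, MIN(ID) AS MinId, MAX(CallbackDate) AS MaxDate
                FROM CallbackNewID
                GROUP BY IDNEW) AS groupedCBNEW
      ON CBNEW.CallbackDate = groupedCBNEW.MaxDate
     AND CBNEW.IDNEW = groupedCBNEW.IDNEW
     AND CBNEW.ID = groupedCBNEW.MinId
    ORDER BY CBNEW.ID
    """
).fetchall()

# Variant (an assumption, not part of the answer): join on the latest date
# first, then take MIN(ID) among the rows that survived the join.
first_id_rows = conn.execute(
    """
    SELECT MIN(CBNEW.ID) AS FirstID, CBNEW.IDNEW, CBNEW.CallbackDate
    FROM CallbackNewID CBNEW
    INNER JOIN (SELECT IDNEW, MAX(CallbackDate) AS MaxDate
                FROM CallbackNewID
                GROUP BY IDNEW) AS groupedCBNEW
      ON CBNEW.CallbackDate = groupedCBNEW.MaxDate
     AND CBNEW.IDNEW = groupedCBNEW.IDNEW
    GROUP BY CBNEW.IDNEW, CBNEW.CallbackDate
    ORDER BY FirstID
    """
).fetchall()

print(min_id_rows)    # [(138618,)]  (83799 drops out entirely)
print(first_id_rows)  # [(110470, 83799, '2011-07-27'), (138618, 83209, '2012-03-16')]
```

The variant returns only the key columns; joining its result back to CallbackNewID on FirstID would recover the full rows.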





Answer 2:


Here is a rather "brute force" approach. It just takes the results of your original query and does Min() on [ID], Max() on [Comp] and [Rem], and GROUP BY on everything else:

SELECT 
    Min(t.ID) AS MinOfID, 
    t.RecID, 
    Max(t.Comp) AS MaxOfComp, 
    Max(t.Rem) AS MaxOfRem, 
    t.Date_, 
    t.IDNEW, 
    t.IDOLD, 
    t.[CB?], 
    t.CallbackDate
FROM 
    (
        SELECT CBNEW.*
        FROM 
            CallbackNewID CBNEW 
            INNER JOIN 
            (
                SELECT IDNEW, MAX(CallbackDate) AS MaxDate 
                FROM CallbackNewID 
                GROUP BY IDNEW
            ) AS groupedCBNEW 
                ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) 
                AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
    ) t
GROUP BY 
    t.RecID, 
    t.Date_, 
    t.IDNEW, 
    t.IDOLD, 
    t.[CB?], 
    t.CallbackDate;

It might not be terribly elegant, but if it works....




Answer 3:


In MS SQL Server, I think you are looking for the ROW_NUMBER() function.

Something like this should help you get what you are looking for:

SELECT
    X.*
FROM
    (
        SELECT
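            CBNEW.*,
            -- ORDER BY is required by ROW_NUMBER(); ordering by ID keeps the lowest ID as row 1
            ROW_NUMBER() OVER (PARTITION BY CBNEW.IDNEW, CBNEW.CallbackDate ORDER BY CBNEW.ID) AS [row_num]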
        FROM
            CallbackNewID CBNEW 
            INNER JOIN 
            (
                SELECT
                    IDNEW,
                    MAX(CallbackDate) AS MaxDate
                FROM
                    CallbackNewID 
                GROUP BY
                    IDNEW
            ) AS groupedCBNEW ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
    ) X
WHERE
    X.row_num = 1



Answer 4:


SELECT
    A.*
FROM
    (SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY IDNEW ORDER BY CallbackDate DESC, ID)
                          AS [row_num]
     FROM CallbackNewID 
    ) A
WHERE
    A.row_num = 1
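
The window-function approach can be tried without a SQL Server instance. The sketch below is an illustration only: it uses Python's built-in sqlite3 module, which assumes the bundled SQLite is version 3.25 or later for ROW_NUMBER() support. The sample rows and ISO-formatted text dates are made up for the example. ID is added to the ORDER BY as a tie-breaker so that the lowest ID wins when several rows share the latest CallbackDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CallbackNewID ("
    "ID INTEGER PRIMARY KEY, IDNEW INTEGER, CallbackDate TEXT)"
)
sample_rows = [
    (138618, 83209, "2012-03-16"),
    (138619, 83209, "2012-03-16"),
    (110470, 83799, "2011-07-27"),
    (110471, 83799, "2011-07-27"),
    (100001, 83799, "2011-01-05"),  # hypothetical older callback
]
conn.executemany("INSERT INTO CallbackNewID VALUES (?, ?, ?)", sample_rows)

# Number the rows within each IDNEW, latest date first and lowest ID first
# among ties, then keep only the first row of each partition.
latest_rows = conn.execute(
    """
    SELECT A.ID, A.IDNEW, A.CallbackDate
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY IDNEW
                                    ORDER BY CallbackDate DESC, ID ASC) AS row_num
          FROM CallbackNewID) A
    WHERE A.row_num = 1
    ORDER BY A.IDNEW
    """
).fetchall()

print(latest_rows)  # [(138618, 83209, '2012-03-16'), (110470, 83799, '2011-07-27')]
```

Because the numbering is done in a single pass over the table, no self-join on MAX(CallbackDate) is needed.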


Source: https://stackoverflow.com/questions/19774273/remove-duplicates-in-sql-result-set-of-one-table
