Get only latest row, grouped by a column

醉酒成梦 2021-01-23 03:27

I have a large dataset of sent emails and their status codes.

ID Recipient           Date       Status
 1 someone@example.com 01/01/2010      1
 2 someone@example.com         


        
5 Answers
  •  时光取名叫无心
    2021-01-23 04:09

    This is an example of a 'max per group' query. I think it is easiest to understand by splitting it up into two subqueries and then joining the results.

    The first subquery is what you already have.

    The second subquery uses the window function ROW_NUMBER to number each recipient's emails, starting at 1 for the most recent, then 2, 3, and so on.

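    The results of the first subquery are then joined to the rows of the second subquery whose row number is 1, i.e. the most recent email. This guarantees exactly one row per recipient even when several emails tie for the latest date, because ROW_NUMBER never assigns the same number twice within a partition. Which of the tied rows gets number 1 is arbitrary unless the ORDER BY includes a tie-breaker.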

    Here is the query:

    SELECT T1.Recipient, T1.EmailCount, T2.Status
    FROM
    (
        -- Total number of emails per recipient
        SELECT Recipient, COUNT(*) AS EmailCount
        FROM Messages
        GROUP BY Recipient
    ) T1
    JOIN
    (
        -- Number each recipient's emails, 1 = most recent
        SELECT
            Recipient,
            Status,
            ROW_NUMBER() OVER (PARTITION BY Recipient ORDER BY Date DESC) AS rn
        FROM Messages
    ) T2
    -- Keep only the most recent email for each recipient
    ON T1.Recipient = T2.Recipient AND T2.rn = 1
    

    This gives the following results:

    Recipient            EmailCount  Status  
    others@example.com   2           2       
    someone@example.com  2           1       
    them@example.com     3           1       
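
To try the query end to end, here is a minimal sketch that runs it against an in-memory SQLite database from Python. Several things here are assumptions and not part of the answer above: the sample rows are invented (the table in the question is truncated), the helper name `latest_status_per_recipient` is hypothetical, dates are stored as ISO-format text so they sort chronologically, and `ID DESC` is added to the window's ORDER BY as a tie-breaker so the result is deterministic. It also assumes the SQLite library bundled with Python is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical sample rows (ID, Recipient, Date, Status), invented for illustration.
# Rows 4 and 5 deliberately share the same latest date to show the tie case.
ROWS = [
    (1, "someone@example.com", "2010-01-01", 1),
    (2, "someone@example.com", "2010-01-02", 2),
    (3, "them@example.com", "2010-01-01", 1),
    (4, "them@example.com", "2010-01-03", 1),
    (5, "them@example.com", "2010-01-03", 2),
]

# Same shape as the query in the answer; "ID DESC" is an added tie-breaker
# and the outer ORDER BY only makes the output order predictable.
QUERY = """
SELECT T1.Recipient, T1.EmailCount, T2.Status
FROM (
    SELECT Recipient, COUNT(*) AS EmailCount
    FROM Messages
    GROUP BY Recipient
) T1
JOIN (
    SELECT Recipient, Status,
           ROW_NUMBER() OVER (
               PARTITION BY Recipient
               ORDER BY Date DESC, ID DESC
           ) AS rn
    FROM Messages
) T2
  ON T1.Recipient = T2.Recipient AND T2.rn = 1
ORDER BY T1.Recipient
"""


def latest_status_per_recipient(rows):
    """Load rows into an in-memory Messages table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Messages ("
            "ID INTEGER PRIMARY KEY, Recipient TEXT, Date TEXT, Status INTEGER)"
        )
        conn.executemany("INSERT INTO Messages VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_status_per_recipient(ROWS):
        print(row)
```

With these invented rows, them@example.com has two emails on its latest date, yet the query still returns a single row for that recipient, and the tie-breaker picks the one with the higher ID.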
    
