Retrieving last record in each group from database with additional max() condition in MSSQL

问题

This is a follow-up question to Retrieving last record in each group from database - SQL Server 2005/2008

In the answers, this example was provided to retrieve last record for a group of parameters (example below retrieves last updates for each value in computername):

 select t.*
 from t
 where t.lastupdate = (select max(t2.lastupdate)
                  from t t2
                  where t2.computername = t.computername
                 );

In my case, however, "lastupdate" is not unique (some updates come in batches and have same lastupdate value, and if two updates of "computername" come in the same batch, you will get non-unique output for "computername + lastupdate"). Suppose I also have field "rowId" that is just auto-incremental. The mitigation would be to include in the query another criterion for a max('rowId') field.

NB: while the example employs time-specific name "lastupdate", the actual selection criteria may not be related to the time at all.

I, therefore, like to ask, what would be the most performant query that selects the last record in each group based both on "group-defining parameter" (in the case above, "computername") and on maximal rowId?

回答1:

If you don't have uniqueness, then row_number() is simpler:

 select t.*
 from (select t.*,
              row_number() over (partition by computername order by lastupdate, rowid desc) as seqnum
       from t
      ) t
where seqnum = 1;

With the right indexes, the correlated subquery is usually faster. However, the performance difference is not that great.

来源：https://stackoverflow.com/questions/50467068/retrieving-last-record-in-each-group-from-database-with-additional-max-conditi

标签

sql

sql-server

sqlperformance