Get top 1 row of each group

余生分开走 2020-11-21 04:42

I have a table from which I want to get the latest entry for each group. Here's the table:

DocumentStatusLogs Table

|ID| DocumentID | Status         


        
20 answers
  •  北恋 (original poster)
    2020-11-21 05:25

    Here are three separate approaches to the problem, along with suggested indexing choices for each query. Please try the indexes yourself and compare the logical reads, elapsed time, and execution plans; these suggestions come from my experience with similar queries and were not tested against this specific problem.
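
    To compare logical reads and elapsed time as suggested above, one option is to turn on the session statistics before running each query. This is only a sketch: it assumes SQL Server and a client such as SSMS, where the figures show up in the messages output rather than in the result grid.

    ```sql
    -- Report logical reads per table and CPU/elapsed time per statement
    -- for everything that runs afterwards in this session.
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- ... run one of the candidate queries here ...

    -- Switch the reporting off again once the measurements are taken.
    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    ```

    SET STATISTICS XML ON can be used in the same way when the actual execution plan is wanted alongside the numbers.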

    Approach 1: Using ROW_NUMBER(). If a rowstore index does not improve performance, try a nonclustered or clustered columnstore index. For queries with aggregation and grouping, and for tables that are ordered by different columns from one query to the next, a columnstore index is usually the best choice.

    ;WITH CTE AS
    (
        SELECT  *,
                RN = ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC)
        FROM    DocumentStatusLogs
    )
    SELECT  ID
        ,DocumentID
        ,Status
        ,DateCreated
    FROM    CTE
    WHERE   RN = 1;
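
    The columnstore suggestion for Approach 1 could look roughly like the statement below. The index name is hypothetical, and the column list assumes the table holds only the four columns the query selects, with data types that columnstore indexes support.

    ```sql
    -- Hypothetical index name. The column list is an assumption based on
    -- the columns referenced by the query (ID, DocumentID, Status, DateCreated).
    CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_DocumentStatusLogs
        ON DocumentStatusLogs (ID, DocumentID, Status, DateCreated);
    ```

    On SQL Server 2012 and 2014 a nonclustered columnstore index makes the table read-only, so this variant is probably only practical on SQL Server 2016 or later.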
    

    Approach 2: Using FIRST_VALUE. The window functions return a value for every row, so DISTINCT is needed to collapse the result to one row per DocumentID. As with Approach 1, if a rowstore index does not improve performance, try a nonclustered or clustered columnstore index; for queries with aggregation and grouping, and for tables that are ordered by different columns from one query to the next, a columnstore index is usually the best choice.

    SELECT  DISTINCT
        ID      = FIRST_VALUE(ID) OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC)
        ,DocumentID
        ,Status     = FIRST_VALUE(Status) OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC)
        ,DateCreated    = FIRST_VALUE(DateCreated) OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC)
    FROM    DocumentStatusLogs;
    

    Approach 3: Using CROSS APPLY. A rowstore index on the DocumentStatusLogs table that covers the columns used in the query should be enough here, with no need for a columnstore index.

    SELECT  DISTINCT
        ID      = CA.ID
        ,DocumentID = D.DocumentID
        ,Status     = CA.Status 
        ,DateCreated    = CA.DateCreated
    FROM    DocumentStatusLogs D
        CROSS APPLY (
                SELECT  TOP 1 I.*
                FROM    DocumentStatusLogs I
                WHERE   I.DocumentID = D.DocumentID
                ORDER   BY I.DateCreated DESC
                ) CA;
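
    A covering rowstore index for Approach 3 might be sketched as follows. The index name is hypothetical, and the choice of key and included columns is an assumption derived from the query: the key matches the DocumentID filter and the DateCreated DESC ordering of the TOP 1 lookup, and the remaining selected columns are included so the base table does not have to be read.

    ```sql
    -- Hypothetical index name. Key columns follow the inner query's
    -- WHERE (DocumentID) and ORDER BY (DateCreated DESC) clauses.
    CREATE NONCLUSTERED INDEX IX_DocumentStatusLogs_DocumentID_DateCreated
        ON DocumentStatusLogs (DocumentID, DateCreated DESC)
        INCLUDE (ID, Status);  -- remaining columns returned by the query
    ```

    If ID is already the clustered index key it is carried in the nonclustered index anyway, so listing it in INCLUDE is harmless but not required.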
    
