Trouble using ROW_NUMBER() OVER (PARTITION BY …)

后端 未结 3 724
花落未央
花落未央 2021-02-02 14:43

I\'m using SQL Server 2008 R2. I have table called EmployeeHistory with the following structure and sample data:

EmployeeID Date      DepartmentID SupervisorID
1         


        
3条回答
  •  终归单人心
    2021-02-02 15:28

    A bit involved. Easiest would be to refer to this SQL Fiddle I created for you that produces the exact result. There are ways you can improve it for performance or other considerations, but this should hopefully at least be clearer than some alternatives.

    The gist is, you get a canonical ranking of your data first, then use that to segment the data into groups, then find an end date for each group, then eliminate any intermediate rows. ROW_NUMBER() and CROSS APPLY help a lot in doing it readably.


    EDIT 2019:

    The SQL Fiddle does in fact seem to be broken, for some reason, but it appears to be a problem on the SQL Fiddle site. Here's a complete version, tested just now on SQL Server 2016:

    CREATE TABLE Source
    (
      EmployeeID int,
      DateStarted date,
      DepartmentID int
    )
    
    INSERT INTO Source
    VALUES
    (10001,'2013-01-01',001),
    (10001,'2013-09-09',001),
    (10001,'2013-12-01',002),
    (10001,'2014-05-01',002),
    (10001,'2014-10-01',001),
    (10001,'2014-12-01',001)
    
    
    SELECT *, 
      ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateStarted) AS EntryRank,
      newid() as GroupKey,
      CAST(NULL AS date) AS EndDate
    INTO #RankedData
    FROM Source
    ;
    
    UPDATE #RankedData
    SET GroupKey = beginDate.GroupKey
    FROM #RankedData sup
      CROSS APPLY 
      (
        SELECT TOP 1 GroupKey
        FROM #RankedData sub 
        WHERE sub.EmployeeID = sup.EmployeeID AND
          sub.DepartmentID = sup.DepartmentID AND
          NOT EXISTS 
            (
              SELECT * 
              FROM #RankedData bot 
              WHERE bot.EmployeeID = sup.EmployeeID AND
                bot.EntryRank BETWEEN sub.EntryRank AND sup.EntryRank AND
                bot.DepartmentID <> sup.DepartmentID
            )
          ORDER BY DateStarted ASC
        ) beginDate (GroupKey);
    
    UPDATE #RankedData
    SET EndDate = nextGroup.DateStarted
    FROM #RankedData sup
      CROSS APPLY 
      (
        SELECT TOP 1 DateStarted
        FROM #RankedData sub
        WHERE sub.EmployeeID = sup.EmployeeID AND
          sub.DepartmentID <> sup.DepartmentID AND
          sub.EntryRank > sup.EntryRank
        ORDER BY EntryRank ASC
      ) nextGroup (DateStarted);
    
    SELECT * FROM 
    (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY GroupKey ORDER BY EntryRank ASC) AS GroupRank FROM #RankedData
    ) FinalRanking
    WHERE GroupRank = 1
    ORDER BY EntryRank;
    
    DROP TABLE #RankedData
    DROP TABLE Source
    

提交回复
热议问题