How do Recursive CTEs work in SQL Server?

最后都变了 - submitted on 2020-02-02 11:25:28

Question


Can anyone help me understand how this recursive CTE works?

WITH
RECURSIVECTE (EMPID, FULLNAME, MANAGERID, [ORGLEVEL]) AS
    (SELECT EMPID,
            FULLNAME,
            MANAGERID,
            1
     FROM RECURSIVETBL
     WHERE MANAGERID IS NULL
     UNION ALL
     SELECT A.EMPID,
            A.FULLNAME,
            A.MANAGERID,
            B.[ORGLEVEL] + 1
     FROM RECURSIVETBL A
          JOIN RECURSIVECTE B ON A.MANAGERID = B.EMPID)
SELECT *
FROM RECURSIVECTE;

Answer 1:


Recursive CTEs in SQL Server have 2 parts:

The Anchor: The starting point of your recursion. It's a set that will be further expanded by recursive joins.

SELECT 
    EMPID,
    FULLNAME,
    MANAGERID,
    1 AS ORGLEVEL
FROM 
    RECURSIVETBL
WHERE 
    MANAGERID IS NULL

This fetches all employees that don't have a manager (the top bosses, or the roots of a tree relationship).

The recursion: Linked with a UNION ALL, this set has to reference the CTE being declared (which is what makes it recursive). Think of it as how you will expand the result of the anchor with the next level.

UNION ALL

SELECT 
    A.EMPID,
    A.FULLNAME,
    A.MANAGERID,
    B.[ORGLEVEL] + 1
FROM 
    RECURSIVETBL A
    JOIN RECURSIVECTE B  -- Notice that we are referencing "RECURSIVECTE" which is the CTE we are declaring
    ON A.MANAGERID = B.EMPID

In this example, the first iteration takes the anchor result set (all employees with no manager) and joins it with RECURSIVETBL through MANAGERID, so A.EMPID holds the employees who report to the previously selected managers. This joining repeats as long as the last result set generates new rows.

There are a few limitations on what you can put in the recursive part (no grouping or nested recursion, for example). Also, since it's preceded by a UNION ALL, its rules apply as well (the number of columns and their data types must match).
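The data type rule is the one that tends to bite first, because SQL Server expects the anchor and the recursive part to agree on the exact type, including length. A minimal sketch, using a hypothetical Numbers CTE that is not part of the question, where both parts are cast to the same VARCHAR(100):

```sql
-- Hypothetical example: builds 1..5 plus a comma-separated label.
WITH Numbers (N, Label) AS
(
    -- Anchor: fixes the column types (INT, VARCHAR(100)).
    SELECT 1, CAST('1' AS VARCHAR(100))

    UNION ALL

    -- Recursive part: the concatenation would otherwise produce a longer
    -- VARCHAR than the anchor, so it is cast back to VARCHAR(100).
    SELECT N + 1,
           CAST(Label + ',' + CAST(N + 1 AS VARCHAR(10)) AS VARCHAR(100))
    FROM Numbers
    WHERE N < 5
)
SELECT N, Label
FROM Numbers;
```

If the outer CAST in the recursive part is removed, SQL Server should reject the query with a "types don't match between the anchor and the recursive part" error.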

About the ORGLEVEL, it starts with the anchor set at 1 (it's hard-coded there). When it's further expanded on the recursion set, it fetches the previous set (the anchor, on the first iteration) and adds 1, as its expression is B.[ORGLEVEL] + 1 with B being the previous set. This means that it starts with 1 (the top bosses) and keeps adding 1 for each descendant, thus representing all the levels of the organization.

When you find an employee at ORGLEVEL = 3, it means they have 2 managers above them (their manager and their manager's manager).
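To see the level arithmetic on concrete rows, here is a small self-contained sketch of the question's query. The three sample rows (Alice, Bob, Carol) and the MANAGERSABOVE column are assumptions made up for illustration, and RECURSIVETBL is defined inline from VALUES instead of being a real table:

```sql
WITH RECURSIVETBL (EMPID, FULLNAME, MANAGERID) AS
(
    -- Hypothetical sample data: Alice manages Bob, Bob manages Carol.
    SELECT V.EMPID, V.FULLNAME, V.MANAGERID
    FROM (VALUES
        (1, 'Alice', NULL),
        (2, 'Bob',   1),
        (3, 'Carol', 2)
    ) AS V (EMPID, FULLNAME, MANAGERID)
),
RECURSIVECTE (EMPID, FULLNAME, MANAGERID, [ORGLEVEL]) AS
(
    -- Anchor: employees without a manager start at level 1.
    SELECT EMPID, FULLNAME, MANAGERID, 1
    FROM RECURSIVETBL
    WHERE MANAGERID IS NULL

    UNION ALL

    -- Recursive part: each direct report is one level below its manager.
    SELECT A.EMPID, A.FULLNAME, A.MANAGERID, B.[ORGLEVEL] + 1
    FROM RECURSIVETBL A
         JOIN RECURSIVECTE B ON A.MANAGERID = B.EMPID
)
SELECT FULLNAME,
       [ORGLEVEL],
       [ORGLEVEL] - 1 AS MANAGERSABOVE -- managers between this row and the top
FROM RECURSIVECTE
ORDER BY [ORGLEVEL];
```

With this sample data, Carol comes out at ORGLEVEL 3 with 2 managers above her.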


Step by step with working example

Let's follow this example:

EmployeeID  ManagerID
1           NULL
2           1
3           1
4           2
5           2
6           1
7           6
8           6
9           NULL
10          3
11          3
12          10
13          9
14          9
15          13

  1. Anchor: Employees without managers (ManagerID IS NULL). This starts with all the top bosses of your company. It's crucial to note that if the anchor set is empty, the whole recursive CTE will be empty, as there is no starting point for the recursive part to join to.

    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = NULL, -- Always null by WHERE filter
        HierarchyLevel = 1,
        HierarchyRoute = CONVERT(VARCHAR(MAX), E.EmployeeID)
    FROM
        Employee AS E
    WHERE
        E.ManagerID IS NULL
    

Which are these:

EmployeeID  ManagerID   HierarchyLevel  HierarchyRoute
1           (null)      1               1
9           (null)      1               9

  2. Recursion N°1: Using this UNION ALL recursion:

    UNION ALL
    
    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = E.ManagerID,
        HierarchyLevel = R.HierarchyLevel + 1,
        HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
    FROM
        RecursiveCTE AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
    

For this INNER JOIN, RecursiveCTE has 2 rows (the anchor set), with employee IDs 1 and 9. So this JOIN actually returns this result:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14

See how the HierarchyRoute starts with 1 and 9 and moves to each descendant? We also increased HierarchyLevel by 1.

Because the results are linked by a UNION ALL, at this point we have the following results (step 1 + 2):

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
1               1           (null)      1
1               9           (null)      9
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14

Here is the tricky part: for each of the following iterations, recursive references to RecursiveCTE will only contain the last iteration's result set, not the accumulated set. This means that for the next iteration, RecursiveCTE will represent these rows:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14

  3. Recursion N°2: Following the same recursive expression...

    UNION ALL
    
    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = E.ManagerID,
        HierarchyLevel = R.HierarchyLevel + 1,
        HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
    FROM
        RecursiveCTE AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
    

Considering that in this step RecursiveCTE only holds rows with HierarchyLevel = 2, the result of this JOIN is the following (level 3!):

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
3               4           2           1 -> 2 -> 4
3               5           2           1 -> 2 -> 5
3               7           6           1 -> 6 -> 7
3               8           6           1 -> 6 -> 8
3               10          3           1 -> 3 -> 10
3               11          3           1 -> 3 -> 11
3               15          13          9 -> 13 -> 15

This set (and only this!) will be used in the following recursive step as RecursiveCTE, and it will be added to the accumulated grand total, which is now:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
1               1           (null)      1
1               9           (null)      9
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14
3               4           2           1 -> 2 -> 4
3               5           2           1 -> 2 -> 5
3               7           6           1 -> 6 -> 7
3               8           6           1 -> 6 -> 8
3               10          3           1 -> 3 -> 10
3               11          3           1 -> 3 -> 11
3               15          13          9 -> 13 -> 15

  4. Recursion N°3: Starting with the level 3 rows in our working set, the result of the join is:

    HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
    4               12          10          1 -> 3 -> 10 -> 12
    

This becomes our working set for the next recursive step.

  5. Recursion N°4: Starting with the only level 4 row from the previous step, the join yields no rows (no employee has EmployeeID 12 as ManagerID). Returning no rows marks the end of the iterations.

The final result set stands tall:

HierarchyLevel  EmployeeID  ManagerID   HierarchyRoute
1               1           (null)      1
1               9           (null)      9
2               2           1           1 -> 2
2               3           1           1 -> 3
2               6           1           1 -> 6
2               13          9           9 -> 13
2               14          9           9 -> 14
3               4           2           1 -> 2 -> 4
3               5           2           1 -> 2 -> 5
3               7           6           1 -> 6 -> 7
3               8           6           1 -> 6 -> 8
3               10          3           1 -> 3 -> 10
3               11          3           1 -> 3 -> 11
3               15          13          9 -> 13 -> 15
4               12          10          1 -> 3 -> 10 -> 12
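The iterations above stop only because a step eventually returns no rows. If the data contained a cycle (an employee who is, directly or indirectly, their own manager), that would never happen, so SQL Server caps the number of recursion levels at 100 by default and raises an error when the cap is reached. The cap can be changed per query with the MAXRECURSION hint. A minimal sketch, using a hypothetical Numbers CTE that is not part of the example above and needs more than 100 levels:

```sql
-- Hypothetical example: counting to 500 needs 499 recursive steps,
-- which exceeds the default limit of 100.
WITH Numbers (N) AS
(
    SELECT 1

    UNION ALL

    SELECT N + 1
    FROM Numbers
    WHERE N < 500
)
SELECT COUNT(*) AS TotalRows
FROM Numbers
OPTION (MAXRECURSION 500); -- valid range is 0 to 32767; 0 removes the limit
```

Without the OPTION clause this query should fail with a "maximum recursion 100 has been exhausted" error. For hierarchy data, hitting that error is usually a sign of a cycle rather than a reason to raise the limit.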

Here is the complete code:

CREATE TABLE Employee (EmployeeID INT, ManagerID INT);

INSERT INTO Employee (EmployeeID, ManagerID)
VALUES
  (1, NULL),
  (2, 1),
  (3, 1),
  (4, 2),
  (5, 2),
  (6, 1),
  (7, 6),
  (8, 6),
  (9, NULL),
  (10, 3),
  (11, 3),
  (12, 10),
  (13, 9),
  (14, 9),
  (15, 13); -- the statement before WITH must end with a semicolon

WITH RecursiveCTE AS
(
    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = NULL, -- Always null by WHERE filter
        HierarchyLevel = 1,
        HierarchyRoute = CONVERT(VARCHAR(MAX), E.EmployeeID)
    FROM
        Employee AS E
    WHERE
        E.ManagerID IS NULL

    UNION ALL

    SELECT
        EmployeeID = E.EmployeeID,
        ManagerID = E.ManagerID,
        HierarchyLevel = R.HierarchyLevel + 1,
        HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
    FROM
        RecursiveCTE AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
)
SELECT
    R.HierarchyLevel,
    R.EmployeeID,
    R.ManagerID,
    R.HierarchyRoute
FROM
    RecursiveCTE AS R
ORDER BY
    R.HierarchyLevel,
    R.EmployeeID
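To make the "only the last iteration is visible" rule concrete, the same result can be built with an explicit loop. This is a conceptual sketch only, assuming the Employee table above already exists. The @Result, @Level and @Rows names are made up for illustration, and it is not a description of how the engine executes the CTE internally:

```sql
DECLARE @Result TABLE (EmployeeID INT, ManagerID INT, HierarchyLevel INT);
DECLARE @Level INT = 1;
DECLARE @Rows INT;

-- Anchor: employees without a manager form level 1.
INSERT INTO @Result (EmployeeID, ManagerID, HierarchyLevel)
SELECT E.EmployeeID, E.ManagerID, 1
FROM Employee AS E
WHERE E.ManagerID IS NULL;

SET @Rows = @@ROWCOUNT;

-- Keep expanding while the previous step produced rows.
WHILE @Rows > 0
BEGIN
    INSERT INTO @Result (EmployeeID, ManagerID, HierarchyLevel)
    SELECT E.EmployeeID, E.ManagerID, @Level + 1
    FROM @Result AS R
        INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
    WHERE R.HierarchyLevel = @Level; -- join only against the last step's rows

    SET @Rows = @@ROWCOUNT; -- 0 rows means the iterations are over
    SET @Level = @Level + 1;
END;

SELECT R.HierarchyLevel, R.EmployeeID, R.ManagerID
FROM @Result AS R
ORDER BY R.HierarchyLevel, R.EmployeeID;
```

The WHERE R.HierarchyLevel = @Level filter plays the role of the working set: each pass joins Employee against the rows added by the previous pass, never against the accumulated total.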



Answer 2:


If you have more than one top manager, [ORGLEVEL] will still start at 1 for each of them.

Without sample data, I can't provide more details.



Source: https://stackoverflow.com/questions/51176971/how-do-recursive-ctes-work-in-sql-server
