Question
Can anyone help me understand how this recursive CTE works?
WITH
RECURSIVECTE (EMPID, FULLNAME, MANAGERID, [ORGLEVEL]) AS
(SELECT EMPID,
FULLNAME,
MANAGERID,
1
FROM RECURSIVETBL
WHERE MANAGERID IS NULL
UNION ALL
SELECT A.EMPID,
A.FULLNAME,
A.MANAGERID,
B.[ORGLEVEL] + 1
FROM RECURSIVETBL A
JOIN RECURSIVECTE B ON A.MANAGERID = B.EMPID)
SELECT *
FROM RECURSIVECTE;
Answer 1:
Recursive CTEs in SQL Server have 2 parts:
The anchor: the starting point of your recursion. It's the set that the recursive joins will expand further.
SELECT
EMPID,
FULLNAME,
MANAGERID,
1 AS ORGLEVEL
FROM
RECURSIVETBL
WHERE
MANAGERID IS NULL
This fetches all employees that don't have a manager (the top bosses, or the roots of the tree relationship).
The recursion: Linked with a UNION ALL, this set has to reference the declaring CTE (thus making it recursive). Think of it as how you will expand the result of the anchor with the next level.
UNION ALL
SELECT
A.EMPID,
A.FULLNAME,
A.MANAGERID,
B.[ORGLEVEL] + 1
FROM
RECURSIVETBL A
JOIN RECURSIVECTE B -- Notice that we are referencing "RECURSIVECTE" which is the CTE we are declaring
ON A.MANAGERID = B.EMPID
In this example, the first iteration takes the anchor result set (all employees with no manager) and joins it with RECURSIVETBL through MANAGERID, so A.EMPID holds the employees who report to the previously selected managers. This joining goes on and on as long as the last result set can generate new rows.
There are a few limitations on what you can put in the recursive part (no grouping or another nested recursion, for example). Also, as it's preceded by a UNION ALL, its rules also apply (the number of columns and their data types must match).
About ORGLEVEL: it starts at 1 in the anchor set (it's hard-coded there). When the set is expanded by the recursive part, it fetches the previous set (the anchor, on the first iteration) and adds 1, as its expression is B.[ORGLEVEL] + 1, with B being the previous set. This means that it starts with 1 (the top bosses) and keeps adding 1 for each level of descendants, thus representing all the levels of the organization.
An employee at ORGLEVEL = 3 has 2 managers above them.
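To sanity-check that reading of ORGLEVEL, here is a minimal Python sketch (not T-SQL) that computes the level by walking up the manager chain instead of recursing down from the top. The three-row table and the names `manager_of` and `org_level` are made up for illustration; they are not part of the question.

```python
# Hypothetical sample rows: EMPID -> MANAGERID (None plays the role of NULL).
manager_of = {
    1: None,  # top boss
    2: 1,     # reports to 1
    3: 2,     # reports to 2
}

def org_level(emp_id):
    """Return 1 for a top boss, plus 1 for every manager above the employee."""
    level = 1
    boss = manager_of[emp_id]
    while boss is not None:
        level += 1
        boss = manager_of[boss]
    return level

levels = {emp_id: org_level(emp_id) for emp_id in manager_of}
print(levels)  # {1: 1, 2: 2, 3: 3}
```

Employee 3 ends up at level 3 with two managers above (2 and 1), which is the same number the CTE would carry in ORGLEVEL for that row.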
Step by step with working example
Let's follow this example:
EmployeeID ManagerID
1 NULL
2 1
3 1
4 2
5 2
6 1
7 6
8 6
9 NULL
10 3
11 3
12 10
13 9
14 9
15 13
Anchor: Employees without managers (ManagerID IS NULL). This starts with all the top bosses of your company. It's crucial to note that if the anchor set is empty, then the whole recursive CTE will be empty, as there is no starting point and no set for the recursive part to join to.
SELECT
EmployeeID = E.EmployeeID,
ManagerID = NULL, -- Always null by WHERE filter
HierarchyLevel = 1,
HierarchyRoute = CONVERT(VARCHAR(MAX), E.EmployeeID)
FROM
Employee AS E
WHERE
E.ManagerID IS NULL
Which are these:
EmployeeID ManagerID HierarchyLevel HierarchyRoute
1 (null) 1 1
9 (null) 1 9
Recursion N°1: Using this UNION ALL recursion:
UNION ALL
SELECT
EmployeeID = E.EmployeeID,
ManagerID = E.ManagerID,
HierarchyLevel = R.HierarchyLevel + 1,
HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
FROM
RecursiveCTE AS R
INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
For this INNER JOIN, RecursiveCTE has 2 rows (the anchor set), with employee IDs 1 and 9. So this JOIN will actually return this result:
HierarchyLevel EmployeeID ManagerID HierarchyRoute
2 2 1 1 -> 2
2 3 1 1 -> 3
2 6 1 1 -> 6
2 13 9 9 -> 13
2 14 9 9 -> 14
See how the HierarchyRoute starts with 1 and 9 and moves to each descendant? We also increased HierarchyLevel by 1.
Because the results are linked by a UNION ALL, at this point we have the following results (step 1 + 2):
HierarchyLevel EmployeeID ManagerID HierarchyRoute
1 1 (null) 1
1 9 (null) 9
2 2 1 1 -> 2
2 3 1 1 -> 3
2 6 1 1 -> 6
2 13 9 9 -> 13
2 14 9 9 -> 14
Here is the tricky part: for each of the following iterations, recursive references to RecursiveCTE will only contain the result set of the last iteration, not the accumulated set. This means that for the next iteration, RecursiveCTE will represent these rows:
HierarchyLevel EmployeeID ManagerID HierarchyRoute
2 2 1 1 -> 2
2 3 1 1 -> 3
2 6 1 1 -> 6
2 13 9 9 -> 13
2 14 9 9 -> 14
Recursion N°2: Following the same recursive expression...
UNION ALL
SELECT
EmployeeID = E.EmployeeID,
ManagerID = E.ManagerID,
HierarchyLevel = R.HierarchyLevel + 1,
HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
FROM
RecursiveCTE AS R
INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
And considering that in this step RecursiveCTE only holds rows with HierarchyLevel = 2, the result of this JOIN is the following (level 3!):
HierarchyLevel EmployeeID ManagerID HierarchyRoute
3 4 2 1 -> 2 -> 4
3 5 2 1 -> 2 -> 5
3 7 6 1 -> 6 -> 7
3 8 6 1 -> 6 -> 8
3 10 3 1 -> 3 -> 10
3 11 3 1 -> 3 -> 11
3 15 13 9 -> 13 -> 15
This set (and only this!) will be used in the following recursive step as RecursiveCTE, and it will be added to the accumulated grand total, which is now:
HierarchyLevel EmployeeID ManagerID HierarchyRoute
1 1 (null) 1
1 9 (null) 9
2 2 1 1 -> 2
2 3 1 1 -> 3
2 6 1 1 -> 6
2 13 9 9 -> 13
2 14 9 9 -> 14
3 4 2 1 -> 2 -> 4
3 5 2 1 -> 2 -> 5
3 7 6 1 -> 6 -> 7
3 8 6 1 -> 6 -> 8
3 10 3 1 -> 3 -> 10
3 11 3 1 -> 3 -> 11
3 15 13 9 -> 13 -> 15
Recursion N°3: Starting with level 3s in our working set, the result of the join is:
HierarchyLevel EmployeeID ManagerID HierarchyRoute
4 12 10 1 -> 3 -> 10 -> 12
This becomes our working set for the next recursive step.
Recursion N°4: Starting with the only level 4 row from the previous step, the join yields no rows (no employee has EmployeeID 12 as ManagerID). Returning no rows marks the end of the iterations.
The final result set stands tall:
HierarchyLevel EmployeeID ManagerID HierarchyRoute
1 1 (null) 1
1 9 (null) 9
2 2 1 1 -> 2
2 3 1 1 -> 3
2 6 1 1 -> 6
2 13 9 9 -> 13
2 14 9 9 -> 14
3 4 2 1 -> 2 -> 4
3 5 2 1 -> 2 -> 5
3 7 6 1 -> 6 -> 7
3 8 6 1 -> 6 -> 8
3 10 3 1 -> 3 -> 10
3 11 3 1 -> 3 -> 11
3 15 13 9 -> 13 -> 15
4 12 10 1 -> 3 -> 10 -> 12
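The iterations above can be mimicked outside the database. The following is a minimal Python sketch, not what SQL Server literally executes: `working_set` stands in for what RecursiveCTE refers to inside the recursive part, and `accumulated` for what the final SELECT sees. Both names, and `iterations`, are illustrative assumptions.

```python
# EmployeeID -> ManagerID, copied from the example table (None stands for NULL).
employees = {
    1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 1, 7: 6, 8: 6,
    9: None, 10: 3, 11: 3, 12: 10, 13: 9, 14: 9, 15: 13,
}

# Anchor: employees without a manager.
working_set = sorted(e for e, m in employees.items() if m is None)
accumulated = []   # every row produced so far, in order of production
iterations = []    # the working set of each step, kept for inspection

# The loop stops as soon as a step produces no rows.
while working_set:
    iterations.append(working_set)
    accumulated.extend(working_set)
    # Recursive part: join the table against the previous step's rows only.
    working_set = sorted(e for e, m in employees.items() if m in working_set)

for level, rows in enumerate(iterations, start=1):
    print(level, rows)
```

Each pass joins only against the rows produced by the pass before it, so the printed lists correspond to HierarchyLevel 1 through 4, and the fifth pass finds nothing and ends the loop.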
Here is the complete code:
CREATE TABLE Employee (EmployeeID INT, ManagerID INT);
INSERT INTO Employee (EmployeeID, ManagerID)
VALUES
(1, NULL),
(2, 1),
(3, 1),
(4, 2),
(5, 2),
(6, 1),
(7, 6),
(8, 6),
(9, NULL),
(10, 3),
(11, 3),
(12, 10),
(13, 9),
(14, 9),
(15, 13); -- the statement before WITH must end with a semicolon when run in the same batch
WITH RecursiveCTE AS
(
SELECT
EmployeeID = E.EmployeeID,
ManagerID = NULL, -- Always null by WHERE filter
HierarchyLevel = 1,
HierarchyRoute = CONVERT(VARCHAR(MAX), E.EmployeeID)
FROM
Employee AS E
WHERE
E.ManagerID IS NULL
UNION ALL
SELECT
EmployeeID = E.EmployeeID,
ManagerID = E.ManagerID,
HierarchyLevel = R.HierarchyLevel + 1,
HierarchyRoute = R.HierarchyRoute + ' -> ' + CONVERT(VARCHAR(10), E.EmployeeID)
FROM
RecursiveCTE AS R
INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
)
SELECT
R.HierarchyLevel,
R.EmployeeID,
R.ManagerID,
R.HierarchyRoute
FROM
RecursiveCTE AS R
ORDER BY
R.HierarchyLevel,
R.EmployeeID
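If no SQL Server instance is at hand, the same query can be tried with Python's built-in sqlite3 module. This is a port under stated assumptions, not the original T-SQL: SQLite wants WITH RECURSIVE, uses || instead of + for string concatenation, and CAST(... AS TEXT) instead of CONVERT(VARCHAR(...), ...), and it does not support the `Column = expression` alias style, so the column names are declared on the CTE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeID INT, ManagerID INT)")
conn.executemany(
    "INSERT INTO Employee (EmployeeID, ManagerID) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 2), (6, 1), (7, 6), (8, 6),
     (9, None), (10, 3), (11, 3), (12, 10), (13, 9), (14, 9), (15, 13)],
)

# SQLite port of the recursive CTE (see the dialect differences noted above).
query = """
WITH RECURSIVE RecursiveCTE (EmployeeID, ManagerID, HierarchyLevel, HierarchyRoute) AS
(
    SELECT E.EmployeeID, NULL, 1, CAST(E.EmployeeID AS TEXT)
    FROM Employee AS E
    WHERE E.ManagerID IS NULL
    UNION ALL
    SELECT E.EmployeeID, E.ManagerID, R.HierarchyLevel + 1,
           R.HierarchyRoute || ' -> ' || CAST(E.EmployeeID AS TEXT)
    FROM RecursiveCTE AS R
    INNER JOIN Employee AS E ON R.EmployeeID = E.ManagerID
)
SELECT R.HierarchyLevel, R.EmployeeID, R.ManagerID, R.HierarchyRoute
FROM RecursiveCTE AS R
ORDER BY R.HierarchyLevel, R.EmployeeID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The rows come back as (HierarchyLevel, EmployeeID, ManagerID, HierarchyRoute) tuples, with NULL shown as None.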
Answer 2:
If you have more than one top manager, [ORGLEVEL] will always start at 1 for each of them.
Without sample data, I cannot provide more details.
Source: https://stackoverflow.com/questions/51176971/how-do-recursive-ctes-work-in-sql-server