I have a Process table in SQL Server like this:
The workflowXML column has values like this:
sample1 (ProcessID=1)
I was curious to solve this with a set-based recursive approach as I do not like procedural coding in T-SQL. Hope you like it :-)
DECLARE @process TABLE(ID INT IDENTITY, workflowXML XML);
INSERT INTO @process(workflowXML) VALUES
('<process>
<Event type="start" id="StartEvent_1" name="Start">
<outgoing>SequenceFlow_0z7u86p</outgoing>
<outgoing>SequenceFlow_1onkt3z</outgoing>
</Event>
<task type="" id="Task_0a7vu1x" name="D">
<incoming>SequenceFlow_108ajnm</incoming>
<incoming>SequenceFlow_1onkt3z</incoming>
<outgoing>SequenceFlow_01clcmz</outgoing>
</task>
<task type="goal" id="Task_00ijt4n" name="B">
<incoming>SequenceFlow_17q1ecq</incoming>
<incoming>SequenceFlow_0q9j3et</incoming>
<outgoing>SequenceFlow_1ygvv8b</outgoing>
<outgoing>SequenceFlow_02glv1g</outgoing>
</task>
<task type="" id="Task_1rnuz4y" name="A">
<incoming>SequenceFlow_1ygvv8b</incoming>
<incoming>SequenceFlow_0z7u86p</incoming>
<outgoing>SequenceFlow_108ajnm</outgoing>
<outgoing>SequenceFlow_17q1ecq</outgoing>
<outgoing>SequenceFlow_075iuj9</outgoing>
</task>
<task type="goal" id="Task_1d4ykor" name="E">
<incoming>SequenceFlow_01clcmz</incoming>
<incoming>SequenceFlow_075iuj9</incoming>
<incoming>SequenceFlow_1djp3tu</incoming>
<outgoing>SequenceFlow_0q9j3et</outgoing>
</task>
<task type="goal" id="Task_1sembw4" name="C">
<incoming>SequenceFlow_02glv1g</incoming>
<outgoing>SequenceFlow_1djp3tu</outgoing>
</task>
</process>')
,('<process id="Process_1" isExecutable="false">
<Event type="start" id="StartEvent_0bivq0x" name="Start">
<outgoing>SequenceFlow_0q5ik20</outgoing>
<outgoing>SequenceFlow_147xk2x</outgoing>
</Event>
<task type="" id="Task_141buye" name="A">
<incoming>SequenceFlow_0q5ik20</incoming>
<incoming>SequenceFlow_0wg37hn</incoming>
<outgoing>SequenceFlow_1pvpyhe</outgoing>
<outgoing>SequenceFlow_10is4pe</outgoing>
</task>
<task type="" id="Task_1n3p00i" name="C">
<incoming>SequenceFlow_147xk2x</incoming>
<incoming>SequenceFlow_10is4pe</incoming>
<outgoing>SequenceFlow_18ks1jr</outgoing>
<outgoing>SequenceFlow_08gxini</outgoing>
</task>
<task type="goal" id="Task_0olxqpp" name="B">
<incoming>SequenceFlow_1pvpyhe</incoming>
<outgoing>SequenceFlow_03eekq0</outgoing>
</task>
<task type="goal" id="Task_0zjgfkf" name="D">
<incoming>SequenceFlow_18ks1jr</incoming>
<incoming>SequenceFlow_03eekq0</incoming>
<outgoing>SequenceFlow_0wg37hn</outgoing>
</task>
<task type="" id="Task_1q71efy" name="E">
<incoming>SequenceFlow_08gxini</incoming>
</task>
</process>');
--the query
WITH DerivedTable AS
(
    --one row per child element of <process>: the start event and every task
    SELECT prTbl.ID AS tblID
          ,nd.value('local-name(.)','nvarchar(max)') AS NodeName
          ,nd.value('@type','nvarchar(max)') AS [Type]
          ,nd.value('@id','nvarchar(max)') AS Id
          ,nd.value('@name','nvarchar(max)') AS [Name]
          ,nd.query('.') AS Task
    FROM @process AS prTbl
    CROSS APPLY prTbl.workflowXML.nodes('process') AS A(pr)
    CROSS APPLY pr.nodes('*') AS B(nd)
)
,AllIncoming AS
(
    --one row per <incoming> sequence flow of each task
    SELECT tblId
          ,NodeName
          ,[Type]
          ,Id
          ,[Name]
          ,i.value('.','nvarchar(max)') AS [Target]
    FROM DerivedTable
    CROSS APPLY Task.nodes('task/incoming') AS A(i)
    WHERE NodeName='task'
)
,recCTE AS
(
    --anchor: the start event of each process
    SELECT tblID,NodeName,[Type],Id,[Name],Task,1 AS Step,' | ' +CAST(Id AS NVARCHAR(MAX)) AS NodePath
    FROM DerivedTable
    WHERE [Type]='start'

    UNION ALL

    --recursion: every task with an incoming flow that matches one of the current node's outgoing flows
    SELECT nxt.tblID,nxt.NodeName,nxt.[Type],nxt.Id,nxt.[Name],nxt.Task,r.Step+1,r.NodePath + ' | ' + nxt.Id
    FROM recCTE AS r
    INNER JOIN DerivedTable AS nxt ON nxt.Id IN(SELECT x.Id
                                                FROM AllIncoming AS x
                                                WHERE x.[Target] IN (SELECT o.value('.','nvarchar(max)')
                                                                     FROM r.Task.nodes('*/outgoing') AS A(o)
                                                                    )
                                               )
    WHERE r.[Type]<>'goal' --do not walk on beyond a goal
      AND r.NodePath NOT LIKE '%| ' + nxt.Id + '%' --skip nodes already on this path (avoids cycles)
      AND r.Step<=10 --add an appropriate depth limit to avoid recursion-depth error
)
SELECT t.tblID
      ,t.[Name]
      ,t.NodePath
      ,t.Step
      ,t.Id
FROM recCTE AS t
WHERE t.[Type]='goal'
  AND t.Step<=ISNULL((SELECT MIN(x.Step) FROM recCTE AS x WHERE x.tblID=t.tblID AND x.[Type]='goal' AND x.NodeName='task'),999)
ORDER BY t.tblID,t.Step
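The result:
+-------+------+----------------------------------------------------+------+--------------+
| tblID | Name | NodePath                                           | Step | Id           |
+-------+------+----------------------------------------------------+------+--------------+
| 1     | B    | | StartEvent_1 | Task_1rnuz4y | Task_00ijt4n       | 3    | Task_00ijt4n |
+-------+------+----------------------------------------------------+------+--------------+
| 1     | E    | | StartEvent_1 | Task_1rnuz4y | Task_1d4ykor       | 3    | Task_1d4ykor |
+-------+------+----------------------------------------------------+------+--------------+
| 1     | E    | | StartEvent_1 | Task_0a7vu1x | Task_1d4ykor       | 3    | Task_1d4ykor |
+-------+------+----------------------------------------------------+------+--------------+
| 2     | D    | | StartEvent_0bivq0x | Task_1n3p00i | Task_0zjgfkf | 3    | Task_0zjgfkf |
+-------+------+----------------------------------------------------+------+--------------+
| 2     | B    | | StartEvent_0bivq0x | Task_141buye | Task_0olxqpp | 3    | Task_0olxqpp |
+-------+------+----------------------------------------------------+------+--------------+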
tblID=1 returns three rows for only two goal nodes, because two different paths of the same length lead to the same goal node E.
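To see this effect in isolation, here is a minimal sketch that walks a reduced version of the first sample (only the edges Start -> A, Start -> D, A -> E and D -> E). The table variable @edge, the CTE name walk and the one-letter ids are made up for this illustration; the path handling mirrors the recursive CTE above.

```sql
--hypothetical edge list: S = start, A and D = plain tasks, E = goal
DECLARE @edge TABLE(FromId VARCHAR(10), ToId VARCHAR(10));
INSERT INTO @edge(FromId,ToId) VALUES
 ('S','A')
,('S','D')
,('A','E')
,('D','E');

WITH walk AS
(
    --anchor: the start node with a one-element path
    SELECT CAST('S' AS VARCHAR(10)) AS Id
          ,1 AS Step
          ,CAST(' | S' AS VARCHAR(MAX)) AS NodePath

    UNION ALL

    --recursion: follow every outgoing edge, extending the path
    SELECT e.ToId
          ,w.Step+1
          ,w.NodePath + ' | ' + e.ToId
    FROM walk AS w
    INNER JOIN @edge AS e ON e.FromId=w.Id
    WHERE w.NodePath NOT LIKE '%| ' + e.ToId + '%' --never revisit a node on the same path
)
SELECT Id,Step,NodePath
FROM walk
WHERE Id='E';
```

This should return two rows for E, both with Step=3: one path via A and one via D. Each distinct path is a separate row of the recursion, so a goal appears once per path that reaches it.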
My first attempt finds only the overall shortest path to a goal: any goal that is reached by a longer path is filtered out. This is easy to change. Let the final WHERE find the shortest path to each specific goal node by adding the Id to the subquery:
WHERE t.[Type]='goal'
AND t.Step<=ISNULL((SELECT MIN(x.Step)
FROM recCTE AS x
WHERE x.tblID=t.tblID
AND x.Id=t.Id
AND x.[Type]='goal' AND x.NodeName='task'),999)
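A side note on this filter: recCTE is referenced twice (in the outer query and in the correlated subquery), and as far as I know SQL Server does not materialize a CTE, so the recursion may be evaluated more than once. A possible alternative is a windowed MIN(), sketched below. The table variable @found stands in for the goal rows of recCTE; its rows, the short ids and the CTE name Ranked are hypothetical.

```sql
--hypothetical stand-in for the goal rows produced by recCTE
DECLARE @found TABLE(tblID INT, Id VARCHAR(20), Step INT, NodePath VARCHAR(200));
INSERT INTO @found(tblID,Id,Step,NodePath) VALUES
 (1,'B',3,' | Start | A | B')
,(1,'E',3,' | Start | A | E')
,(1,'E',3,' | Start | D | E')
,(1,'E',4,' | Start | A | D | E'); --a longer path to E, should be dropped

WITH Ranked AS
(
    --attach the shortest step count per process and goal node to every row
    SELECT f.tblID,f.Id,f.Step,f.NodePath
          ,MIN(f.Step) OVER(PARTITION BY f.tblID,f.Id) AS MinStep
    FROM @found AS f
)
SELECT r.tblID,r.Id,r.Step,r.NodePath
FROM Ranked AS r
WHERE r.Step=r.MinStep --keep only the shortest path(s) to each goal
ORDER BY r.tblID,r.Step;
```

With these rows the four-step path to E is removed and the three shortest paths remain. In the real query, Ranked would select from recCTE with the same [Type]='goal' AND NodeName='task' filter instead of from @found.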
For all three examples, this returns:
+-------+------+----------------------------------------------------------------------------+------+--------------+
| tblID | Name | NodePath                                                                   | Step | Id           |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 1     | B    | | StartEvent_1 | Task_1rnuz4y | Task_00ijt4n                               | 3    | Task_00ijt4n |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 1     | E    | | StartEvent_1 | Task_1rnuz4y | Task_1d4ykor                               | 3    | Task_1d4ykor |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 1     | E    | | StartEvent_1 | Task_0a7vu1x | Task_1d4ykor                               | 3    | Task_1d4ykor |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 2     | B    | | StartEvent_0bivq0x | Task_141buye | Task_0olxqpp                         | 3    | Task_0olxqpp |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 2     | D    | | StartEvent_0bivq0x | Task_1n3p00i | Task_0zjgfkf                         | 3    | Task_0zjgfkf |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 3     | E    | | StartEvent_1 | Task_1jixk79 | Task_0wtjftd                               | 3    | Task_0wtjftd |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 3     | C    | | StartEvent_1 | Task_1jixk79 | Task_0xwvhuo | Task_032o8jx                | 4    | Task_032o8jx |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 3     | H    | | StartEvent_1 | Task_1jixk79 | Task_0xwvhuo | Task_0fsibap                | 4    | Task_0fsibap |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 3     | G    | | StartEvent_1 | Task_1jixk79 | Task_0xwvhuo | Task_0c85e6p | Task_0qsvlob | 5    | Task_0qsvlob |
+-------+------+----------------------------------------------------------------------------+------+--------------+
| 3     | G    | | StartEvent_1 | Task_1jixk79 | Task_164ihwt | Task_0c85e6p | Task_0qsvlob | 5    | Task_0qsvlob |
+-------+------+----------------------------------------------------------------------------+------+--------------+
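One caveat on the cycle check in the recursive part (NodePath NOT LIKE '%| ' + Id + '%'): the underscore is a single-character wildcard in LIKE, and an id that is a prefix of another id would also match. With the generated ids shown here this probably never happens, but if your ids are less regular, a stricter check might look like the sketch below. The table variable @t, its rows and the column names VisitedLoose / VisitedStrict are invented for the demonstration, and the sketch assumes ids contain no % or [ characters.

```sql
--hypothetical paths and candidate ids to compare both checks
DECLARE @t TABLE(NodePath NVARCHAR(MAX), NextId NVARCHAR(50));
INSERT INTO @t(NodePath,NextId) VALUES
 (N' | Start | Task_10',N'Task_1') --Task_1 is only a prefix of Task_10
,(N' | Start | TaskX1' ,N'Task_1') --"_" in the pattern matches the "X"
,(N' | Start | Task_1' ,N'Task_1'); --the only real revisit

SELECT NodePath
      ,NextId
       --the check as used in the query above
      ,CASE WHEN NodePath LIKE '%| ' + NextId + '%' THEN 1 ELSE 0 END AS VisitedLoose
       --delimiter on both sides, underscore escaped as a literal
      ,CASE WHEN NodePath + ' |' LIKE '%| ' + REPLACE(NextId,'_','[_]') + ' |%' THEN 1 ELSE 0 END AS VisitedStrict
FROM @t;
```

The loose check flags all three rows as visited, while the strict one flags only the last row. In the recursive CTE the same idea would read r.NodePath + ' |' NOT LIKE '%| ' + REPLACE(nxt.Id,'_','[_]') + ' |%'.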