Algorithm for topological sorting if cycles exist

北城余情 提交于 2019-12-02 18:27:45

This is the NP-hard (and APX-hard) problem known as minimum feedback vertex set. An approximation algorithm due to Demetrescu and Finocchi (pdf, Combinatorial Algorithms for Feedback Problems in Directed Graphs (2003)") works well in practice when there are no long simple cycles, as I would expect for your application.

Here is how to do it in Python:

from collections import defaultdict

def topological_sort(dependency_pairs):
    'Sort values subject to dependency constraints'
    num_heads = defaultdict(int)   # num arrows pointing in
    tails = defaultdict(list)      # list of arrows going out
    for h, t in dependency_pairs:
        num_heads[t] += 1
        tails[h].append(t)

    ordered = [h for h in tails if h not in num_heads]
    for h in ordered:
        for t in tails[h]:
            num_heads[t] -= 1
            if not num_heads[t]:
                ordered.append(t)
    cyclic = [n for n, heads in num_heads.iteritems() if heads]
    return ordered, cyclic

def is_toposorted(ordered, dependency_pairs):
    '''Return True if all dependencies have been honored.
       Raise KeyError for missing tasks.
    '''
    rank = {t: i for i, t in enumerate(ordered)}
    return all(rank[h] < rank[t] for h, t in dependency_pairs)

print topological_sort('aa'.split())
ordered, cyclic = topological_sort('ah bg cf ch di ed fb fg hd he ib'.split())
print ordered, cyclic
print is_toposorted(ordered, 'ah bg cf ch di ed fb fg hd he ib'.split())
print topological_sort('ah bg cf ch di ed fb fg hd he ib ba xx'.split())

The runtime is linearly proportional to the number of edges (dependency pairs).

The algorithm is organized around a lookup table called num_heads that keeps a count the number of predecessors (incoming arrows). In the ah bg cf ch di ed fb fg hd he ib example, the counts are:

node  number of incoming edges
----  ------------------------
 a       0
 b       2
 c       0
 d       2
 e       1
 f       1  
 g       2
 h       2
 i       1

The algorithm works by "visting" nodes with no predecessors. For example, nodes a and c have no incoming edges, so they are visited first.

Visiting means that the nodes are output and removed from the graph. When a node is visited, we loop over its successors and decrement their incoming count by one.

For example, in visiting node a, we go to its successor h to decrement its incoming count by one (so that h 2 becomes h 1.

Likewise, when visiting node c, we loop over its successors f and h, decrementing their counts by one (so that f 1 becomes f 0 and h 1 becomes h 0).

The nodes f and h no longer have incoming edges, so we repeat the process of outputting them and removing them from the graph until all the nodes have been visited. In the example, the visitation order (the topological sort is):

a c f h e d i b g

If num_heads ever arrives at a state when there are no nodes without incoming edges, then it means there is a cycle that cannot be topologically sorted and the algorithm exits.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!