Python: all simple paths in a directed graph

Submitted by 徘徊边缘 on 2021-02-08 08:31:09

Question


I am working with a number of directed graphs with no cycles in them, and I need to find all simple paths between any two nodes. Normally I wouldn't worry about the execution time, but I have to do this for very many nodes over very many timesteps, since I am dealing with a time-based simulation.

In the past I tried the facilities offered by NetworkX, but I generally found them slower than my own approach. I'm not sure whether anything has changed lately.

I have implemented this recursive function:

import ast
import timeit

def all_simple_paths(adjlist, start, end, path):

    # Build a new list so sibling branches do not share the same path object.
    path = path + [start]

    if start == end:
        return [path]

    paths = []

    for child in adjlist[start]:

        # Skip nodes already on the path so that no node is visited twice.
        if child not in path:

            child_paths = all_simple_paths(adjlist, child, end, path)
            paths.extend(child_paths)

    return paths


# Assumes digraph.txt holds a plain Python literal.
with open('digraph.txt', 'rt') as fid:
    adjlist = ast.literal_eval(fid.read().strip())

number = 1000
stmnt = 'all_simple_paths(adjlist, 166, 180, [])'
setup = 'from __main__ import all_simple_paths, adjlist'
elapsed = timeit.timeit(stmnt, setup=setup, number=number) / number
print('Elapsed: %0.2f ms' % (1000 * elapsed))

On my computer, I get an average of 1.5 ms per iteration. I know this is a small number, but I have to do this operation very many times.

In case you're interested, I have uploaded a small file containing the adjacency list here:

adjlist

I am using adjacency lists as inputs, coming from a NetworkX DiGraph representation.

Any suggestions for improving the algorithm (e.g., does it have to be recursive?) or other approaches I could try are more than welcome.

Thank you.

Andrea.


Answer 1:


You can save time without changing the algorithm's logic by caching the results of shared sub-problems.

For example, in the graph shown in the original answer (image not preserved here), calling all_simple_paths(adjlist, 'A', 'D', []) will compute all_simple_paths(adjlist, 'D', 'E', []) multiple times.
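To make the idea of a shared sub-problem concrete, here is a minimal sketch on a small hypothetical graph (not the one from the answer) that counts how often the uncached function expands each node. The graph and the `calls` counter are assumptions added purely for illustration.

```python
from collections import Counter

# Hypothetical DAG as a dict of adjacency lists (not the graph from the answer).
adjlist = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D'],
    'D': ['E'],
    'E': [],
}

calls = Counter()  # number of times each node is expanded


def all_simple_paths(adjlist, start, end, path):
    calls[start] += 1
    path = path + [start]
    if start == end:
        return [path]
    paths = []
    for child in adjlist[start]:
        if child not in path:
            paths.extend(all_simple_paths(adjlist, child, end, path))
    return paths


paths = all_simple_paths(adjlist, 'A', 'E', [])
print(paths)
print(dict(calls))
```

Node 'D' is reached once through 'B' and once through 'C', so everything below 'D' is explored twice. In larger graphs this repeated work is what caching aims to remove.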

Python's standard library provides the functools.lru_cache decorator for this task. It hashes the call arguments to memoize the results, so you will need to change adjlist and path to tuples, since lists are not hashable.

import ast
import functools
import timeit

# Every argument must be hashable so that lru_cache can use it as a cache key.
@functools.lru_cache()
def all_simple_paths(adjlist, start, end, path):

    path = path + (start,)

    if start == end:
        return [path]

    paths = []

    for child in adjlist[start]:

        if child not in path:

            # adjlist is already a tuple of tuples, so pass it through unchanged.
            child_paths = all_simple_paths(adjlist, child, end, path)
            paths.extend(child_paths)

    return paths


# Assumes digraph.txt holds a plain Python literal.
with open('digraph.txt', 'rt') as fid:
    adjlist = ast.literal_eval(fid.read().strip())

# you can also change your data format in txt
adjlist = tuple(tuple(children) for children in adjlist)

number = 1000
stmnt = 'all_simple_paths(adjlist, 166, 180, ())'
setup = 'from __main__ import all_simple_paths, adjlist'
elapsed = timeit.timeit(stmnt, setup=setup, number=number) / number
print('Elapsed: %0.2f ms' % (1000 * elapsed))

Running time on my machine:
- original: 0.86ms
- with cache: 0.01ms

Note that this method only helps when there are a lot of shared sub-problems.
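One caveat is that lru_cache builds its key from every argument, including path, so two calls that reach the same node through different prefixes are cached separately. In addition, timeit repeats an identical call, so every iteration after the first is answered directly from the cache. Since the question states that the graphs contain no cycles, the `child not in path` check is not needed, and the paths from a node to the end node depend on that node alone. A minimal sketch under that assumption follows; the names `make_path_finder` and `paths_from` and the small graph are hypothetical.

```python
import functools


def make_path_finder(adjlist, end):
    """Return a function that lists all paths from a node to `end`.

    Assumes the graph is acyclic; with a cycle the recursion would not terminate.
    """
    @functools.lru_cache(maxsize=None)
    def paths_from(node):
        if node == end:
            return ((node,),)
        found = []
        for child in adjlist[node]:
            # The paths from `child` are computed once and reused for every parent.
            for tail in paths_from(child):
                found.append((node,) + tail)
        return tuple(found)

    return paths_from


# Hypothetical DAG for illustration only.
adjlist = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}
paths_from = make_path_finder(adjlist, 'E')
paths = paths_from('A')
print(paths)
print(paths_from.cache_info())
```

The cache here is tied to one graph and one end node, so a new finder would have to be created whenever the graph changes between timesteps. The number of paths can still grow exponentially with graph size, so this removes only the repeated traversal work, not the cost of building the output.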



Source: https://stackoverflow.com/questions/46500318/python-all-simple-paths-in-a-directed-graph
