Question
I am working with a number of directed acyclic graphs, and I need to find all simple paths between any two nodes. Normally I wouldn't worry about execution time, but I have to do this for very many nodes over very many timesteps, since I am dealing with a time-based simulation.
I tried the facilities offered by NetworkX in the past, but in general I found them slower than my approach. I am not sure whether anything has changed lately.
I have implemented this recursive function:
import timeit

def all_simple_paths(adjlist, start, end, path):
    # Extend a copy of the path so sibling branches do not share state
    path = path + [start]
    if start == end:
        return [path]
    paths = []
    for child in adjlist[start]:
        # Skip nodes already on the current path (keeps paths simple)
        if child not in path:
            child_paths = all_simple_paths(adjlist, child, end, path)
            paths.extend(child_paths)
    return paths

# digraph.txt holds the adjacency list as a Python literal
with open('digraph.txt', 'rt') as fid:
    adjlist = eval(fid.read().strip())

number = 1000
stmnt = 'all_simple_paths(adjlist, 166, 180, [])'
setup = 'from __main__ import all_simple_paths, adjlist'
# Average time per call, in seconds
elapsed = timeit.timeit(stmnt, setup=setup, number=number)/number
print('Elapsed: %0.2f ms' % (1000*elapsed))
On my computer, I get an average of 1.5 ms per iteration. I know this is a small number, but I have to do this operation very many times.
In case you're interested, I have uploaded a small file containing the adjacency list here: adjlist
I am using adjacency lists as inputs, coming from a NetworkX DiGraph representation.
Any suggestions for improving the algorithm (e.g., does it have to be recursive?) or other approaches I could try are more than welcome.
Thank you.
Andrea.
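On the "does it have to be recursive?" part: it does not. A minimal sketch of the same search with an explicit stack is shown below. The name all_simple_paths_iter and the small example graph are hypothetical, and nothing here has been timed against digraph.txt, so whether it is actually faster is an open question.

```python
def all_simple_paths_iter(adjlist, start, end):
    """Depth-first enumeration of simple paths using an explicit stack."""
    paths = []
    # Each stack entry is (current node, path taken to reach it)
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            paths.append(path)
            continue
        for child in adjlist[node]:
            # Same simple-path check as in the recursive version
            if child not in path:
                stack.append((child, path + [child]))
    return paths

# Hypothetical example graph, used only for illustration
example = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}
found = all_simple_paths_iter(example, 'A', 'E')
```

The paths come out in a different order than in the recursive version, because the stack pops the most recently pushed child first; the set of paths is the same.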
Answer 1:
You can save time without changing the algorithm's logic by caching the results of shared sub-problems.
For example, calling all_simple_paths(adjlist, 'A', 'D', []) in the following graph will compute all_simple_paths(adjlist, 'D', 'E', []) multiple times:
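To make the repetition concrete, here is a small sketch that counts how often each node is visited by the recursive function from the question. The graph is a hypothetical stand-in in which two branches converge on 'D', not necessarily the one from the original figure, and the calls counter is added only for illustration.

```python
from collections import Counter

# Hypothetical graph: both 'B' and 'C' lead into 'D'
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}
calls = Counter()

def all_simple_paths(adjlist, start, end, path):
    calls[start] += 1  # record every visit to this node
    path = path + [start]
    if start == end:
        return [path]
    paths = []
    for child in adjlist[start]:
        if child not in path:
            paths.extend(all_simple_paths(adjlist, child, end, path))
    return paths

result = all_simple_paths(graph, 'A', 'E', [])
```

After this runs, calls['D'] is 2: the sub-problem "paths from 'D' to 'E'" is solved once per incoming branch. Note that the two visits carry different path arguments (['A', 'B'] and ['A', 'C']).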
Python has a built-in decorator, lru_cache, for this task. It hashes the arguments to memoize results, so you will need to change adjlist and path to tuples, since lists are not hashable.
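A short sketch of why the conversion is needed: lru_cache builds its lookup key from the arguments, so passing a list raises TypeError, while a tuple of tuples works. The helper count_children and the tiny adjacency data are hypothetical and exist only for this demonstration.

```python
import functools

@functools.lru_cache(maxsize=None)
def count_children(adjlist, node):
    """Toy cached function; only the argument hashing matters here."""
    return len(adjlist[node])

as_lists = [[1, 2], [2], []]
# Convert the inner lists too: a tuple that contains lists is still unhashable
as_tuples = tuple(tuple(children) for children in as_lists)

try:
    count_children(as_lists, 0)
    list_error = None
except TypeError as exc:
    list_error = str(exc)  # the list argument cannot be hashed

tuple_result = count_children(as_tuples, 0)
```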
import timeit
import functools

@functools.lru_cache()
def all_simple_paths(adjlist, start, end, path):
    # path is a tuple here so that the arguments are hashable
    path = path + (start,)
    if start == end:
        return [path]
    paths = []
    for child in adjlist[start]:
        if child not in path:
            # tuple() leaves an existing tuple unchanged
            child_paths = all_simple_paths(tuple(adjlist), child, end, path)
            paths.extend(child_paths)
    return paths

with open('digraph.txt', 'rt') as fid:
    adjlist = eval(fid.read().strip())

# you can also change your data format in txt
adjlist = tuple(tuple(pair) for pair in adjlist)

number = 1000
stmnt = 'all_simple_paths(adjlist, 166, 180, ())'
setup = 'from __main__ import all_simple_paths, adjlist'
elapsed = timeit.timeit(stmnt, setup=setup, number=number)/number
print('Elapsed: %0.2f ms' % (1000*elapsed))
Running time on my machine:
- original: 0.86ms
- with cache: 0.01ms
Note that this method only helps when there are a lot of shared sub-problems.
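One thing worth checking, as far as I can tell: path is part of the cache key, so two visits to the same node through different prefixes are cached as separate entries. Within a single top-level call every prefix is distinct, so much of the measured gain may come from timeit repeating the identical call 1000 times. If the graph really is acyclic, as the question states, the "child not in path" check is not needed and the cache can be keyed on (start, end) alone. Below is a sketch under that assumption; make_path_finder and the example graph are hypothetical names, and this has not been timed against digraph.txt.

```python
import functools

def make_path_finder(adjlist):
    """Return paths(start, end), memoized on (start, end) only.

    Assumes adjlist describes a DAG; on a cyclic graph this would
    recurse without end, because visited nodes are not tracked.
    """
    @functools.lru_cache(maxsize=None)
    def paths(start, end):
        if start == end:
            return ((start,),)
        found = []
        for child in adjlist[start]:
            # Reuse the cached sub-result for (child, end)
            for tail in paths(child, end):
                found.append((start,) + tail)
        return tuple(found)
    return paths

# Hypothetical example graph: two branches converge on 'D'
example = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}
find = make_path_finder(example)
result = find('A', 'E')
```

Because adjlist is captured by the closure rather than passed as an argument, it does not need to be hashable. The cache is tied to one graph, so if the graph changes between timesteps a new finder has to be built (or find.cache_clear() called).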
Source: https://stackoverflow.com/questions/46500318/python-all-simple-paths-in-a-directed-graph