Question
My work makes extensive use of the algorithm by Migliore, Martorana and Sciortino for finding all possible simple paths in a graph, i.e. paths in which no node is encountered more than once, as described in: An Algorithm to find All Paths between Two Nodes in a Graph. (Although this algorithm is essentially a depth-first search and intuitively recursive in nature, the authors also present a non-recursive, stack-based implementation.) I'd like to know whether such an algorithm can be implemented on the GPU. At the moment I'm struggling to see any real parallelism in this problem. For example, the cost of monitoring and dispatching threads might make a cooperative graph search (by hardware threads) prohibitive. Alternatively, a divide-and-conquer strategy could work if the graph is partitioned and each partition is assigned to an individual hardware thread for searching. However, one would have to figure out how to (1) partition the graph, (2) formulate the subtasks, and (3) combine the results of the searches on the partitions.
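For reference, the kind of non-recursive, stack-based search mentioned above can be sketched roughly as follows. This is a minimal Java sketch of a generic iterative depth-first enumeration of simple paths, not the authors' published pseudocode; the class name `SimplePathsSketch`, the method `allSimplePaths`, and the adjacency-array representation `adj` are assumptions made purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

class SimplePathsSketch {

    /**
     * Enumerates every simple path from start to end with an explicit stack
     * instead of recursion. adj[v] lists the neighbours of node v (assumed layout).
     */
    static List<List<Integer>> allSimplePaths(int[][] adj, int start, int end) {
        List<List<Integer>> results = new ArrayList<>();
        boolean[] onPath = new boolean[adj.length];     // nodes on the current path
        Deque<Integer> pathStack = new ArrayDeque<>();  // current path, top = last node
        Deque<Integer> nextIdx = new ArrayDeque<>();    // next neighbour index to try, per path node

        pathStack.push(start);
        nextIdx.push(0);
        onPath[start] = true;

        while (!pathStack.isEmpty()) {
            int node = pathStack.peek();
            int i = nextIdx.pop();
            if (node == end || i >= adj[node].length) {
                if (node == end) {
                    // The deque iterates from top to bottom, so reverse to get start..end order.
                    List<Integer> found = new ArrayList<>(pathStack);
                    Collections.reverse(found);
                    results.add(found);
                }
                // Backtrack: drop this node from the current path.
                onPath[node] = false;
                pathStack.pop();
                continue;
            }
            nextIdx.push(i + 1);           // remember where to resume at this node
            int next = adj[node][i];
            if (!onPath[next]) {           // keep the path simple
                onPath[next] = true;
                pathStack.push(next);
                nextIdx.push(0);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Small undirected example graph: edges 0-1, 0-2, 1-2, 1-3, 2-3.
        int[][] adj = {{1, 2}, {0, 2, 3}, {0, 1, 3}, {1, 2}};
        for (List<Integer> p : allSimplePaths(adj, 0, 3)) {
            System.out.println(p);
        }
    }
}
```

On the small example graph in `main`, the search finds four simple paths from node 0 to node 3. The only state is the two stacks and the `onPath` flags, which is what makes the per-thread memory footprint the main concern when thinking about a GPU port.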
Answer 1:
Bit rusty on this. How about a Dijkstra-style, queue-driven traversal, with the work queue split between two threads?
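// Shared state (sketch only: allocation, thread setup and synchronization are omitted).
boolean[] visited;                 // visited[node] == true once node has been expanded
int[][] connected;                 // connected[node] lists the neighbours of node
List<List<List<Integer>>> path;    // path.get(node) = all paths found so far that lead to node (memory-hungry)
int startNode;
int endNode;
Queue<Integer> queue0;             // work queue for thread 0 (even-numbered nodes)
Queue<Integer> queue1;             // work queue for thread 1 (odd-numbered nodes)

// Before starting: path.get(startNode) holds one empty path, and startNode sits in its queue.
// Loop run by thread 0; thread 1 runs the same loop on queue1.
// Caveat: visited[] is global, so paths that reach a node after it has been expanded
// are not propagated further, and this does not enumerate every simple path as written.
while (!queue0.isEmpty()) {
    int node = queue0.remove();
    if (visited[node]) {
        continue;
    }
    visited[node] = true;
    for (int nextNode : connected[node]) {
        // Extend every known path to node by one hop, giving a path to nextNode.
        for (List<Integer> p : path.get(node)) {
            List<Integer> extended = new ArrayList<>(p);
            extended.add(node);
            path.get(nextNode).add(extended);
        }
        // Hand nextNode to the thread that owns its parity.
        if (nextNode % 2 == 0) { queue0.add(nextNode); }
        else { queue1.add(nextNode); }
    }
}
path.get(endNode).get(i)           // i-th path from startNode to endNode (endNode itself not included)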
Partitioning: by node % 2 (even nodes go to thread 0, odd nodes to thread 1).
Subtasks: from a given node, find the places to go next.
Combining: you have shared memory, right?
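One way to make the three points above concrete is sketched below. Note that it swaps the node % 2 scheme for a different partitioning: one independent subtask per first hop out of the start node, each with its own private state, so no shared `visited` array is needed and combining is just concatenation. It runs on CPU threads through Java's parallel streams and makes no claim about GPU performance. The names `ParallelPathsSketch`, `allSimplePathsParallel` and `dfs`, and the adjacency-array input, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelPathsSketch {

    // Depth-first search that extends `path` (which currently ends at `node`) towards `end`.
    static void dfs(int[][] adj, int node, int end, boolean[] onPath,
                    List<Integer> path, List<List<Integer>> out) {
        if (node == end) {
            out.add(new ArrayList<>(path));
            return;
        }
        for (int next : adj[node]) {
            if (onPath[next]) {
                continue;                      // keep the path simple
            }
            onPath[next] = true;
            path.add(next);
            dfs(adj, next, end, onPath, path, out);
            path.remove(path.size() - 1);      // backtrack
            onPath[next] = false;
        }
    }

    /** Assumes a simple graph (no duplicate edges); adj[v] lists the neighbours of v. */
    static List<List<Integer>> allSimplePathsParallel(int[][] adj, int start, int end) {
        if (start == end) {
            List<List<Integer>> trivial = new ArrayList<>();
            List<Integer> single = new ArrayList<>();
            single.add(start);
            trivial.add(single);
            return trivial;
        }
        // (1) Partition: one subtask per neighbour of the start node.
        return IntStream.range(0, adj[start].length)
                .parallel()
                .mapToObj(i -> {
                    // (2) Subtask: enumerate the paths that begin start -> first.
                    List<List<Integer>> local = new ArrayList<>();
                    int first = adj[start][i];
                    if (first == start) {
                        return local;          // ignore self-loops
                    }
                    boolean[] onPath = new boolean[adj.length];   // private to this subtask
                    onPath[start] = true;
                    onPath[first] = true;
                    List<Integer> path = new ArrayList<>();
                    path.add(start);
                    path.add(first);
                    dfs(adj, first, end, onPath, path, local);
                    return local;
                })
                // (3) Combine: concatenate the per-subtask results.
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int[][] adj = {{1, 2}, {0, 2, 3}, {0, 1, 3}, {1, 2}};
        System.out.println(allSimplePathsParallel(adj, 0, 3));
    }
}
```

The obvious weakness of this split is load balance: the amount of parallelism is capped by the degree of the start node, and one first hop may lead to far more paths than another.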
Answer 2:
I don't think your problem can easily be ported to a GPU in a way that makes it run faster. GPU programs that make the most of the GPU's power:
- Consist of thousands of threads, but the number of threads is constant: no new threads are spawned and no existing ones are killed.
- Prefer coalesced memory access. If neighbouring threads access completely different regions of memory (as graph algorithms usually do), it will be slow.
- Don't like recursion and stacks. The newest NVIDIA Fermi cards do support function calls, and threads can have a stack, but because of the high thread count the stacks are very short (or else consume a lot of memory).
I'm not saying that no efficient GPU algorithm exists, but I believe there is no straightforward way to transform the existing algorithms into efficient GPU code.
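To illustrate what the constraints above mean for the search state, here is a minimal sketch, written in plain Java rather than as a GPU kernel, of a path search whose working memory is allocated once and has a fixed size: no recursion, no growing collections, and the graph flattened into two arrays (a compressed-sparse-row style layout, which is an assumption and not something from the question). It only counts paths, since storing them would need unbounded output space. The names `BoundedStackSketch`, `countSimplePaths`, `rowPtr` and `colIdx` are hypothetical.

```java
class BoundedStackSketch {

    /**
     * Counts the simple paths from start to end.
     * The neighbours of node v are colIdx[rowPtr[v]] .. colIdx[rowPtr[v + 1] - 1].
     */
    static long countSimplePaths(int[] rowPtr, int[] colIdx, int start, int end) {
        int n = rowPtr.length - 1;
        // A simple path visits at most n nodes, so a depth-n stack is always enough.
        int[] stackNode = new int[n];      // node at each depth of the current path
        int[] stackEdge = new int[n];      // next edge offset to try at each depth
        boolean[] onPath = new boolean[n]; // nodes on the current path

        int depth = 0;
        stackNode[0] = start;
        stackEdge[0] = rowPtr[start];
        onPath[start] = true;

        long count = 0;
        while (depth >= 0) {
            int node = stackNode[depth];
            if (node == end) {
                count++;                   // reached the target: count and backtrack
                onPath[node] = false;
                depth--;
                continue;
            }
            if (stackEdge[depth] == rowPtr[node + 1]) {
                onPath[node] = false;      // neighbours exhausted: backtrack
                depth--;
                continue;
            }
            int next = colIdx[stackEdge[depth]++];
            if (!onPath[next]) {           // keep the path simple
                depth++;
                stackNode[depth] = next;
                stackEdge[depth] = rowPtr[next];
                onPath[next] = true;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Undirected example graph with edges 0-1, 0-2, 1-2, 1-3, 2-3 in flattened form.
        int[] rowPtr = {0, 2, 5, 8, 10};
        int[] colIdx = {1, 2, 0, 2, 3, 0, 1, 3, 1, 2};
        System.out.println(countSimplePaths(rowPtr, colIdx, 0, 3));
    }
}
```

Even in this form, each search needs on the order of the node count in private stack space, and neighbouring searches would read unrelated parts of `colIdx`, which is exactly the stack-size and memory-access problem described in the list above.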
Source: https://stackoverflow.com/questions/4601364/gpu-based-search-for-all-possible-paths-between-two-nodes-on-a-graph