Fast algorithm for counting the number of acyclic paths on a directed graph

喜夏-厌秋 提交于 2019-12-04 02:24:04
dfb

This is #P-complete, it seems. (ref http://www.maths.uq.edu.au/~kroese/ps/robkro_rev.pdf). The link has an approximation

If you can relax the simple path requirement, you can efficiently count the number of paths using a modified version of Floyd-Warshall or graph exponentiation as well. See All pairs all paths on a graph

As mentioned by spinning_plate, this problem is #P-complete so start looking for your aproximations :). I really like the #P-completeness proof for this problem, so I'd think it would be nice to share it:

Let N be the number of paths (starting at s) in the graph and p_k be the number of paths of length k. We have:

N = p_1 + p_2 + ... + p_n

Now build a second graph by changing every edge to a pair of paralel edges.For each path of length k there will now be k^2 paths so:

N_2 = p_1*2 + p_2*4 + ... + p_n*(2^n)

Repeating this process, but with i edges instead of 2, up n, would give us a linear system (with a Vandermonde matrix) allowing us to find p_1, ..., p_n.

N_i = p_1*i + p_2*(i^2) + ...

Therefore, finding the number of paths in the graph is just as hard as finding the number of paths of a certain length. In particular, p_n is the number of Hamiltonian Paths (starting at s), a bona-fide #P-complete problem.


I havent done the math I'd also guess that a similar process should be able to prove that just calculating average length is also hard.


Note: most times this problem is discussed the paths start from a single edge and stop wherever. This is the opposite from your problem, but you they should be equivalent by just reversing all the edges.

Importance of Problem Statement

It is unclear what is being counted.

  1. Is the starting node set all nodes for which there is at least one outgoing edge, or is there a particular starting node criteria?
  2. Is the the ending node set the set of all nodes for which there are zero outgoing edges, or can any node for which there is at least one incoming edge be a possible ending node?

Define your problem so that there are no ambiguities.

Estimation

Estimations can be off by orders of magnitude when designed for randomly constructed directed graphs and the graph is very statistically skewed or systematic in its construction. This is typical of all estimation processes, but particularly pronounced in graphs because of their exponential pattern complexity potential.

Two Optimizing Points

The std::bitset model will be slower than bool values for most processor architectures because of the instruction set mechanics of testing a bit at a particular bit offset. The bitset is more useful when memory footprint, not speed is the critical factor.

Eliminating cases or reducing via deductions is important. For instance, if there are nodes for which there is only one outgoing edge, one can calculate the number of paths without it and add to the number of paths in the sub-graph the number of paths from the node from which it points.

Resorting to Clusters

The problem can be executed on a cluster by distributing according to starting node. Some problems simply require super-computing. If you have 1,000,000 starting nodes and 10 processors, you can place 100,000 starting node cases on each processor. The above case eliminations and reductions should be done prior to distributing cases.

A Typical Depth First Recursion and How to Optimize It

Here is a small program that provides a basic depth first, acyclic traversal from any node to any node, which can be altered, placed in a loop, or distributed. The list can be placed into a static native array by using a template with a size as one parameter if the maximum data set size is known, which reduces iteration and indexing times.

#include <iostream>
#include <list>

class DirectedGraph {

    private:
        int miNodes;
        std::list<int> * mnpEdges;
        bool * mpVisitedFlags;

    private:
        void initAlreadyVisited() {
            for (int i = 0; i < miNodes; ++ i)
                mpVisitedFlags[i] = false;
        }

        void recurse(int iCurrent, int iDestination,
               int path[], int index,
               std::list<std::list<int> *> * pnai) {

            mpVisitedFlags[iCurrent] = true;
            path[index ++] = iCurrent;

            if (iCurrent == iDestination) {
                auto pni = new std::list<int>;
                for (int i = 0; i < index; ++ i)
                    pni->push_back(path[i]);
                pnai->push_back(pni);

            } else {
                auto it = mnpEdges[iCurrent].begin();
                auto itBeyond = mnpEdges[iCurrent].end();
                while (it != itBeyond) {
                    if (! mpVisitedFlags[* it])
                        recurse(* it, iDestination,
                                path, index, pnai);
                    ++ it;
                }
            }

            -- index;
            mpVisitedFlags[iCurrent] = false;
        } 

    public:
        DirectedGraph(int iNodes) {
            miNodes = iNodes;
            mnpEdges = new std::list<int>[iNodes];
            mpVisitedFlags = new bool[iNodes];
        }

        ~DirectedGraph() {
            delete mpVisitedFlags;
        }

        void addEdge(int u, int v) {
            mnpEdges[u].push_back(v);
        }

        std::list<std::list<int> *> * findPaths(int iStart,
                int iDestination) {
            initAlreadyVisited();
            auto path = new int[miNodes];
            auto pnpi = new std::list<std::list<int> *>();
            recurse(iStart, iDestination, path, 0, pnpi);
            delete path;
            return pnpi;
        }
};

int main() {

    DirectedGraph dg(5);

    dg.addEdge(0, 1);
    dg.addEdge(0, 2);
    dg.addEdge(0, 3);
    dg.addEdge(1, 3);
    dg.addEdge(1, 4);
    dg.addEdge(2, 0);
    dg.addEdge(2, 1);
    dg.addEdge(4, 1);
    dg.addEdge(4, 3);

    int startingNode = 0;
    int destinationNode = 1;

    auto pnai = dg.findPaths(startingNode, destinationNode);

    std::cout
            << "Unique paths from "
            << startingNode
            << " to "
            << destinationNode
            << std::endl
            << std::endl;

    bool bFirst;
    std::list<int> * pi;
    auto it = pnai->begin();
    auto itBeyond = pnai->end();
    std::list<int>::iterator itInner;
    std::list<int>::iterator itInnerBeyond;
    while (it != itBeyond) {
        bFirst = true;
        pi = * it ++;
        itInner = pi->begin();
        itInnerBeyond = pi->end();
        while (itInner != itInnerBeyond) {
            if (bFirst)
                bFirst = false;
            else
                std::cout << ' ';
            std::cout << (* itInner ++);
        }
        std::cout << std::endl;
        delete pi;
    }

    delete pnai;

    return 0;
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!