Dijkstra algorithm optimization/caching

后端 未结 3 893

I have the following Dijkstra algorithm with 3 input variables (start, stop and time). It takes about 0.5-1s to complete. My hosting provider says it\'s using too muc

相关标签:
3条回答
  • 2020-12-20 04:16

    At a glance (you should really do some profiling, by the way), the culprit is the fact that you are executing a query for each graph node to find its neighbors:

    $result=mysql_query("SELECT route_id, next_stop FROM db_stop_times WHERE stop_id = $activeNode", $connection);
    

    If you have 1,700 nodes this is going to issue on the order of a thousand queries. Rather than hitting the database so often, cache these database results in something like memcached, and only fall back to the database on cache misses.

    0 讨论(0)
  • 2020-12-20 04:18

    Micro-optimisations, but don't do:

    for ($p = 0; $p < sizeof($nextStopArray); $p++) { 
       ...
    }
    

    calculate the sizeof($nextStopArray) before the loop, otherwise you're doing the count every iteration (and this value isn't being changed)

    $nextStopArraySize = sizeof($nextStopArray);
    for ($p = 0; $p < $nextStopArraySize; ++$p) { 
       ...
    }
    

    There's a couple of places where this should be changed.

    And if you're iterating several thousand times, ++$p is faster than $p++

    But profile the function... find out which parts are taking the longest to execute, and look to optimise those.

    EDIT

    Get rid of array_push_key as a function, simply execute it inline... it's costing you an unnecessary function call otherwise

    Build an array of all nodes from your database outside of the while(true) loop... retrieve all the data in a single SQL query and build a lookup array.

    Replacing

    for ($p = 0; $p < sizeof($nextStopArray); $p++) { 
    

    with

    $nextStopArraySize = sizeof($nextStopArray);
    $p = -1
    while (++$p < $nextStopArraySize) { 
       ...
    }
    

    may also prove faster still (just check that the logic does loop through the correct number of times).

    0 讨论(0)
  • 2020-12-20 04:23

    it's using too much resources

    Which resource? (CPU? Memory? Network bandwidth? I/O load on the database server?)

    while (true) {       
        $result=mysql_query("SELECT route_id, next_stop FROM db_stop_times WHERE stop_id = $activeNode", $connection);
    

    If I am reading this right you are doing a database call for every node in every pathfinding attempt. Each of these calls will block for a little while waiting for the response from the database. Even if you have a fast database, that's bound to take a couple milliseconds (unless the database is running on the same server as your code). So I'd venture the guess that most of your execution time is spent waiting for replies from the database.

    Moreover, should your database lack proper indexes, each query could do a full table scan ...

    The solution is straightforward: Load db_stop_times into memory at application startup, and use that in-memory representation when resolving neighbour nodes.

    Edit: Yes, an index on stop_id would be a proper index for this query. As for practical caching, I don't know PHP, but with something like Java (or C#, or C++, or even C) I'd use a representation of the form

    class Node {
        Link[] links;
    }
    
    class Link {
        int time;
        Node destination;
    }
    

    that would be a bit faster than memcached, but assumes you can comfortably fit the entire table in main memory. If you can't do that, I'd use a caching system like memcached.

    0 讨论(0)
提交回复
热议问题