Dictionary best data structure for train routes?

后端 未结 6 1121
北海茫月
北海茫月 2020-12-02 01:08

So I\'ve been tasked with essentially reading in a file (notepad file) that has a bunch of train stops and the time it takes to get from one stop to another. For example it

相关标签:
6条回答
  • 2020-12-02 01:28

    A dict is perfectly acceptable in this situation. In fact, if a database is unavailable to you and you're interested in reusing this dictionary in the future, you could pickle the object and use it in another script:

    import pickle
    
    output = open('my_pickled_dict', 'wb')
    pickle.dumb(my_dict, output)
    output.close
    

    And then in your next script:

    import pickle
    
    my_pickled_file = open('my_pickled_dict', 'rb')
    my_dict = pickle.load(my_pickled_file)
    
    0 讨论(0)
  • 2020-12-02 01:29

    A Dictionary is a set of values stored by the hashes of a set keys. The advantage of using a dictionary is the time needed to locate a specific record regardless of size is fixed (often called O(1)). Alternatively if you used a list, the time needed to locate a record would be equal to the time it takes to read each element (often called O(n) where n equals the number of elements in the list).

    0 讨论(0)
  • 2020-12-02 01:30

    I will go against the grain -- and say that a straight flat dict is not the best for this.

    Let's say you have 100 stops and multiple routes that are non-alphabetical and non-numeric. Think the Paris subway:

    Paris Subway

    Now try and use a straight Python dict to calculate the time between FDR and La Fourche? That involves two or more different routes and multiple options.

    A tree or some form of graph is a better structure. A dict is fabulous for a 1 to 1 mapping; tree are better for a rich description of nodes that relate to each other. You would then use something like Dijkstra's Algorithm to navigate it.

    Since a nested dict of dicts or dict of lists IS a graph, it is easy to come up with a recursive example:

    def find_all_paths(graph, start, end, path=[]):
            path = path + [start]
            if start == end:
                return [path]
            if start not in graph:
                return []
            paths = []
            for node in graph[start]:
                if node not in path:
                    newpaths = find_all_paths(graph, node, end, path)
                    for newpath in newpaths:
                        paths.append(newpath)
            return paths       
    
    def min_path(graph, start, end):
        paths=find_all_paths(graph,start,end)
        mt=10**99
        mpath=[]
        print '\tAll paths:',paths
        for path in paths:
            t=sum(graph[i][j] for i,j in zip(path,path[1::]))
            print '\t\tevaluating:',path, t
            if t<mt: 
                mt=t
                mpath=path
    
        e1=' '.join('{}->{}:{}'.format(i,j,graph[i][j]) for i,j in zip(mpath,mpath[1::]))
        e2=str(sum(graph[i][j] for i,j in zip(mpath,mpath[1::])))
        print 'Best path: '+e1+'   Total: '+e2+'\n'  
    
    if __name__ == "__main__":
        graph = {'A': {'B':5, 'C':4},
                 'B': {'C':3, 'D':10},
                 'C': {'D':12},
                 'D': {'C':5, 'E':9},
                 'E': {'F':8},
                 'F': {'C':7}}
        min_path(graph,'A','E')
        min_path(graph,'A','D')
        min_path(graph,'A','F')
    

    Prints:

        All paths: [['A', 'C', 'D', 'E'], ['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'D', 'E']]
            evaluating: ['A', 'C', 'D', 'E'] 25
            evaluating: ['A', 'B', 'C', 'D', 'E'] 29
            evaluating: ['A', 'B', 'D', 'E'] 24
    Best path: A->B:5 B->D:10 D->E:9   Total: 24
    
        All paths: [['A', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'D']]
            evaluating: ['A', 'C', 'D'] 16
            evaluating: ['A', 'B', 'C', 'D'] 20
            evaluating: ['A', 'B', 'D'] 15
    Best path: A->B:5 B->D:10   Total: 15
    
        All paths: [['A', 'C', 'D', 'E', 'F'], ['A', 'B', 'C', 'D', 'E', 'F'], ['A', 'B', 'D', 'E', 'F']]
            evaluating: ['A', 'C', 'D', 'E', 'F'] 33
            evaluating: ['A', 'B', 'C', 'D', 'E', 'F'] 37
            evaluating: ['A', 'B', 'D', 'E', 'F'] 32
    Best path: A->B:5 B->D:10 D->E:9 E->F:8   Total: 32
    
    0 讨论(0)
  • 2020-12-02 01:33

    A dictionary is entirely suitable for looking up the value for a particular stop. This doesn't store any information about the order of the stops, however, so you'd likely need a separate list for that anyway. I'm assuming these times are the delay between adjacent stops only, so if you need to calculate the total time to get between arbitrary pairs of stops you might actually find a list of 2-tuples more convenient than a list and dictionary:

    train_times = [("A", 0), ("B", 15), ("C", 12), ("D", 9)]
    

    Note: the first time will always presumably be zero because there's no previous stop to time from. Alternatively you could make the final time zero and store the value against the previous stop, but I've assumed the first case here.

    This allows you to calculate the total time from A to C quite simply:

    def get_total_time(start_stop, end_stop):
        total_time = None
        for stop, this_time in train_times:
            if total_time is None and stop == start_stop:
                total_time = 0
            if total_time is not None:
                total_time += this_time
                if stop == end_stop:
                    return total_time
        raise Exception("stop not found")
    
    print "Time from A to C: %d" % (get_total_time("A", "C"),)
    

    This could be made more efficient combining a list and a dictionary, but it doesn't make a lot of difference unless the list is very long (at least hundreds of stops).

    Also, things get more complicated once you introduce lots of train lines which link up with each other. In this case you could use any number of data structures - one simple example would be a dictionary where the key is the stop and the value is a list of the adjacent stops on all lines and the times to them. However, finding a route through this requires some slightly non-trivial (but pretty well known) algorithms which are beyond the scope of this question.

    In either case, if you need to look up these values quickly you could consider pre-calculating the time between all pairs of stops - this isknown as a the transitive closure of a graph (where "graph" in this sense is a network of nodes, not a line chart or similar).

    0 讨论(0)
  • 2020-12-02 01:34

    A dictionary could be good for this, yes. But if you need to keep track of what stop comes next to the other, you might need something else, like an ordered dict, or a small class for a Stop that defines the next one (and whatever other data you'd like to keep about the stop), or even a list of stops, as it keeps the order.

    Depending on what kind of access you need or what you want to do with this data, any of these could work for you.

    Good luck.

    0 讨论(0)
  • 2020-12-02 01:43

    A dictionary is suitable for doing this.

    It makes it easy for you to access the time (= value) for a given bus stop (= key).

    time_per_stop = {"Stop A": 15, "Stop B": 12, "Stop C": 9}
    

    Of course, in case you have a finite and small amount of bus stops, you could also just keep a list or tuple of stop times.

    time_per_stop_list = [15, 12, 9]
    
    0 讨论(0)
提交回复
热议问题