Optimising Python dictionary access code

迷失自我  2021-01-30 05:11

Question:

I've profiled my Python program to death, and there is one function that is slowing everything down. It uses Python dictionaries heavily, so

5 Answers
  •  时光说笑
    2021-01-30 05:41

    I don't see anything wrong with your code performance-wise (without trying to grok the algorithm); you are simply getting hit by the sheer number of iterations. Parts of your code get executed 40 million times!

    Notice how 80% of the time is spent in 20% of your code - those are the 13 lines that get executed 24+ million times. By the way, this code is a great illustration of the Pareto principle (or "20% of beer drinkers drink 80% of the beer").

    First things first: have you tried Psyco? It's a JIT compiler that can greatly speed up your code - given the large number of iterations, say by a factor of 4x-5x - and all you have to do (after downloading and installing, of course) is insert this snippet at the beginning:

    import psyco
    psyco.full()
    

    This is why I liked Psyco and used it in GCJ too, where time is of the essence - nothing to code, nothing to get wrong, and a sudden boost from two added lines.

    Back to nit-picking (which is what changes like replacing == with is amount to, given the small % time improvement each one yields). Here are the 13 lines "at fault":

    Line #      Hits          Time  Per Hit  % Time  Line Contents
       412  42350234  197075504439   4653.5     8.1  for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
       386  42076729  184216680432   4378.1     7.6  for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
       362  41756882  183992040153   4406.3     7.6  for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
       413  41838114  180297579789   4309.4     7.4  if(distance_b_c > cutoff_distance):
       363  41244762  172425596985   4180.5     7.1  if(node_after_b == node_a):
       389  41564609  172040284089   4139.1     7.1  if(node_c == node_b): # a-b path
       388  41564609  171150289218   4117.7     7.1  node_b_update = False
       391  41052489  169406668962   4126.6     7.0  elif(node_after_a == node_b): # a-b-a-b path
       405  41564609  164585250189   3959.7     6.8  if node_b_update:
       394  24004846  103404357180   4307.6     4.3  (distance_b_c, node_after_b) = node_b_distances[node_c]
       395  24004846  102717271836   4279.0     4.2  if(node_after_b != node_a): # b doesn't already go to a first
       393  24801082  101577038778   4095.7     4.2  elif(node_c in node_b_distances): # b can already get to c
    

    A) Besides the lines you mention, I notice that #388 has a relatively high time even though it is trivial: all it does is node_b_update = False. Ah, but wait - every time it executes, False gets looked up as a global (then builtin) name. To avoid that, assign F, T = False, True at the beginning of the method and replace later uses of False and True with the locals F and T. This should decrease the overall time, although only by a little (3%?). A sketch of the trick is below.
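    For illustration, a minimal sketch of the local-aliasing trick (the function and data here are made up, not from the original code):

    def count_updates(flags):
        F, T = False, True    # bind the builtins once as fast locals
        node_b_update = F     # LOAD_FAST from here on, not a global/builtin lookup
        for flag in flags:
            if flag:
                node_b_update = T
        return node_b_update

    print(count_updates([0, 1, 0]))  # True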

    B) I notice that the condition in #389 occurred "only" 512,120 times (based on the number of executions of #390), versus the condition in #391 with 16,251,407. Since there is no dependency between them, it makes sense to reverse the order of those checks - the earlier "cut" should give a small boost (2%?). I am not sure whether avoiding pass statements altogether helps, but here it is if it does not hurt readability:

    if (node_after_a is not node_b) and (node_c is not node_b):
        # neither a-b-a-b nor a-b path
        if (node_c in node_b_distances): # b can already get to c
            (distance_b_c, node_after_b) = node_b_distances[node_c]
            if (node_after_b is not node_a): # b doesn't already go to a first
                distance_b_a_c = neighbour_distance_b_a + distance_a_c
                if (distance_b_a_c < distance_b_c): # quicker to go via a
                    node_b_update = T
        else: # b can't already get to c
            distance_b_a_c = neighbour_distance_b_a + distance_a_c
            if (distance_b_a_c < cutoff_distance): # not too far to go
                node_b_update = T
    

    C) I just noticed you are using try-except in a case (#365-367) where you just need the default value from a dictionary - try using .get(key, defaultVal) instead, or create your dictionaries with collections.defaultdict and a callable factory, e.g. collections.defaultdict(lambda: float('+inf')). (Note that defaultdict needs a callable, so in Python 2 itertools.repeat(float('+inf')).next would also work, but not the repeat object itself.) Using try-except has its price - see how #365 reports 3.5% of the time; that's setting up stack frames and whatnot.
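    A hedged sketch of both alternatives (the dictionary contents here are invented; the real code maps nodes to (distance, via-node) pairs):

    import collections

    node_b_distances = {'x': (3.0, 'a')}  # hypothetical (distance, via-node) pairs

    # 1) .get() with a default tuple instead of try/except KeyError:
    distance_b_c, node_after_b = node_b_distances.get('y', (float('+inf'), None))

    # 2) a defaultdict whose factory supplies the default for missing keys
    #    (note: a defaultdict lookup inserts the missing key as a side effect):
    node_b_distances = collections.defaultdict(lambda: (float('+inf'), None),
                                               node_b_distances)
    distance_b_c, node_after_b = node_b_distances['y']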

    D) Avoid indexed access (be it with obj.field or obj[idx]) where possible. For example, I see you use self.node_distances[node_a] in multiple places (#336, 339, 346, 366, 386), which means every use costs two lookups (an attribute lookup for . and an item lookup for []) - and that gets expensive when executed tens of millions of times. It seems to me you can just do node_a_distances = self.node_distances[node_a] at the beginning of the method and use that from then on, as sketched below.
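    A sketch of that hoisting, with a made-up class standing in for yours (iteritems as in your Python 2 code):

    class Graph(object):  # hypothetical stand-in class
        def __init__(self, node_distances):
            self.node_distances = node_distances

        def total_distance(self, node_a):
            # One attribute lookup plus one item lookup, paid once up front,
            # instead of re-evaluating self.node_distances[node_a] in the loop.
            node_a_distances = self.node_distances[node_a]
            total = 0.0
            for node_c, (distance_a_c, node_after_a) in node_a_distances.iteritems():
                total += distance_a_c
            return total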
