Question:
I\'ve profiled my Python program to death, and there is one function that is slowing everything down. It uses Python dictionaries heavily, so
I would have posted this as an update to my question, but Stack Overflow only allows 30000 characters in questions, so I'm posting this as an answer.
Update: My best optimisations so far
I've taken on board people's suggestions, and now my code runs about 21% faster than before, which is good - thanks everyone!
This is the best I've managed to do so far. I've replaced all the ==
tests with is
for nodes, disabled garbage collection and re-written the big if
statement part at Line 388, in line with @Nas Banov's suggestions. I added in the well-known try/except
trick for avoiding tests (line 390 - to remove the test node_c in node_b_distances
), which helped loads, since it hardly ever throws the exception. I tried switching lines 391 and 392 around, and assigning node_b_distances[node_c]
to a variable, but this way was the quickest.
However, I still haven't tracked down the memory leak yet (see graph in my question). But I think this might be in a different part of my code (that I haven't posted here). If I can fix the memory leak, then this program will run quickly enough for me to use :)
Timer unit: 3.33366e-10 s
File: routing_distances.py
Function: propagate_distances_node at line 328
Total time: 760.74 s
Line # Hits Time Per Hit % Time Line Contents
328 @profile
329 def propagate_distances_node(self, node_a, cutoff_distance=200):
330
331 # a makes sure its immediate neighbours are correctly in its distance table
332 # because its immediate neighbours may change as binds/folding change
333 791349 4158169713 5254.5 0.2 for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
334 550522 2331886050 4235.8 0.1 use_neighbour_link = False
335
336 550522 2935995237 5333.1 0.1 if(node_b not in self.node_distances[node_a]): # a doesn't know distance to b
337 15931 68829156 4320.5 0.0 use_neighbour_link = True
338 else: # a does know distance to b
339 534591 2728134153 5103.2 0.1 (node_distance_b_a, next_node) = self.node_distances[node_a][node_b]
340 534591 2376374859 4445.2 0.1 if(node_distance_b_a > neighbour_distance_b_a): # neighbour distance is shorter
341 78 347355 4453.3 0.0 use_neighbour_link = True
342 534513 3145889079 5885.5 0.1 elif((None is next_node) and (float('+inf') == neighbour_distance_b_a)): # direct route that has just broken
343 74 327600 4427.0 0.0 use_neighbour_link = True
344
345 550522 2414669022 4386.1 0.1 if(use_neighbour_link):
346 16083 81850626 5089.3 0.0 self.node_distances[node_a][node_b] = (neighbour_distance_b_a, None)
347 16083 87064200 5413.4 0.0 self.nodes_changed.add(node_a)
348
349 ## Affinity distances update
350 16083 86580603 5383.4 0.0 if((node_a.type == Atom.BINDING_SITE) and (node_b.type == Atom.BINDING_SITE)):
351 234 6656868 28448.2 0.0 self.add_affinityDistance(node_a, node_b, self.chemistry.affinity(node_a.data, node_b.data))
352
353 # a sends its table to all its immediate neighbours
354 791349 4034651958 5098.4 0.2 for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].iteritems():
355 550522 2392248546 4345.4 0.1 node_b_changed = False
356
357 # b integrates a's distance table with its own
358 550522 2520330696 4578.1 0.1 node_b_chemical = node_b.chemical
359 550522 2734341975 4966.8 0.1 node_b_distances = node_b_chemical.node_distances[node_b]
360
361 # For all b's routes (to c) that go to a first, update their distances
362 46679347 222161837193 4759.3 9.7 for node_c, (distance_b_c, node_after_b) in node_b_distances.iteritems(): # Think it's ok to modify items while iterating over them (just not insert/delete) (seems to work ok)
363 46128825 211963639122 4595.0 9.3 if(node_after_b is node_a):
364
365 18677439 79225517916 4241.8 3.5 try:
366 18677439 101527287264 5435.8 4.4 distance_b_a_c = neighbour_distance_b_a + self.node_distances[node_a][node_c][0]
367 181510 985441680 5429.1 0.0 except KeyError:
368 181510 1166118921 6424.5 0.1 distance_b_a_c = float('+inf')
369
370 18677439 89626381965 4798.6 3.9 if(distance_b_c != distance_b_a_c): # a's distance to c has changed
371 692131 3352970709 4844.4 0.1 node_b_distances[node_c] = (distance_b_a_c, node_a)
372 692131 3066946866 4431.2 0.1 node_b_changed = True
373
374 ## Affinity distances update
375 692131 3808548270 5502.6 0.2 if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
376 96794 1655818011 17106.6 0.1 node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
377
378 # If distance got longer, then ask b's neighbours to update
379 ## TODO: document this!
380 18677439 88838493705 4756.5 3.9 if(distance_b_a_c > distance_b_c):
381 #for (node, neighbour_distance) in node_b_chemical.neighbours[node_b].iteritems():
382 1656796 7949850642 4798.3 0.3 for node in node_b_chemical.neighbours[node_b]:
383 1172486 6307264854 5379.4 0.3 node.chemical.nodes_changed.add(node)
384
385 # Look for routes from a to c that are quicker than ones b knows already
386 46999631 227198060532 4834.0 10.0 for node_c, (distance_a_c, node_after_a) in self.node_distances[node_a].iteritems():
387
388 46449109 218024862372 4693.8 9.6 if((node_after_a is not node_b) and # not a-b-a-b path
389 28049321 126269403795 4501.7 5.5 (node_c is not node_b)): # not a-b path
390 27768341 121588366824 4378.7 5.3 try: # Assume node_c in node_b_distances ('try' block will raise KeyError if not)
391 27768341 159413637753 5740.8 7.0 if((node_b_distances[node_c][1] is not node_a) and # b doesn't already go to a first
392 8462467 51890478453 6131.8 2.3 ((neighbour_distance_b_a + distance_a_c) < node_b_distances[node_c][0])):
393
394 # Found a route
395 224593 1168129548 5201.1 0.1 node_b_distances[node_c] = (neighbour_distance_b_a + distance_a_c, node_a)
396 ## Affinity distances update
397 224593 1274631354 5675.3 0.1 if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
398 32108 551523249 17177.1 0.0 node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
399 224593 1165878108 5191.1 0.1 node_b_changed = True
400
401 809945 4449080808 5493.1 0.2 except KeyError:
402 # b can't already get to c (node_c not in node_b_distances)
403 809945 4208032422 5195.5 0.2 if((neighbour_distance_b_a + distance_a_c) < cutoff_distance): # not too for to go
404
405 # These lines of code copied, for efficiency
406 # (most of the time, the 'try' block succeeds, so don't bother testing for (node_c in node_b_distances))
407 # Found a route
408 587726 3162939543 5381.7 0.1 node_b_distances[node_c] = (neighbour_distance_b_a + distance_a_c, node_a)
409 ## Affinity distances update
410 587726 3363869061 5723.5 0.1 if((node_b.type == Atom.BINDING_SITE) and (node_c.type == Atom.BINDING_SITE)):
411 71659 1258910784 17568.1 0.1 node_b_chemical.add_affinityDistance(node_b, node_c, self.chemistry.affinity(node_b.data, node_c.data))
412 587726 2706161481 4604.5 0.1 node_b_changed = True
413
414
415
416 # If any of node b's rows have exceeded the cutoff distance, then remove them
417 47267073 239847142446 5074.3 10.5 for node_c, (distance_b_c, node_after_b) in node_b_distances.items(): # Can't use iteritems() here, as deleting from the dictionary
418 46716551 242694352980 5195.0 10.6 if(distance_b_c > cutoff_distance):
419 200755 967443975 4819.0 0.0 del node_b_distances[node_c]
420 200755 930470616 4634.9 0.0 node_b_changed = True
421
422 ## Affinity distances update
423 200755 4717125063 23496.9 0.2 node_b_chemical.del_affinityDistance(node_b, node_c)
424
425 # If we've modified node_b's distance table, tell its chemical to update accordingly
426 550522 2684634615 4876.5 0.1 if(node_b_changed):
427 235034 1383213780 5885.2 0.1 node_b_chemical.nodes_changed.add(node_b)
428
429 # Remove any neighbours that have infinite distance (have just unbound)
430 ## TODO: not sure what difference it makes to do this here rather than above (after updating self.node_distances for neighbours)
431 ## but doing it above seems to break the walker's movement
432 791349 4367879451 5519.5 0.2 for (node_b, neighbour_distance_b_a) in self.neighbours[node_a].items(): # Can't use iteritems() here, as deleting from the dictionary
433 550522 2968919613 5392.9 0.1 if(neighbour_distance_b_a > cutoff_distance):
434 148 775638 5240.8 0.0 del self.neighbours[node_a][node_b]
435
436 ## Affinity distances update
437 148 2096343 14164.5 0.0 self.del_affinityDistance(node_a, node_b)