Question:
I've profiled my Python program to death, and there is one function that is slowing everything down. It uses Python dictionaries heavily, so
Answer:
node_after_b == node_a
will try to call node_after_b.__eq__(node_a):
>>> class B(object):
...     def __eq__(self, other):
...         print "B.__eq__()"
...         return False
...
>>> class A(object):
...     def __eq__(self, other):
...         print "A.__eq__()"
...         return False
...
>>> a = A()
>>> b = B()
>>> a == b
A.__eq__()
False
>>> b == a
B.__eq__()
False
>>>
Try to override Node.__eq__() with an optimized version before resorting to C.
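For example, here is a minimal sketch of what an optimized Node.__eq__() could look like. It is only an illustration: the real Node class is not shown in the question, so the single id attribute used below is an assumption. Comparing one cheap field keeps the method short, returning NotImplemented for foreign types lets Python fall back to the reflected comparison, and defining __hash__ (and, in Python 2, __ne__) keeps the instances consistent as dictionary keys:

class Node(object):
    __slots__ = ('id',)              # optional: avoids a per-instance __dict__

    def __init__(self, id):
        self.id = id

    def __eq__(self, other):
        # compare on a single cheap attribute instead of walking both objects
        if isinstance(other, Node):
            return self.id == other.id
        return NotImplemented        # let Python try the reflected comparison

    def __ne__(self, other):         # Python 2 does not derive __ne__ from __eq__
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash(self.id)         # keep Node usable as a dict key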
UPDATE
I made this little experiment (python 2.6.6):
#!/usr/bin/env python
# test.py
class A(object):
    def __init__(self, id):
        self.id = id
class B(A):
    def __eq__(self, other):
        return self.id == other.id

@profile
def main():
    list_a = []
    list_b = []
    for x in range(100000):
        list_a.append(A(x))
        list_b.append(B(x))

    ob_a = A(1)
    ob_b = B(1)
    for ob in list_a:
        if ob == ob_a:
            x = True
        if ob is ob_a:
            x = True
        if ob.id == ob_a.id:
            x = True
        if ob.id == 1:
            x = True
    for ob in list_b:
        if ob == ob_b:
            x = True
        if ob is ob_b:
            x = True
        if ob.id == ob_b.id:
            x = True
        if ob.id == 1:
            x = True

if __name__ == '__main__':
    main()
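(The @profile decorator is not defined anywhere in the script: it is injected by the kernprof runner that ships with the line_profiler package. If I remember correctly, output like the one below is produced by running something along the lines of
kernprof.py -l -v test.py
with that era's line_profiler; newer releases install the command as plain kernprof.)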
Results:
Timer unit: 1e-06 s
File: test.py
Function: main at line 10
Total time: 5.52964 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def main():
    12         1            5      5.0      0.0      list_a = []
    13         1            3      3.0      0.0      list_b = []
    14    100001       360677      3.6      6.5      for x in range(100000):
    15    100000       763593      7.6     13.8          list_a.append(A(x))
    16    100000       924822      9.2     16.7          list_b.append(B(x))
    17
    18         1           14     14.0      0.0      ob_a = A(1)
    19         1            5      5.0      0.0      ob_b = B(1)
    20    100001       500454      5.0      9.1      for ob in list_a:
    21    100000       267252      2.7      4.8          if ob == ob_a:
    22                                                       x = True
    23    100000       259075      2.6      4.7          if ob is ob_a:
    24                                                       x = True
    25    100000       539683      5.4      9.8          if ob.id == ob_a.id:
    26         1            3      3.0      0.0              x = True
    27    100000       271519      2.7      4.9          if ob.id == 1:
    28         1            3      3.0      0.0              x = True
    29    100001       296736      3.0      5.4      for ob in list_b:
    30    100000       472204      4.7      8.5          if ob == ob_b:
    31         1            4      4.0      0.0              x = True
    32    100000       283165      2.8      5.1          if ob is ob_b:
    33                                                       x = True
    34    100000       298839      3.0      5.4          if ob.id == ob_b.id:
    35         1            3      3.0      0.0              x = True
    36    100000       291576      2.9      5.3          if ob.id == 1:
    37         1            3      3.0      0.0              x = True
I was very surprised by the numbers: plain attribute access is comparatively expensive (ob.id == ob_a.id at line 25 is the slowest test in the loop), the user-defined B.__eq__ (line 30) costs nearly twice as much per hit as the default == on A instances (line 21), and 'is' is barely cheaper than the default ==.
Then I tried with more complex objects, and the results were consistent with this first experiment.
Are you swapping a lot? If your dataset is so large that it does not fit in the available RAM, I guess you may be experiencing some kind of I/O contention from virtual-memory paging.
Are you running Linux? If so, could you post the vmstat output of your machine while your program is running? Send us the output of something like:
vmstat 10 100
(the si/so columns will show whether the machine is swapping).
Good luck!
UPDATE (from comments by OP)
I suggested playing with sys.setcheckinterval and enabling/disabling the GC. The rationale is that for this particular case (a huge number of instances) the default GC reference-count check is somewhat expensive and its default interval kicks in way too often.
Yes, I had previously played with sys.setcheckinterval. I changed it to 1000 (from its default of 100), but it didn't make any measurable difference. Disabling garbage collection has helped - thanks. This has been the biggest speedup so far - saving about 20% (171 minutes for the whole run, down to 135 minutes). I'm not sure what the error bars are on that, but it must be a statistically significant improvement. – Adam Nellis Feb 9 at 15:10
My guess:
I think the Python GC is based on reference counting. From time to time it will check the reference count of every instance; since you are traversing these huge in-memory structures, in your particular case the default GC frequency (every 1000 cycles?) kicks in way too often - a huge waste. – Yours Truly Feb 10 at 2:06
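For reference, here is a minimal sketch of the "switch the cyclic GC off around the hot section" idea from the comments above. The build_and_scan function is just a placeholder for the OP's real dictionary-heavy workload; gc.disable(), gc.enable() and gc.set_threshold() are standard-library calls, and reference counting keeps freeing unreferenced objects even while the cyclic collector is paused:

import gc

def build_and_scan():
    # placeholder for the real dictionary-heavy workload
    table = dict((i, str(i)) for i in range(100000))
    return sum(1 for k in table if table[k] == str(k))

def run_hot_section():
    gc.disable()                     # pause the cyclic collector only
    try:
        return build_and_scan()
    finally:
        gc.enable()                  # restore normal collection afterwards

A gentler alternative is to make collections rarer rather than turning them off completely, e.g. calling gc.set_threshold(100000) before entering the loop.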