Optimized dot product in Python

醉梦人生 2020-12-30 07:29

The dot product of two n-dimensional vectors u=[u1,u2,...,un] and v=[v1,v2,...,vn] is given by u1*v1 + u2*v2 + ... + un*vn.
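
As a concrete instance of the formula, here is a minimal sketch, assuming Python 3 and plain lists of numbers. The name `dot` is only illustrative and does not come from the question.

```python
def dot(u, v):
    """Return u1*v1 + u2*v2 + ... + un*vn for two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = 0
    for ui, vi in zip(u, v):
        total += ui * vi
    return total


# 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
print(dot([1, 2, 3], [4, 5, 6]))  # prints 32
```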

4 Answers
  • 2020-12-30 07:35

    Please benchmark this "d2a" function, and compare it to your "d3" function.

    from itertools import imap  # Python 2 only; starmap was imported but never used
    from operator import mul
    
    def d2a(v1,v2):
        """
        d2a uses itertools.imap
        """
        check(v1,v2)
        return sum(imap(mul, v1, v2))
    

    On Python 2.x (which I assume you are using), map builds a complete intermediate list before sum ever runs; imap yields the products one at a time instead.
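
    For readers on Python 3, where `itertools.imap` no longer exists, the same idea can be sketched with the built-in `map`, which is already lazy. The name `d2a_py3` is hypothetical, and the length test only stands in for the question's `check` helper, whose body is not shown here.

```python
from operator import mul


def d2a_py3(v1, v2):
    """Lazy dot product: map() yields each product on demand in Python 3."""
    # Stand-in for the question's check(); assumed to compare lengths.
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    # map(mul, v1, v2) returns an iterator, so no intermediate list is built.
    return sum(map(mul, v1, v2))


print(d2a_py3([1, 2, 3], [4, 5, 6]))  # prints 32
```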

  • 2020-12-30 07:42

    Here is a comparison with numpy: the fast starmap approach versus numpy.dot.

    First, iteration over normal Python lists:

    $ python -mtimeit "import numpy as np; r = range(100)" "np.dot(r,r)"
    10 loops, best of 3: 316 usec per loop
    
    $ python -mtimeit "import operator; r = range(100); from itertools import izip, starmap" "sum(starmap(operator.mul, izip(r,r)))"
    10000 loops, best of 3: 81.5 usec per loop
    

    Then numpy ndarray:

    $ python -mtimeit "import numpy as np; r = np.arange(100)" "np.dot(r,r)"
    10 loops, best of 3: 20.2 usec per loop
    
    $ python -mtimeit "import operator; import numpy as np; r = np.arange(100); from itertools import izip, starmap;" "sum(starmap(operator.mul, izip(r,r)))"
    10 loops, best of 3: 405 usec per loop
    

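    Based on these timings, numpy.dot on numpy arrays is fastest (20.2 usec), followed by the itertools approach on plain lists (81.5 usec). Mixing the two is slow in both directions: numpy.dot on a list took 316 usec, and starmap over an ndarray took 405 usec.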
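
    The same comparison can be scripted with the `timeit` module instead of the command line. This is only a sketch, assuming Python 3 (so `zip` replaces `izip`); the helper names `dot_starmap`, `best_usec` and `run_benchmarks` are made up for illustration, and the numpy cases are skipped if numpy is not installed. No timings are shown because they depend on the machine and library versions.

```python
import timeit
from itertools import starmap
from operator import mul

try:
    import numpy as np  # optional: numpy cases are skipped if unavailable
except ImportError:
    np = None


def dot_starmap(a, b):
    """Pure-Python dot product, the Python 3 form of the starmap/izip one-liner."""
    return sum(starmap(mul, zip(a, b)))


def best_usec(func, a, b, number=1000, repeat=3):
    """Best-of-`repeat` time per call, in microseconds."""
    timer = timeit.Timer(lambda: func(a, b))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


def run_benchmarks(n=100):
    """Time every function/input combination; returns {label: usec per call}."""
    inputs = {"list": list(range(n))}
    funcs = {"starmap": dot_starmap}
    if np is not None:
        inputs["ndarray"] = np.arange(n)
        funcs["np.dot"] = np.dot
    return {
        f"{fname} on {iname}": best_usec(func, data, data)
        for fname, func in funcs.items()
        for iname, data in inputs.items()
    }


if __name__ == "__main__":
    for label, usec in run_benchmarks().items():
        print(f"{label:20s} {usec:10.2f} usec per call")
```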

  • 2020-12-30 07:49

    I don't know about faster, but I'd suggest:

    sum(i*j for i, j in zip(v1, v2))
    

    It's much easier to read and doesn't even require any standard-library imports.
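
    One caveat worth illustrating: `zip` stops at the shorter input, so this one-liner silently ignores extra elements when the vectors differ in length. A minimal sketch, assuming Python 3; `dot_unchecked` and `dot_checked` are hypothetical names, and the length test is only a guess at what the question's `check` does.

```python
def dot_unchecked(v1, v2):
    # zip() stops at the shorter input, so a trailing element is dropped silently.
    return sum(i * j for i, j in zip(v1, v2))


def dot_checked(v1, v2):
    # Fail loudly on mismatched lengths instead of returning a partial sum.
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    return sum(i * j for i, j in zip(v1, v2))


print(dot_unchecked([1, 2, 3], [4, 5]))  # prints 14 (1*4 + 2*5; the 3 is ignored)
print(dot_checked([1, 2, 3], [4, 5, 6]))  # prints 32
```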

  • 2020-12-30 07:54

    Just for fun I wrote a "d4" which uses numpy:

    from numpy import dot
    def d4(v1, v2): 
        check(v1, v2)
        return dot(v1, v2)
    

    My results (Python 2.5.1, XP Pro sp3, 2GHz Core2 Duo T7200):

    d0 elapsed:  12.1977242918
    d1 elapsed:  13.885232341
    d2 elapsed:  13.7929552499
    d3 elapsed:  11.0952246724
    d4 elapsed:  56.3278584289 # go numpy!

    And, for even more fun, I turned on psyco:

    d0 elapsed:  0.965477735299
    d1 elapsed:  12.5354792299
    d2 elapsed:  12.9748163524
    d3 elapsed:  9.78255448667
    d4 elapsed:  54.4599059378

    Based on that, I declare d0 the winner :)


    Update

    @kaiser.se: I probably should have mentioned that I did convert everything to numpy arrays first:

    import timeit
    from numpy import array
    v3 = [array(vec) for vec in v1]
    v4 = [array(vec) for vec in v2]
    
    # then
    t4 = timeit.Timer("d4(v3,v4)","from dot_product import d4,v3,v4")
    

    And I included check(v1, v2) since it's included in the other tests. Leaving it off would give numpy an unfair advantage (though it looks like it could use one). The array conversion shaved off about a second (much less than I thought it would).

    All of my tests were run with N=50.

    @nikow: I'm using numpy 1.0.4, which is undoubtedly old; it's certainly possible that performance has improved in the year and a half since I installed it.


    Update #2

    @kaiser.se Wow, you are totally right. I must have been thinking that these were lists of lists or something (I really have no idea what I was thinking ... +1 for pair programming).

    How does this look:

    v3 = array(v1)
    v4 = array(v2)
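
    To make the difference between the two conversions visible, here is a small sketch, assuming Python 3 with numpy installed and assuming each vector is a flat list of numbers (the data below is illustrative, not the benchmark input).

```python
import numpy as np

v1 = [1.0, 2.0, 3.0]  # illustrative data: a flat list of numbers

# First attempt: one 0-dimensional array per element, still wrapped in a list.
per_element = [np.array(x) for x in v1]
print(type(per_element).__name__, per_element[0].shape)  # prints: list ()

# Update #2: a single 1-D array holding the whole vector.
whole = np.array(v1)
print(type(whole).__name__, whole.shape)  # prints: ndarray (3,)

# Both give the same value, but np.dot must first rebuild an array from the
# list on every call, while the ndarray is used as-is.
print(np.dot(per_element, per_element) == np.dot(whole, whole))  # prints: True
```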
    

    New results:

    d4 elapsed:  3.22535741274
    

    With Psyco:

    d4 elapsed:  2.09182619579
    

    d0 still wins with Psyco, but numpy is probably better overall, especially with larger data sets.

    Yesterday I was a bit bothered by my slow numpy result, since numpy is presumably used for a lot of computation and has had a lot of optimization. Obviously, though, not bothered enough to check my result :)
