Most efficient way to map function over numpy array

庸人自扰 2020-11-22 02:13

What is the most efficient way to map a function over a numpy array? The way I've been doing it in my current project is as follows:

import numpy as np 

x          


        
11 Answers
  • 2020-11-22 02:41

    Use numpy.fromfunction(function, shape, **kwargs)

    See "https://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfunction.html"
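
A minimal sketch of what this call does, assuming a made-up helper named index_product. Note that numpy.fromfunction hands the function whole arrays of index coordinates in a single call, so it builds an array from positions rather than mapping over the values of an existing array:

```python
import numpy as np

def index_product(i, j):
    # Hypothetical example function: i and j arrive as full
    # coordinate arrays of shape (3, 4), not as scalars.
    return i * j

# Build a 3x4 array whose entry at (i, j) is i * j.
table = np.fromfunction(index_product, (3, 4), dtype=int)
print(table)
```

Because the function is called once with arrays, it has to be written with array operations itself; a function that only handles scalars will not work here.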

  • 2020-11-22 02:44
    squares = squarer(x)
    

    Arithmetic operations on arrays are automatically applied elementwise, with efficient C-level loops that avoid all the interpreter overhead that would apply to a Python-level loop or comprehension.

    Most of the functions you'd want to apply to a NumPy array elementwise will just work, though some may need changes. For example, an if statement doesn't work elementwise. You'd want to convert functions that use one to constructs like numpy.where:

    def using_if(x):
        if x < 5:
            return x
        else:
            return x**2
    

    becomes

    import numpy

    def using_where(x):
        return numpy.where(x < 5, x, x**2)
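
To see the difference in practice, here is a small sketch (the sample array is made up for illustration) that calls both versions on an array. The if version raises ValueError because the truth value of a multi-element array is ambiguous, while the where version returns an array:

```python
import numpy as np

def using_if(x):
    if x < 5:
        return x
    else:
        return x**2

def using_where(x):
    return np.where(x < 5, x, x**2)

arr = np.array([1, 4, 5, 10])  # illustrative sample data

try:
    using_if(arr)
    if_error = None
except ValueError as exc:
    # "The truth value of an array with more than one element is ambiguous"
    if_error = exc

result = using_where(arr)
print(result)
```

One thing to keep in mind: numpy.where evaluates both x and x**2 for every element and then selects between them, so it is not a drop-in replacement when one branch is expensive or invalid for some inputs.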
    
  • 2020-11-22 02:48

    As mentioned in this post, just use generator expressions like so:

    numpy.fromiter((<some_func>(x) for x in <something>),<dtype>,<size of something>)
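
A filled-in version of the template above, as a sketch only: half_plus_one and values are hypothetical placeholders for <some_func> and <something>. Passing count lets NumPy allocate the output once instead of growing it:

```python
import numpy as np

def half_plus_one(v):
    # Hypothetical scalar function standing in for <some_func>.
    return v / 2 + 1

values = [0, 2, 4, 6]  # stands in for <something>

out = np.fromiter(
    (half_plus_one(v) for v in values),
    dtype=float,
    count=len(values),
)
print(out)
```

The function is still called once per element in Python, so this mainly saves the intermediate list, not the per-call overhead.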
    
  • 2020-11-22 02:55

    It seems no one has mentioned a built-in factory method for producing a ufunc in the numpy package: np.frompyfunc. I tested it against np.vectorize and it outperformed it by about 20~30%. Of course it will not perform as well as compiled C code or even numba (which I have not tested), but it can be a better alternative than np.vectorize.

    f = lambda x, y: x * y
    f_arr = np.frompyfunc(f, 2, 1)
    vf = np.vectorize(f)
    arr = np.linspace(0, 1, 10000)
    
    %timeit f_arr(arr, arr) # 307ms
    %timeit vf(arr, arr) # 450ms
    

    I have also tested larger samples, and the improvement is proportional. See also the documentation for np.frompyfunc.
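
One detail worth knowing when trying this, shown as a short sketch with a smaller made-up input: a ufunc created by np.frompyfunc always returns an array of dtype object, so you usually want to convert the result back to a numeric dtype before doing further arithmetic:

```python
import numpy as np

multiply = np.frompyfunc(lambda x, y: x * y, 2, 1)  # 2 inputs, 1 output
arr = np.linspace(0, 1, 5)  # small illustrative input

raw = multiply(arr, arr)
print(raw.dtype)  # object

# Convert back to a numeric dtype for fast follow-up operations.
as_float = raw.astype(float)
print(as_float.dtype)
```

The conversion adds a little time of its own, so include it when you benchmark against np.vectorize.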

  • 2020-11-22 02:56

    The answers above compare the options well, but here is the case where you need to map a custom function over a numpy.ndarray and retain the shape of the array.

    I have compared just two approaches, both of which retain the shape of the ndarray. I used an array with 1 million entries for the comparison. Here I use the square function, which is also built into NumPy and gives a great performance boost; it is only a stand-in because something was needed for the test, so you can use a function of your choice.

    import numpy, time
    def timeit():
        y = numpy.arange(1000000)
        # 1. list comprehension, then build a numpy array from the list
        now = time.time()
        numpy.array([x * x for x in y.reshape(-1)]).reshape(y.shape)
        print(time.time() - now)
        # 2. generator expression consumed by numpy.fromiter
        now = time.time()
        numpy.fromiter((x * x for x in y.reshape(-1)), y.dtype).reshape(y.shape)
        print(time.time() - now)
        # 3. built-in vectorized function
        now = time.time()
        numpy.square(y)
        print(time.time() - now)
    

    Output

    >>> timeit()
    1.162431240081787    # list comprehension and then building numpy array
    1.0775556564331055   # from numpy.fromiter
    0.002948284149169922 # using inbuilt function
    

    Here you can clearly see that numpy.fromiter is slightly faster than the simple list-comprehension approach, and that a built-in function, when one is available, is far faster than both, so please use that.
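
The shape-retaining pattern is easier to see on a multi-dimensional input. A minimal sketch, assuming a hypothetical scalar function cube_plus_one and a small 2x3 array: flatten, map, then reshape back to the original shape:

```python
import numpy as np

def cube_plus_one(v):
    # Hypothetical custom function; any scalar-to-scalar function works.
    return v ** 3 + 1

grid = np.arange(6).reshape(2, 3)

mapped = np.fromiter(
    (cube_plus_one(v) for v in grid.reshape(-1)),
    dtype=grid.dtype,
    count=grid.size,
).reshape(grid.shape)
print(mapped)
```

The dtype passed to fromiter has to be able to hold the function's results; reusing grid.dtype only works here because the function maps integers to integers.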
