Convolution of two three-dimensional arrays with padding on one side is too slow

离开以前 2021-02-05 18:56

In my current project I need to "convolve" two three-dimensional arrays in a slightly unusual way:

Assume we have two three-dimensional arrays A and B

3 Answers
  •  -上瘾入骨i
    2021-02-05 19:13

    Have you tried using Numba? It's a package that lets you wrap Python code that is usually slow with a JIT compiler. I took a quick stab at your problem using Numba and got a significant speedup. Using IPython's %timeit magic function, the custom_convolution function took ~18 s, while the Numba-compiled function took 10.4 ms. That's a speedup of more than 1700x.
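
Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The helper name best_time and its parameters are hypothetical and not part of the original answer. The warm-up call is there so that one-time costs (for example, compilation on the first call of a lazily compiled function) are not counted.

```python
import timeit


def best_time(func, *args, repeat=3, number=1):
    """Return the best observed time in seconds for one call of func(*args)."""
    # Warm-up call: excludes one-time costs such as JIT compilation.
    func(*args)
    timer = timeit.Timer(lambda: func(*args))
    # Each entry is the total time for `number` calls; keep the fastest run.
    return min(timer.repeat(repeat=repeat, number=number)) / number
```

With the arrays from the answer, this would be called as best_time(custom_convolution, array_a, array_b) and best_time(fast_convolution, array_a, array_b).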

    Here's how to apply Numba to the function.

    import numpy as np
    from numba import jit, double
    
    s = 15
    array_a = np.random.rand(s ** 3).reshape(s, s, s)
    array_b = np.random.rand(s ** 3).reshape(s, s, s)
    
    # Original code
    def custom_convolution(A, B):
    
        dimA = A.shape[0]
        dimB = B.shape[0]
        dimC = dimA + dimB
    
        C = np.zeros((dimC, dimC, dimC))
        for x1 in range(dimA):
            for x2 in range(dimB):
                for y1 in range(dimA):
                    for y2 in range(dimB):
                        for z1 in range(dimA):
                            for z2 in range(dimB):
                                x = x1 + x2
                                y = y1 + y2
                                z = z1 + z2
                                C[x, y, z] += A[x1, y1, z1] * B[x2, y2, z2]
        return C
    
    # Compile the function with Numba's JIT compiler, using an explicit type signature
    fast_convolution = jit(double[:, :, :](double[:, :, :],
                            double[:, :, :]))(custom_convolution)
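
To make clear what the six nested loops compute: for each element of A, a copy of B scaled by that element is added into C at that element's offset. A minimal NumPy-only sketch of this view follows. The name sliced_convolution is hypothetical and not from the original answer, and the sketch assumes cubic inputs, as custom_convolution does.

```python
import numpy as np


def sliced_convolution(A, B):
    """Same result as custom_convolution, with the three inner loops as slices."""
    dimA = A.shape[0]
    dimB = B.shape[0]
    dimC = dimA + dimB  # same output size as custom_convolution

    C = np.zeros((dimC, dimC, dimC))
    for x1 in range(dimA):
        for y1 in range(dimA):
            for z1 in range(dimA):
                # Add B, scaled by A[x1, y1, z1], at offset (x1, y1, z1).
                C[x1:x1 + dimB, y1:y1 + dimB, z1:z1 + dimB] += A[x1, y1, z1] * B
    return C
```

This is only an illustration of the computation; no timing claim is made for it here.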
    

    If you compute the residual between the results of the two functions, you get zero everywhere. This means the JIT-compiled version reproduces the result of the original function.

    slow_result = custom_convolution(array_a, array_b) 
    fast_result = fast_convolution(array_a, array_b)
    
    print(np.max(np.abs(slow_result - fast_result)))
    

    The output I get for this is 0.0.

    You can either install Numba into your current Python setup or try it quickly with the AnacondaCE package from continuum.io.

    Last but not least, Numba's function is faster than the scipy.signal.fftconvolve function by a factor of a few.
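
For reference, an FFT-based version of the same convolution can be sketched with NumPy alone. This uses np.fft rather than scipy.signal.fftconvolve, and the name fft_convolution is hypothetical. It assumes cubic inputs and zero-pads both arrays to dimA + dimB along each axis, so the output has the same shape as the result of custom_convolution.

```python
import numpy as np


def fft_convolution(A, B):
    """Linear convolution via zero-padded real FFTs (cubic inputs assumed)."""
    dimC = A.shape[0] + B.shape[0]  # same output size as custom_convolution
    shape = (dimC, dimC, dimC)
    axes = (0, 1, 2)

    # Padding to dimC >= dimA + dimB - 1 keeps the circular convolution
    # computed by the FFT from wrapping around.
    fa = np.fft.rfftn(A, s=shape, axes=axes)
    fb = np.fft.rfftn(B, s=shape, axes=axes)
    return np.fft.irfftn(fa * fb, s=shape, axes=axes)
```

Because the FFT works in floating point, its result should be compared against the loop version with a tolerance (for example np.allclose) rather than by expecting an exact 0.0 residual.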

    Note: I'm using Anaconda and not AnacondaCE. Numba's performance differs somewhat between the two packages, but I don't think the difference will be large.
