Python vectorizing nested for loops

前端 未结 2 746
迷失自我
迷失自我 2020-11-28 07:12

I\'d appreciate some help in finding and understanding a pythonic way to optimize the following array manipulations in nested for loops:

def _func(a, b, radi         


        
相关标签:
2条回答
  • 2020-11-28 07:53

    Approach #1

    Here's a vectorized approach -

    m,n,r = volume.shape
    x,y,z = np.mgrid[0:m,0:n,0:r]
    X = x - roi[0]
    Y = y - roi[1]
    Z = z - roi[2]
    mask = X**2 + Y**2 + Z**2 < radius**2
    

    Possible improvement : We can probably speedup the last step with numexpr module -

    import numexpr as ne
    
    mask = ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
    

    Approach #2

    We can also gradually build the three ranges corresponding to the shape parameters and perform the subtraction against the three elements of roi on the fly without actually creating the meshes as done earlier with np.mgrid. This would be benefited by the use of broadcasting for efficiency purposes. The implementation would look like this -

    m,n,r = volume.shape
    vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
           ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
    mask = vals < radius**2
    

    Simplified version : Thanks to @Bi Rico for suggesting an improvement here as we can use np.ogrid to perform those operations in a bit more concise manner, like so -

    m,n,r = volume.shape    
    x,y,z = np.ogrid[0:m,0:n,0:r]-roi
    mask = (x**2+y**2+z**2) < radius**2
    

    Runtime test

    Function definitions -

    def vectorized_app1(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.mgrid[0:m,0:n,0:r]
        X = x - roi[0]
        Y = y - roi[1]
        Z = z - roi[2]
        return X**2 + Y**2 + Z**2 < radius**2
    
    def vectorized_app1_improved(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.mgrid[0:m,0:n,0:r]
        X = x - roi[0]
        Y = y - roi[1]
        Z = z - roi[2]
        return ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
    
    def vectorized_app2(volume, roi, radius):
        m,n,r = volume.shape
        vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
               ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
        return vals < radius**2
    
    def vectorized_app2_simplified(volume, roi, radius):
        m,n,r = volume.shape    
        x,y,z = np.ogrid[0:m,0:n,0:r]-roi
        return (x**2+y**2+z**2) < radius**2
    

    Timings -

    In [106]: # Setup input arrays  
         ...: volume = np.random.rand(90,110,100) # Half of original input sizes 
         ...: roi = np.random.rand(3)
         ...: radius = 3.4
         ...: 
    
    In [107]: %timeit _make_mask(volume, roi, radius)
    1 loops, best of 3: 41.4 s per loop
    
    In [108]: %timeit vectorized_app1(volume, roi, radius)
    10 loops, best of 3: 62.3 ms per loop
    
    In [109]: %timeit vectorized_app1_improved(volume, roi, radius)
    10 loops, best of 3: 47 ms per loop
    
    In [110]: %timeit vectorized_app2(volume, roi, radius)
    100 loops, best of 3: 4.26 ms per loop
    
    In [139]: %timeit vectorized_app2_simplified(volume, roi, radius)
    100 loops, best of 3: 4.36 ms per loop
    

    So, as always broadcasting showing its magic for a crazy almost 10,000x speedup over the original code and more than 10x better than creating meshes by using on-the-fly broadcasted operations!

    0 讨论(0)
  • 2020-11-28 07:56

    Say you first build an xyzy array:

    import itertools
    
    xyz = [np.array(p) for p in itertools.product(range(volume.shape[0]), range(volume.shape[1]), range(volume.shape[2]))]
    

    Now, using numpy.linalg.norm,

    np.linalg.norm(xyz - roi, axis=1) < radius
    

    checks whether the distance for each tuple from roi is smaller than radius.

    Finally, just reshape the result to the dimensions you need.

    0 讨论(0)
提交回复
热议问题