Efficient distance-like matrix computation (manual metric function)

醉话见心 2021-01-07 13:02

I want to compute a "distance" matrix, similarly to scipy.spatial.distance.cdist, but using the intersection over union (IoU) between "bounding boxes" (4-dimensional vec

1 answer
  • 2021-01-07 13:20

    If you want a performant solution, you could use Cython or Numba. With either, it is quite straightforward to outperform your cdist approach by 3 orders of magnitude.

    Template function

    For other distance functions, such as the Minkowski distance, I wrote an answer a month ago.

    import numpy as np
    import numba as nb
    from scipy.spatial.distance import cdist

    def gen_cust_dist_func(kernel, parallel=True):
        # Compile the user-supplied kernel so it can be called from compiled code
        kernel_nb = nb.njit(kernel, fastmath=True)

        def cust_dot_T(A, B):
            # Both inputs must have the same number of columns
            assert B.shape[1] == A.shape[1]

            out = np.empty((A.shape[0], B.shape[0]), dtype=np.float64)
            # prange parallelizes the outer loop when parallel=True;
            # with parallel=False it behaves like range
            for i in nb.prange(A.shape[0]):
                for j in range(B.shape[0]):
                    out[i, j] = kernel_nb(A[i, :], B[j, :])
            return out

        # Compile the pairwise loop, parallel or serial as requested
        return nb.njit(cust_dot_T, fastmath=True, parallel=parallel)
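
To make the contract of the template explicit, here is a minimal uncompiled sketch of the same pairwise pattern: out[i, j] = kernel(A[i], B[j]). The names pairwise_reference and manhattan are hypothetical and only for illustration; the sketch uses no Numba, so it is slow, but it can serve as a reference when checking a compiled function on small inputs.

```python
import numpy as np

def pairwise_reference(kernel, A, B):
    """Uncompiled reference: out[i, j] = kernel(A[i], B[j])."""
    assert A.shape[1] == B.shape[1]
    out = np.empty((A.shape[0], B.shape[0]), dtype=np.float64)
    for i in range(A.shape[0]):
        for j in range(B.shape[0]):
            out[i, j] = kernel(A[i, :], B[j, :])
    return out

def manhattan(u, v):
    # Hypothetical example kernel: sum of absolute coordinate differences
    total = 0.0
    for k in range(u.shape[0]):
        total += abs(u[k] - v[k])
    return total

A = np.array([[0.0, 0.0], [1.0, 2.0]])
B = np.array([[1.0, 1.0], [3.0, 0.0], [0.0, 0.0]])

# One row per row of A, one column per row of B
D = pairwise_reference(manhattan, A, B)
print(D)
```

Any kernel written in this scalar, loop-based style (two 1-D arrays in, one number out) should in principle also be usable as the kernel argument of the template above.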
    

    Example with your custom function

    def compute_iou(bbox_a, bbox_b):
        xA = max(bbox_a[0], bbox_b[0])
        yA = max(bbox_a[1], bbox_b[1])
        xB = min(bbox_a[2], bbox_b[2])
        yB = min(bbox_a[3], bbox_b[3])
    
        interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)
        boxAArea = (bbox_a[2] - bbox_a[0] + 1) * (bbox_a[3] - bbox_a[1] + 1)
        boxBArea = (bbox_b[2] - bbox_b[0] + 1) * (bbox_b[3] - bbox_b[1] + 1)
    
        iou = interArea / float(boxAArea + boxBArea - interArea)
    
        return iou
    
    # generate the custom distance function
    cust_dist=gen_cust_dist_func(compute_iou,parallel=True)
    
    
    A_bboxes = np.array([[0, 0, 10, 10], [5, 5, 15, 15]]*100)
    B_bboxes = np.array([[1, 1, 11, 11], [4, 4, 13, 13], [9, 9, 13, 13]]*1000)
    
    %timeit cust_dist(A_bboxes,B_bboxes)
    #1.74 ms ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit cdist(A_bboxes, B_bboxes, lambda u, v: compute_iou(u, v))
    #3.33 s ± 11.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
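
If Numba is not an option, the same IoU matrix can also be computed with plain NumPy broadcasting. The sketch below is an alternative technique, not the Numba approach timed above, and iou_matrix is a hypothetical name. It assumes boxes are given as [x1, y1, x2, y2] with the same inclusive "+1" convention as compute_iou, and it has not been benchmarked here.

```python
import numpy as np

def iou_matrix(A, B):
    """Pairwise IoU between rows of A (n, 4) and rows of B (m, 4)."""
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)

    # Broadcast (n, 1) against (1, m) so every pair of boxes is compared
    xA = np.maximum(A[:, None, 0], B[None, :, 0])
    yA = np.maximum(A[:, None, 1], B[None, :, 1])
    xB = np.minimum(A[:, None, 2], B[None, :, 2])
    yB = np.minimum(A[:, None, 3], B[None, :, 3])

    # Negative extents mean the boxes do not overlap, so clip them to 0
    inter = np.clip(xB - xA + 1, 0, None) * np.clip(yB - yA + 1, 0, None)

    area_a = (A[:, 2] - A[:, 0] + 1) * (A[:, 3] - A[:, 1] + 1)
    area_b = (B[:, 2] - B[:, 0] + 1) * (B[:, 3] - B[:, 1] + 1)

    return inter / (area_a[:, None] + area_b[None, :] - inter)

A_bboxes = np.array([[0, 0, 10, 10], [5, 5, 15, 15]])
B_bboxes = np.array([[1, 1, 11, 11], [4, 4, 13, 13], [9, 9, 13, 13]])

ious = iou_matrix(A_bboxes, B_bboxes)
print(ious.shape)  # (2, 3)
```

The broadcast version allocates several intermediate (n, m) arrays, so for large inputs the compiled loop may still be preferable on memory grounds.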
    