How can the Euclidean distance be calculated with NumPy?

春和景丽 2020-11-22 02:29

I have two points in 3D:

(xa, ya, za)
(xb, yb, zb)

And I want to calculate the distance:

dist = sqrt((xa-xb)^2 + (ya-yb)^2 + (za-zb)^2)

22 Answers
  •  遥遥无期
    2020-11-22 02:37

    For anyone interested in computing multiple distances at once, I've done a little comparison using perfplot (a small project of mine).

    The first piece of advice is to organize your data so that the arrays have shape (3, n) (and are C-contiguous, obviously). If the summation runs over the first dimension of such an array, things are faster, and it doesn't matter much whether you use sqrt-sum with axis=0, linalg.norm with axis=0, or

    a_min_b = a - b
    numpy.sqrt(numpy.einsum('ij,ij->j', a_min_b, a_min_b))
    

    which is, by a slight margin, the fastest variant. (That actually holds true for just one row as well.)
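
    For the single pair of points from the question, the same idea reduces to a 1-D case. The following is only a minimal sketch: the coordinates are made up for illustration, and the einsum subscripts "i,i->" are the 1-D counterpart of the expression above, not something taken from the benchmark.

```python
import numpy

# Hypothetical coordinates (xa, ya, za) and (xb, yb, zb), chosen for illustration.
a = numpy.array([1.0, 2.0, 3.0])
b = numpy.array([4.0, 6.0, 3.0])

a_min_b = a - b  # component-wise difference: [-3., -4., 0.]

# Three equivalent ways to get the Euclidean distance of one pair of points.
dist_einsum = numpy.sqrt(numpy.einsum("i,i->", a_min_b, a_min_b))
dist_norm = numpy.linalg.norm(a_min_b)
dist_sqrt_sum = numpy.sqrt(numpy.sum(a_min_b ** 2))

print(dist_einsum, dist_norm, dist_sqrt_sum)  # each is 5.0
```

    All three expressions compute sqrt(9 + 16 + 0); which one is fastest on a given machine is best measured rather than assumed.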

    The variants where you sum up over the second axis, axis=1, are all substantially slower.
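
    If your points currently live in (n, 3) arrays, the layout change described above could look roughly like this. It is a sketch under assumptions: the random data, the size n and the variable names are hypothetical, and numpy.ascontiguousarray is used because a plain .T only returns a strided view rather than a C-contiguous (3, n) array.

```python
import numpy

# Hypothetical input: n points per array, stored row-wise as (n, 3).
rng = numpy.random.default_rng(0)
n = 1000
a = rng.random((n, 3))
b = rng.random((n, 3))

# a.T alone is a view with swapped strides; copy it into a C-contiguous (3, n) array.
a_T = numpy.ascontiguousarray(a.T)
b_T = numpy.ascontiguousarray(b.T)

d_T = a_T - b_T

# The three axis=0 variants mentioned above; each returns n distances.
dist_sqrt_sum = numpy.sqrt(numpy.sum(d_T ** 2, axis=0))
dist_norm = numpy.linalg.norm(d_T, axis=0)
dist_einsum = numpy.sqrt(numpy.einsum("ij,ij->j", d_T, d_T))

# Row-wise reference on the original (n, 3) layout, for comparison.
d = a - b
dist_rows = numpy.sqrt(numpy.einsum("ij,ij->i", d, d))

print(numpy.allclose(dist_einsum, dist_rows))
```

    Note that the copy itself costs time, so this pays off mainly when the data can be kept in (3, n) form from the start or reused for several computations.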


    Code to reproduce the plot:

    import numpy
    import perfplot
    from scipy.spatial import distance
    
    
    def linalg_norm(data):
        a, b = data[0]
        return numpy.linalg.norm(a - b, axis=1)
    
    
    def linalg_norm_T(data):
        a, b = data[1]
        return numpy.linalg.norm(a - b, axis=0)
    
    
    def sqrt_sum(data):
        a, b = data[0]
        return numpy.sqrt(numpy.sum((a - b) ** 2, axis=1))
    
    
    def sqrt_sum_T(data):
        a, b = data[1]
        return numpy.sqrt(numpy.sum((a - b) ** 2, axis=0))
    
    
    def scipy_distance(data):
        a, b = data[0]
        return list(map(distance.euclidean, a, b))
    
    
    def sqrt_einsum(data):
        a, b = data[0]
        a_min_b = a - b
        return numpy.sqrt(numpy.einsum("ij,ij->i", a_min_b, a_min_b))
    
    
    def sqrt_einsum_T(data):
        a, b = data[1]
        a_min_b = a - b
        return numpy.sqrt(numpy.einsum("ij,ij->j", a_min_b, a_min_b))
    
    
    def setup(n):
        a = numpy.random.rand(n, 3)
        b = numpy.random.rand(n, 3)
        out0 = numpy.array([a, b])
        out1 = numpy.array([a.T, b.T])
        return out0, out1
    
    
    perfplot.save(
        "norm.png",
        setup=setup,
        n_range=[2 ** k for k in range(22)],
        kernels=[
            linalg_norm,
            linalg_norm_T,
            scipy_distance,
            sqrt_sum,
            sqrt_sum_T,
            sqrt_einsum,
            sqrt_einsum_T,
        ],
        logx=True,
        logy=True,
        xlabel="len(x), len(y)",
    )
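
    If perfplot is not installed, a rough spot check of a single array size can be done with the standard-library timeit module. This is only a sketch: the size n, the repeat counts and the two kernel names are assumptions for illustration, and it does not reproduce the full plot.

```python
import timeit

import numpy

# Hypothetical problem size; the benchmark above sweeps many sizes instead.
n = 10_000
rng = numpy.random.default_rng(0)
a = rng.random((n, 3))
b = rng.random((n, 3))
a_T = numpy.ascontiguousarray(a.T)  # (3, n), C-contiguous
b_T = numpy.ascontiguousarray(b.T)


def sqrt_einsum_rows():
    # Sum over the second axis of the (n, 3) layout.
    d = a - b
    return numpy.sqrt(numpy.einsum("ij,ij->i", d, d))


def sqrt_einsum_cols():
    # Sum over the first axis of the (3, n) layout.
    d = a_T - b_T
    return numpy.sqrt(numpy.einsum("ij,ij->j", d, d))


timings = {}
for kernel in (sqrt_einsum_rows, sqrt_einsum_cols):
    # Best of 5 repeats, 100 calls each, reported as seconds per call.
    timings[kernel.__name__] = min(timeit.repeat(kernel, number=100, repeat=5)) / 100

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} us per call")
```

    The numbers depend on hardware, NumPy version and array size, so treat a single measurement as a hint only.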
    
