How can the Euclidean distance be calculated with NumPy?

春和景丽 2020-11-22 02:29

I have two points in 3D:

(xa, ya, za)
(xb, yb, zb)

And I want to calculate the distance:

dist = sqrt((xa-xb)^2 + (ya-yb)^2 + (za-zb)^2)


        
22 answers
  • 2020-11-22 02:41
    import numpy as np
    # any two Python lists, each holding the coordinates of one point
    a = [0, 0]
    b = [3, 4]
    

    Either convert the lists to NumPy arrays first:

    print(np.linalg.norm(np.array(a) - np.array(b)))

    or subtract the Python lists directly with np.subtract:

    print(np.linalg.norm(np.subtract(a, b)))

    Both print 5.0 for the points above.

  • 2020-11-22 02:43

    Use numpy.linalg.norm:

    dist = numpy.linalg.norm(a-b)
    

    You can find the theory behind this in Introduction to Data Mining

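    This works because the Euclidean distance is the l2 norm, and with the default ord=None, numpy.linalg.norm computes the 2-norm of a 1-D array. Note that a and b must be NumPy arrays for a-b to work; plain lists and tuples do not support subtraction.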
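
    A minimal sketch of the above, assuming a and b are NumPy arrays. The point values and the names points_a and points_b are made up for illustration. Leaving ord unset and passing ord=2 give the same result for 1-D input, and axis=1 computes one distance per row when several pairs are stacked:

```python
import numpy as np

# Illustrative points, not taken from the question.
a = np.array([1.0, 2.0, 6.0])
b = np.array([-2.0, 3.0, 2.0])

# Default ord (None) on a 1-D array is the 2-norm, i.e. the Euclidean length.
d_default = np.linalg.norm(a - b)
d_explicit = np.linalg.norm(a - b, ord=2)

# The same value written out from the formula in the question.
d_manual = np.sqrt(np.sum((a - b) ** 2))

# Several pairs at once: each row is one point, axis=1 reduces over x, y, z.
points_a = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 6.0]])
points_b = np.array([[3.0, 4.0, 0.0], [-2.0, 3.0, 2.0]])
row_dists = np.linalg.norm(points_a - points_b, axis=1)

print(d_default, d_explicit, d_manual)
print(row_dists)
```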

  • 2020-11-22 02:43

    Starting with Python 3.8, the math module directly provides the dist function, which returns the Euclidean distance between two points (given as tuples or lists of coordinates):

    from math import dist
    
    dist((1, 2, 6), (-2, 3, 2)) # 5.0990195135927845
    

    And if you're working with lists:

    dist([1, 2, 6], [-2, 3, 2]) # 5.0990195135927845
    
  • 2020-11-22 02:44
    import math
    
    dist = math.hypot(math.hypot(xa-xb, ya-yb), za-zb)
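
    On Python 3.8 and later, math.hypot accepts any number of coordinates, so the nesting is no longer needed. A small sketch with made-up values for xa, ya, za and xb, yb, zb (they are not from the question):

```python
import math

# Hypothetical sample coordinates, chosen only for illustration.
xa, ya, za = 1.0, 2.0, 6.0
xb, yb, zb = -2.0, 3.0, 2.0

# Nested two-argument form; also works on older Python versions.
nested = math.hypot(math.hypot(xa - xb, ya - yb), za - zb)

# Single call with three arguments; requires Python 3.8 or later.
direct = math.hypot(xa - xb, ya - yb, za - zb)

print(nested, direct)
```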
    
  • 2020-11-22 02:45

    There's a function for that in SciPy. It's called euclidean and lives in scipy.spatial.distance.

    Example:

    from scipy.spatial import distance
    a = (1, 2, 3)
    b = (4, 5, 6)
    dst = distance.euclidean(a, b)
    
  • 2020-11-22 02:47

    A nice one-liner:

    dist = numpy.linalg.norm(a-b)
    

    However, if speed is a concern, I would recommend experimenting on your machine. I've found that using the math library's sqrt with the ** operator for the square is much faster on my machine than the one-liner NumPy solution.

    I ran my tests using this simple program:

    #!/usr/bin/python
    import math
    import numpy
    from random import uniform
    
    def fastest_calc_dist(p1,p2):
        return math.sqrt((p2[0] - p1[0]) ** 2 +
                         (p2[1] - p1[1]) ** 2 +
                         (p2[2] - p1[2]) ** 2)
    
    def math_calc_dist(p1,p2):
        return math.sqrt(math.pow((p2[0] - p1[0]), 2) +
                         math.pow((p2[1] - p1[1]), 2) +
                         math.pow((p2[2] - p1[2]), 2))
    
    def numpy_calc_dist(p1,p2):
        return numpy.linalg.norm(numpy.array(p1)-numpy.array(p2))
    
    TOTAL_LOCATIONS = 1000
    
    p1 = dict()
    p2 = dict()
    for i in range(0, TOTAL_LOCATIONS):
        p1[i] = (uniform(0,1000),uniform(0,1000),uniform(0,1000))
        p2[i] = (uniform(0,1000),uniform(0,1000),uniform(0,1000))
    
    total_dist = 0
    for i in range(0, TOTAL_LOCATIONS):
        for j in range(0, TOTAL_LOCATIONS):
            dist = fastest_calc_dist(p1[i], p2[j]) #change this line for testing
            total_dist += dist
    
    print(total_dist)
    

    On my machine, math_calc_dist runs much faster than numpy_calc_dist: 1.5 seconds versus 23.5 seconds.

    To get a measurable difference between fastest_calc_dist and math_calc_dist I had to up TOTAL_LOCATIONS to 6000. Then fastest_calc_dist takes ~50 seconds while math_calc_dist takes ~60 seconds.

    You can also experiment with numpy.sqrt and numpy.square, though both were slower than the math alternatives on my machine.

    My tests were run with Python 2.6.6.
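
    The loop above calls NumPy once per pair of points, so it mostly measures per-call overhead. A rough sketch of the vectorized alternative, assuming the points are stored as (N, 3) arrays rather than dicts of tuples. The helper name pairwise_dists, the array names, and the sizes are illustrative and not part of the test program above. Timings vary by machine, so none are claimed here:

```python
import math
import numpy as np

def pairwise_dists(p1, p2):
    """Return an (N, M) array of distances between the rows of p1 and p2."""
    # Broadcasting: (N, 1, 3) - (1, M, 3) -> (N, M, 3) coordinate differences.
    diff = p1[:, np.newaxis, :] - p2[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Random test points in the same 0..1000 range as the program above.
rng = np.random.default_rng(0)
p1 = rng.uniform(0, 1000, size=(200, 3))
p2 = rng.uniform(0, 1000, size=(200, 3))

# One NumPy expression covers all 200 * 200 pairs.
total_vectorized = pairwise_dists(p1, p2).sum()

# The same total with a plain-Python loop and math.sqrt, for comparison.
total_loop = 0.0
for a in p1:
    for b in p2:
        total_loop += math.sqrt((b[0] - a[0]) ** 2 +
                                (b[1] - a[1]) ** 2 +
                                (b[2] - a[2]) ** 2)

print(total_vectorized, total_loop)
```

    Wrapping each variant in timeit.timeit is one way to compare them on your own machine.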
