I have two points in 3D:
(xa, ya, za)
(xb, yb, zb)
And I want to calculate the distance:
dist = sqrt((xa-xb)^2 + (ya-yb)^2 + (za-zb)^2)
import numpy as np
# any two Python lists as the two points
a = [0, 0]
b = [3, 4]
First method: convert the lists to NumPy arrays, then subtract:
print(np.linalg.norm(np.array(a) - np.array(b)))  # 5.0
Second method: work directly on the Python lists:
print(np.linalg.norm(np.subtract(a, b)))  # 5.0
Use numpy.linalg.norm:
dist = numpy.linalg.norm(a-b)  # a and b must be NumPy arrays
You can find the theory behind this in Introduction to Data Mining.
This works because the Euclidean distance is the l2 norm, and numpy.linalg.norm computes the l2 norm of a vector by default (ord=None).
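As a quick check of the point about ord, here is a minimal sketch comparing the default call, an explicit ord=2, and the formula written out by hand. The sample points are made up for illustration.

```python
import numpy as np

# Sample points, chosen for illustration only
a = np.array([1.0, 2.0, 6.0])
b = np.array([-2.0, 3.0, 2.0])

default_norm = np.linalg.norm(a - b)          # default ord: the 2-norm for a 1-D input
explicit_norm = np.linalg.norm(a - b, ord=2)  # the same norm, spelled out
by_hand = np.sqrt(np.sum((a - b) ** 2))       # sqrt of the sum of squared differences

print(default_norm, explicit_norm, by_hand)
```

All three values should agree up to floating-point rounding.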
Starting with Python 3.8, the math module directly provides the dist function, which returns the Euclidean distance between two points (given as tuples or lists of coordinates):
from math import dist
dist((1, 2, 6), (-2, 3, 2)) # 5.0990195135927845
And if you're working with lists:
dist([1, 2, 6], [-2, 3, 2]) # 5.0990195135927845
import math
dist = math.hypot(math.hypot(xa-xb, ya-yb), za-zb)
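On Python 3.8 or later, math.hypot accepts any number of coordinates, so the nesting should not be needed there. A minimal sketch, with made-up sample coordinates:

```python
import math

# Sample coordinates, for illustration only
xa, ya, za = 1.0, 2.0, 6.0
xb, yb, zb = -2.0, 3.0, 2.0

# Nested two-argument calls: works on older Python versions as well
nested = math.hypot(math.hypot(xa - xb, ya - yb), za - zb)

# Single call with three arguments: requires Python 3.8+
flat = math.hypot(xa - xb, ya - yb, za - zb)

print(nested, flat)
```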
There's a function for that in SciPy. It's called euclidean.
Example:
from scipy.spatial import distance
a = (1, 2, 3)
b = (4, 5, 6)
dst = distance.euclidean(a, b)
A nice one-liner:
dist = numpy.linalg.norm(a-b)
However, if speed is a concern I would recommend experimenting on your machine. I've found that using the math library's sqrt with the ** operator for the square is much faster on my machine than the one-liner NumPy solution.
I ran my tests using this simple program:
#!/usr/bin/python
import math
import numpy
from random import uniform

def fastest_calc_dist(p1, p2):
    # math.sqrt with the ** operator
    return math.sqrt((p2[0] - p1[0]) ** 2 +
                     (p2[1] - p1[1]) ** 2 +
                     (p2[2] - p1[2]) ** 2)

def math_calc_dist(p1, p2):
    # math.sqrt with math.pow
    return math.sqrt(math.pow((p2[0] - p1[0]), 2) +
                     math.pow((p2[1] - p1[1]), 2) +
                     math.pow((p2[2] - p1[2]), 2))

def numpy_calc_dist(p1, p2):
    # converts both tuples to arrays on every call
    return numpy.linalg.norm(numpy.array(p1) - numpy.array(p2))

TOTAL_LOCATIONS = 1000

p1 = dict()
p2 = dict()
for i in range(0, TOTAL_LOCATIONS):
    p1[i] = (uniform(0, 1000), uniform(0, 1000), uniform(0, 1000))
    p2[i] = (uniform(0, 1000), uniform(0, 1000), uniform(0, 1000))

total_dist = 0
for i in range(0, TOTAL_LOCATIONS):
    for j in range(0, TOTAL_LOCATIONS):
        dist = fastest_calc_dist(p1[i], p2[j])  # change this line for testing
        total_dist += dist

print(total_dist)
On my machine, math_calc_dist runs much faster than numpy_calc_dist: 1.5 seconds versus 23.5 seconds.
To get a measurable difference between fastest_calc_dist and math_calc_dist I had to up TOTAL_LOCATIONS to 6000. Then fastest_calc_dist takes ~50 seconds while math_calc_dist takes ~60 seconds.
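For repeatable timings on your own machine, the standard timeit module may be more convenient than timing a whole script. This is only a sketch: the sample points, the function names sqrt_dist and numpy_dist, and the REPEATS count are chosen for illustration, and the absolute numbers will vary with hardware and library versions.

```python
import math
import timeit

import numpy

# Hypothetical sample points, for illustration only
p1 = (1.0, 2.0, 6.0)
p2 = (-2.0, 3.0, 2.0)

def sqrt_dist(p, q):
    # math.sqrt with the ** operator
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2 + (q[2] - p[2]) ** 2)

def numpy_dist(p, q):
    # converts both tuples to arrays on every call
    return numpy.linalg.norm(numpy.array(p) - numpy.array(q))

REPEATS = 20000  # arbitrary; raise it for steadier numbers

timings = {}
for func in (sqrt_dist, numpy_dist):
    # total time in seconds for REPEATS calls of func
    timings[func.__name__] = timeit.timeit(lambda: func(p1, p2), number=REPEATS)

for name, seconds in timings.items():
    print(name, round(seconds, 4), "s")
```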
You can also experiment with numpy.sqrt and numpy.square, though both were slower than the math alternatives on my machine.
My tests were run with Python 2.6.6.
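Part of the NumPy cost in a per-pair test like this is likely the overhead of building two tiny arrays on every call. When all points are available up front, one option (not what the test above measures) is to compute every pairwise distance in a single broadcasted expression. A sketch, assuming the points fit in memory as (N, 3) arrays; the array sizes and the seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
p1 = rng.uniform(0, 1000, size=(200, 3))
p2 = rng.uniform(0, 1000, size=(200, 3))

# diff[i, j] is p1[i] - p2[j]; the result has shape (200, 200, 3)
diff = p1[:, None, :] - p2[None, :, :]

# Taking the norm over the last axis gives one distance per (i, j) pair
pairwise = np.linalg.norm(diff, axis=-1)

total_dist = pairwise.sum()
print(total_dist)
```

Whether this beats the plain math version depends on the number of points and on memory, so it is worth timing on your own data.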