I am new to Python and I need to implement a clustering algorithm. For that, I will need to calculate distances between the given input data.
Consider the following input
You can use the e_dist function from this thread and obtain the same results.
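The linked definition is not reproduced here. As a rough stand-in, a pairwise Euclidean distance function can be sketched with NumPy broadcasting. The name pairwise_euclidean is hypothetical and this is not the e_dist from the thread; it only assumes both inputs are 2-D arrays with the same number of columns, in which case it should give the same distances.

```python
import numpy as np

def pairwise_euclidean(a, b):
    """Return the (len(a), len(b)) matrix of Euclidean distances.

    Hypothetical stand-in for the linked e_dist; assumes a and b are
    2-D arrays with the same number of columns.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Broadcast to shape (len(a), len(b), n_features):
    # every row of a minus every row of b.
    diff = a[:, np.newaxis, :] - b[np.newaxis, :, :]
    # Sum the squared differences over the feature axis, then take the root.
    return np.sqrt(np.einsum('ijk,ijk->ij', diff, diff))

a = np.array([[1, 2, 8], [7, 4, 2], [9, 1, 7], [0, 1, 5], [6, 4, 3]])
dm = pairwise_euclidean(a, a)
print(np.round(dm, 2))
```

The intermediate diff array has len(a) * len(b) * n_features elements, so for large inputs this approach is limited by memory.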
Addendum
Timing: on my memory-starved laptop, I can only run the comparison on a smaller sample than @Psidom's, using his norm_app function.
a = np.random.randint(0,9,(5000,3))
%timeit norm_app(a)
1.91 s ± 13.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit e_dist(a, a)
631 ms ± 3.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a
array([[1, 2, 8],
       [7, 4, 2],
       [9, 1, 7],
       [0, 1, 5],
       [6, 4, 3]])
dm = e_dist(a, a) # get the def from the link
dm
Out[7]:
array([[ 0.  ,  8.72,  8.12,  3.32,  7.35],
       [ 8.72,  0.  ,  6.16,  8.19,  1.41],
       [ 8.12,  6.16,  0.  ,  9.22,  5.83],
       [ 3.32,  8.19,  9.22,  0.  ,  7.  ],
       [ 7.35,  1.41,  5.83,  7.  ,  0.  ]])
idx = np.argsort(dm)
closest = a[idx]
closest
Out[10]:
array([[[1, 2, 8],
        [0, 1, 5],
        [6, 4, 3],
        [9, 1, 7],
        [7, 4, 2]],

       [[7, 4, 2],
        [6, 4, 3],
        [9, 1, 7],
        [0, 1, 5],
        [1, 2, 8]],

       [[9, 1, 7],
        [6, 4, 3],
        [7, 4, 2],
        [1, 2, 8],
        [0, 1, 5]],

       [[0, 1, 5],
        [1, 2, 8],
        [6, 4, 3],
        [7, 4, 2],
        [9, 1, 7]],

       [[6, 4, 3],
        [7, 4, 2],
        [9, 1, 7],
        [0, 1, 5],
        [1, 2, 8]]])
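Each block of closest starts with the point itself, because its distance to itself is 0. If only the single nearest other point is wanted, one option is to take the second column of the argsort result. This is a minimal sketch: the distance matrix is recomputed with plain NumPy rather than e_dist so the snippet runs on its own, and nearest_idx and nearest_dist are illustrative names. It assumes there are no duplicate points; with duplicates, column 0 is not guaranteed to be the point itself.

```python
import numpy as np

a = np.array([[1, 2, 8], [7, 4, 2], [9, 1, 7], [0, 1, 5], [6, 4, 3]])

# Pairwise Euclidean distances via broadcasting (stand-in for e_dist(a, a)).
dm = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)

# Sort each row by distance; column 0 is the point itself (distance 0).
idx = np.argsort(dm, axis=1)
nearest_idx = idx[:, 1]                             # nearest other point
nearest_dist = dm[np.arange(len(a)), nearest_idx]   # distance to that point

print(nearest_idx)   # [3 4 4 0 1]
print(np.round(nearest_dist, 2))
```

For example, the nearest neighbour of a[1] = [7, 4, 2] is a[4] = [6, 4, 3], which matches the second entry in the corresponding block of closest above.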