I have a matrix a and I want to calculate the distance from one point to all other points. So really the outcome matrix should have a zero (at
This is what I was looking for, but thanks for all the suggestions.
A = rand(5, 5);
select_cell = [3 3];
distance = zeros(size(A, 1), size(A, 2));
for i = 1:size(A, 1)
    for j = 1:size(A, 2)
        distance(i, j) = sqrt((i - select_cell(1))^2 + (j - select_cell(2))^2);
    end
end
disp(distance)
Also you can improve it by using vectorisation:
distances = sqrt((x-xCenter).^2 + (y-yCenter).^2); % x, y: grids of column and row indices
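The one-liner above leaves x, y, xCenter and yCenter undefined. A minimal sketch of how they could be built, assuming the grids come from meshgrid and the centre is the select_cell used in the loop version (these definitions are my assumption, not part of the original answer):

```matlab
A = rand(5, 5);
select_cell = [3 3];                 % [row, column] of the reference cell

% meshgrid returns the column-index grid first (x), then the row-index grid (y)
[x, y] = meshgrid(1:size(A, 2), 1:size(A, 1));
xCenter = select_cell(2);            % column of the reference cell
yCenter = select_cell(1);            % row of the reference cell

% Same result as the nested loop, computed for all cells at once
distances = sqrt((x - xCenter).^2 + (y - yCenter).^2);
disp(distances)
```

The result has the same size as A, with a zero at the selected cell.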
Your a matrix is a 1D vector and is incompatible with the nested loop, which computes the distance in 2D space from each point to every other point. So the following answer applies to the problem of finding all pairwise distances in an N-by-D matrix, as your loop does for the case of D=2.
I think you are looking for pdist with the 'euclidean' distance option.
a = randn(10, 2); %// 2D, 10 samples
D = pdist(a,'euclidean'); %// euclidean distance
Follow that by squareform to get the square matrix with zero on the diagonal as you want it:
distances = squareform(D);
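To make the two steps concrete, here is a small sketch of what pdist and squareform return and how to read off the distances from a single point. It assumes the Statistics Toolbox is available; the index k is a hypothetical choice for illustration.

```matlab
a = randn(10, 2);                 % N = 10 samples in 2-D
D = pdist(a, 'euclidean');        % row vector with N*(N-1)/2 = 45 pairwise distances
distances = squareform(D);        % 10-by-10, symmetric, zeros on the diagonal

k = 3;                            % hypothetical index of the point of interest
d_k = distances(k, :);            % distances from sample k to every sample
```

Here d_k(k) is zero, which is the zero entry the question asks for.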
If you don't have pdist, which is in the Statistics Toolbox, you can do this easily with bsxfun:
da = bsxfun(@minus,a,permute(a,[3 2 1]));
distances = squeeze(sqrt(sum(da.^2,2)));
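As a sketch of what the bsxfun call is doing: on R2016b and newer, implicit expansion lets you write the same subtraction with a plain minus. The single-point variant at the end is my own addition (with a hypothetical index k), for the case where only one row of the distance matrix is needed.

```matlab
a = randn(10, 2);                            % N-by-D data, N = 10, D = 2

% a is N-by-D-by-1 and permute(a, [3 2 1]) is 1-by-D-by-N, so the
% subtraction expands to N-by-D-by-N: every row minus every other row.
da = a - permute(a, [3 2 1]);                % needs R2016b or newer
distances = squeeze(sqrt(sum(da.^2, 2)));    % N-by-N distance matrix

% Distances from a single sample to all samples, without the N-by-N matrix
k = 4;                                       % hypothetical sample index
d_k = sqrt(sum(bsxfun(@minus, a, a(k, :)).^2, 2));   % N-by-1
```

The single-point form only allocates an N-by-D intermediate, so it is much lighter on memory than the full pairwise computation.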
You can also use an alternate form of Euclidean (2-norm) distance,
||A-B|| = sqrt ( ||A||^2 + ||B||^2 - 2*A.B )
Writing this in MATLAB for two data arrays u and v of size NxD,
dot(u-v,u-v,2) == dot(u,u,2) + dot(v,v,2) - 2*dot(u,v,2) % useful identity
%// there are actually small differences from floating point precision, but...
abs(dot(u-v,u-v,2) - (dot(u,u,2) + dot(v,v,2) - 2*dot(u,v,2))) < 1e-15
With the reformulated equation, the solution becomes:
aa = a*a';
a2 = sum(a.*a,2); % diag(aa)
a2 = bsxfun(@plus,a2,a2');
distances = sqrt(a2 - 2*aa);
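One caveat with the reformulated equation: because of the floating-point differences noted earlier, a2 - 2*aa can come out very slightly negative where the true distance is zero, and sqrt then returns complex values. A hedged sketch of one way to guard against that (the clamping and the diagonal reset are my additions, not part of the original answer):

```matlab
a = randn(10, 2);                        % N-by-D data
aa = a * a';                             % Gram matrix of dot products
a2 = sum(a .* a, 2);                     % squared norms, N-by-1
sq = bsxfun(@plus, a2, a2') - 2 * aa;    % squared distances, N-by-N

sq = max(sq, 0);                         % clamp tiny negative rounding errors
sq(1:size(sq, 1) + 1:end) = 0;           % force an exactly zero diagonal
distances = sqrt(sq);                    % real-valued N-by-N distances
```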
You might use this method if Option 2 (the bsxfun approach above, which builds an N-by-D-by-N intermediate array) eats up too much memory.
For a random data matrix of size 1e3-by-3 (N-by-D), here are timings for 100 runs (Core 2 Quad, 4GB DDR2, R2013a).
Option 1 (pdist): 1.561150 sec (0.560947 sec in pdist)
Option 2 (bsxfun): 2.695059 sec
Option 3 (bsxfun alt): 1.334880 sec
Findings: (i) If you do the computations with bsxfun, use the alternate formula. (ii) The pdist+squareform option has comparable performance. (iii) The reason why squareform takes twice as much time as pdist is probably because pdist only computes the triangular matrix, since the distance matrix is symmetric. If you can do without the square matrix, then you can avoid squareform and do your computations in about 40% of the time required to do it manually with bsxfun (0.5609/1.3348).
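If you want to repeat the comparison on your own machine, a rough timing harness could look like the sketch below. It uses tic/toc and covers only the two bsxfun variants so that it runs without the Statistics Toolbox; the variable names are hypothetical, and this is not the script that produced the numbers above.

```matlab
N = 1e3; D = 3; nRuns = 100;        % sizes taken from the text above
a = randn(N, D);

tic
for r = 1:nRuns
    da = bsxfun(@minus, a, permute(a, [3 2 1]));
    d2 = squeeze(sqrt(sum(da.^2, 2)));          % direct bsxfun method
end
tBsxfun = toc;

tic
for r = 1:nRuns
    aa = a * a';
    a2 = sum(a .* a, 2);
    d3 = sqrt(bsxfun(@plus, a2, a2') - 2 * aa); % alternate formula
end
tAlt = toc;

fprintf('bsxfun: %f sec, bsxfun alt: %f sec\n', tBsxfun, tAlt);
```

Absolute times will differ with hardware and MATLAB release, so only the ratio between the two is worth comparing.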
IMPORTANT: data_matrix is D-by-N, where D is the number of dimensions and N is the number of data points!
final_dist_pairs=data_matrix'*data_matrix;
norms = diag(final_dist_pairs);
final_dist_pairs = bsxfun(@plus, norms, norms') - 2 * final_dist_pairs; % squared distances; apply sqrt for Euclidean distances
Hope it helps!
% Another important thing: avoid the pdist function of MATLAB. It evaluates the pairs sequentially, something like nested for loops, and takes a lot of time, maybe O(N^2).