MATLAB: bsxfun unclear. Want to accelerate minimum distance between segments

问题

Using MATLAB, Imagine a Nx6 array of numbers which represent N segments with 3+3=6 initial and end point coordinates.

Assume I have a function Calc_Dist( Segment_1, Segment_2 ) that takes as input two 1x6 arrays, and that after some operations returns a scalar, namely the minimal euclidean distance between these two segments.

I want to calculate the pairwise minimal distance between all N segments of my list, but would like to avoid a double loop to do so.

I cannot wrap my head around the documentation of the bsxfun function of MATLAB, so I cannot make this work. For the sake of a minimal example (the distance calculation is obviously not correct):

function scalar = calc_dist( segment_1, segment_2 )
         scalar = sum( segment_1 + segment_2 )
end

and the main

Segments = rand( 1500, 6 )
Pairwise_Distance_Matrix = bsxfun( @calc_dist, segments, segments' )

Is there any way to do this, or am I forced to use double loops ?

Thank you for any suggestion

回答1:

I think you need pdist rather than bsxfun. pdist can be used in two different ways, the second of which is applicable to your problem:

With built-in distance functions, supplied as strings, such as 'euclidean', 'hamming' etc.
With a custom distance function, a handle to which you supply.

In the second case, the distance function

must be of the form
     function D2 = distfun(XI, XJ),
taking as arguments a 1-by-N vector XI containing a single row of X, an M2-by-N matrix XJ containing multiple rows of X, and returning an M2-by-1 vector of distances D2, whose Jth element is the distance between the observations XI and XJ(J,:).

Although the documentation doesn't tell, it's very likely that the second way is not as efficient as the first (a double loop might even be faster, who knows), but you can use it. You would need to define your function so that it fulfills the stated condition. With your example function it's easy: for this part you'd use bsxfun:

function scalar = calc_dist( segment_1, segment_2 )
    scalar = sum(bsxfun(@plus, segment_1, segment_2), 2);
end

Note also that

pdist works with rows (not columns), which is what you need.
pdist reduces operations by exploiting the properties that any distance function must have. Namely, the distance of an element to itself is known to be zero; and the distance for each pair can be computed just once thanks to symmetry. If you want to arrange the output in the form of a matrix, use squareform.

So, after your actual distance function has been modified appropriately (which may be the hard part), use:

distances = squareform(pdist(segments, @calc_dist));

For example:

N = 4;
segments = rand(N,6);
distances = squareform(pdist(segments, @calc_dist));

produces

distances =
         0    6.1492    7.0886    5.5016
    6.1492         0    6.8559    5.2688
    7.0886    6.8559         0    6.2082
    5.5016    5.2688    6.2082         0

回答2:

Unfortunately I don't see any "smarter" (i.e. read faster) solution than the double loop. For speed consideration I'd organize the points as a 6×N array, not the other way, because column access is way faster than row access in MATLAB.

So:

N = 150000;
Segments = rand(6, N);
Pairwise_Distance_Matrix = Inf(N, N);
for i = 1:(N-1)
        for j = (i+1):N
                Pairwise_Distance_Matrix(i,j) = calc_dist(Segments(:,i), Segments(:,j));
        end;
end;

Minimum_Pairwise_Distance = min(min(Pairwise_Distance_Matrix));

Contrary to common wisdom, explicit loops are faster now in MATLAB compared to the likes of arrayfun, cellfun or structfun; bsxfun beats everything else in terms of speed, but it doesn't apply to your case.

来源：https://stackoverflow.com/questions/26218334/matlab-bsxfun-unclear-want-to-accelerate-minimum-distance-between-segments

标签

matlab

bsxfun