When is vectorization a better or worse solution than a loop? [duplicate]

In MATLAB, I am trying to vectorise my code to reduce the simulation time. However, the result was that the overall efficiency got worse.

To understand the phenomenon, I created 3 distinct functions that do the same thing with different approaches:

The main file :

clc,
clear,

n = 10000;
Value = cumsum(ones(1,n));

NbLoop = 10000;

time01 = zeros(1,NbLoop);
time02 = zeros(1,NbLoop);
time03 = zeros(1,NbLoop);

for test = 1 : NbLoop

    tic
    vector1 =  function01(n,Value);
    time01(test) = toc ;

    tic
    vector2 =  function02(n,Value);
    time02(test) = toc ;

    tic
    vector3 =  function03(n,Value);
    time03(test) = toc ; 

end

figure(1)
hold on

plot( time01, 'b')
plot( time02, 'g')
plot( time03, 'r')

The function 01:

function vector =  function01(n,Value)

vector = zeros( 2*n,1);
for k = 1:n
    vector(2*k -1) =  Value(k);
    vector(2*k) =  Value(k);
end

end

The function 02:

function vector =  function02(n,Value)

vector = zeros( 2*n,1);
vector(1:2:2*n) = Value; 
vector(2:2:2*n) = Value; 

end

The function 03:

function vector =  function03(n,Value)

MatrixTmp = transpose([Value(:), Value(:)]);
vector = MatrixTmp (:);

end

The blue plot corresponds to the for loop, the green plot to function02, and the red plot to function03.

[Figure: timing plot for n = 100]

[Figure: timing plot for n = 10000]

When I run the code with n = 100, the most efficient solution is the first function, with the for loop. When n = 10000, the first function becomes the least efficient.

Do you have a way to know how and when to properly replace a for-loop by a vectorised counterpart?
What is the impact of index searching on arrays of very large dimensions?
Does MATLAB handle an array with 3 or more dimensions differently from an array with 1 or 2 dimensions?
Is there a clever way to replace a while loop that uses the result of one iteration in the next iteration?

Using MATLAB Online I see something different:

Time (s)     n = 10000   n = 100
function01   5.6248e-05  2.2246e-06
function02   1.7748e-05  1.9491e-06
function03   2.7748e-05  1.2278e-06
function04   1.1056e-05  7.3390e-07  (my version, see below)

Thus, the loop version is the slowest in both cases. Of your two vectorized versions, Method #2 is faster for the larger array and Method #3 is faster for the smaller one.

The reason is that Method #3 makes 2 copies of the data (the transpose of a matrix incurs a copy), and that is bad if there's a lot of data. Method #2 uses indexing, which is expensive, but not as expensive as copying lots of data twice.

I would suggest this function instead (Method #4), which transposes only vectors (which is essentially free). It is a simple modification of your Method #3:

function vector = function04(n,Value)
vector = [Value(:).'; Value(:).'];
vector = vector(:);
end
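To make the interleaving in Method #4 concrete, here is a minimal sketch on a tiny input. The three-element Value is an illustrative assumption, not data from the question, and the comparison with the built-in repelem (available since R2015a) is only a correctness check, not a timing claim.

```matlab
% Tiny illustrative input (an assumption, not the question's data).
Value = [10 20 30];

% Stack the row vector on top of itself: a 2-by-3 matrix in which
% every column holds the same value twice.
M = [Value(:).'; Value(:).'];

% MATLAB stores arrays column by column, so reading M out with (:)
% walks down each column in turn and interleaves the duplicates.
vector = M(:);          % [10; 10; 20; 20; 30; 30]

% repelem repeats each element in place and should give the same result.
alt = repelem(Value(:), 2);
assert(isequal(vector, alt))
```

Because only vectors are transposed here, the one real copy is the concatenation itself.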

Do you have a way to know how and when to properly replace a for-loop by a vectorised counterpart?

In general, vectorized code is faster as long as it does not create large intermediate matrices. For small data you can vectorize more aggressively; for large data, loops are sometimes more efficient because of the reduced memory pressure. It depends on what the vectorized form requires.
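As a sketch of that trade-off, consider a hypothetical task (not from the question): for each element of x, find the smallest absolute difference to any element of y. The fully vectorized form builds a numel(x)-by-numel(y) intermediate, while the loop only ever holds one row of it. The sizes below are arbitrary assumptions, and which version is faster on a given machine would have to be measured with timeit.

```matlab
% Hypothetical inputs; the sizes are arbitrary assumptions.
x = (1:200).' / 200;
y = ((1:300).' / 300) .^ 2;

% Fully vectorized: implicit expansion (R2016b or later) creates a
% 200-by-300 intermediate matrix before the reduction.
D = abs(x - y.');
dVec = min(D, [], 2);

% Loop: each iteration needs only a 300-element temporary, so the
% peak memory use stays small however many rows there are.
dLoop = zeros(numel(x), 1);
for i = 1:numel(x)
    dLoop(i) = min(abs(x(i) - y));
end

% Both approaches compute the same values.
assert(isequal(dVec, dLoop))
```

With 200 and 300 elements the intermediate is tiny; with hundreds of thousands of elements on each side it would not fit in memory, and the loop (or a blockwise hybrid) becomes the only practical option.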

What is the impact of index searching on arrays of very large dimensions?

This refers to operations such as d = data(data==0). Much like everything else, this is efficient for small data and less so for large data, because data==0 is an intermediate array of the same size as data.
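A small sketch of where that intermediate array appears, using made-up data. The variable names mask, nZeros and idx are illustrative assumptions.

```matlab
% Made-up data for illustration.
data = [3 0 7 0 0 5];

% The comparison alone allocates a logical array with one entry per
% element of data; this is the intermediate array mentioned above.
mask = (data == 0);

% Indexing with the mask extracts the matching elements.
d = data(mask);            % [0 0 0]

% If only the count or the positions are needed, they can be taken
% from the mask directly instead of extracting values first.
nZeros = nnz(mask);        % 3
idx = find(mask);          % [2 4 5]

assert(isequal(size(mask), size(data)))
```

A logical element takes 1 byte where a double takes 8, so the mask is smaller than data itself, but it still grows linearly with the array.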

Does MATLAB handle an array with 3 or more dimensions differently from an array with 1 or 2 dimensions?

No, not in general. Functions such as sum are implemented in a dimensionality-independent way [citation needed].
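From the caller's side, at least, the same sum call works along any dimension of an N-D array. The sketch below only checks results on a small made-up array; it says nothing about how sum is implemented internally.

```matlab
% A small 2-by-3-by-4 array with made-up contents.
A = reshape(1:24, 2, 3, 4);

% The same function and syntax reduce along any dimension.
s1 = sum(A, 1);            % size 1-by-3-by-4
s3 = sum(A, 3);            % size 2-by-3

% Summing along the third dimension matches an explicit loop that
% accumulates the 2-D pages one by one.
acc = zeros(2, 3);
for p = 1:size(A, 3)
    acc = acc + A(:, :, p);
end
assert(isequal(s3, acc))
```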

Is there a clever way to replace a while loop that uses the result of one iteration in the next iteration?

It depends very much on what the operations are. Functions such as cumsum can often be used to vectorize this type of code, but not always.
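A minimal sketch of both situations, with made-up inputs. The first loop carries a running total and maps directly onto cumsum. The second is a nonlinear recurrence (the logistic map, chosen here purely as an illustration) for which I am not aware of a direct built-in equivalent, so the loop stays.

```matlab
% Case 1: a running total. Each iteration depends on the previous
% one, but only through addition, so cumsum does the same job.
a = [2 5 1 4];
runLoop = zeros(size(a));
total = 0;
for k = 1:numel(a)
    total = total + a(k);
    runLoop(k) = total;
end
runVec = cumsum(a);        % [2 7 8 12]
assert(isequal(runLoop, runVec))

% Case 2: a nonlinear recurrence. Each value feeds into the next
% through a product, which a cumulative sum or product cannot express.
x = zeros(1, 5);
x(1) = 0.5;
for k = 2:numel(x)
    x(k) = 3.2 * x(k-1) * (1 - x(k-1));
end
```

Besides cumsum, functions such as cumprod, cummax and filter cover other common carried-state patterns; whether one applies depends on the form of the update.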

This is my timing code; I hope it shows how to properly use timeit:

%n = 10000;
n = 100;
Value = cumsum(ones(1,n));

vector1 = function01(n,Value);
vector2 = function02(n,Value);
vector3 = function03(n,Value);
vector4 = function04(n,Value);
assert(isequal(vector1,vector2))
assert(isequal(vector1,vector3))
assert(isequal(vector1,vector4))

timeit(@()function01(n,Value))
timeit(@()function02(n,Value))
timeit(@()function03(n,Value))
timeit(@()function04(n,Value))

function vector = function01(n,Value)
vector = zeros(2*n,1);
for k = 1:n
    vector(2*k-1) = Value(k);
    vector(2*k) = Value(k);
end
end

function vector = function02(n,Value)
vector = zeros(2*n,1);
vector(1:2:2*n) = Value; 
vector(2:2:2*n) = Value; 
end

function vector = function03(n,Value)
MatrixTmp = transpose([Value(:), Value(:)]);
vector = MatrixTmp(:);
end

function vector = function04(n,Value)
vector = [Value(:).'; Value(:).'];
vector = vector(:);
end

Source: https://stackoverflow.com/questions/57729888/when-does-vectorization-is-a-better-or-worse-solution-than-a-loop

Tags: matlab, performance, time, vectorization, coding-efficiency