I want to parallelize block2 for each block1 and parallelize the outer loop too.
previous code:
for i=rangei
parfor cannot be nested. In nested parfor statements, only the outermost call to parfor is parallelized, which means that the inner call to parfor only adds unnecessary overhead.
To get high efficiency with parfor, the number of iterations should be much higher than the number of workers (or an exact multiple of it if each iteration takes the same time), and a single iteration should take more than just a few milliseconds so that the parallelization overhead does not dominate.
parfor i=rangei
    for j=rangej
        dependent on
    end
end
may actually fit that description, depending on the size of rangei. Alternatively, you may want to try unrolling the nested loop into a single loop that iterates over linear indices.
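A minimal sketch of the unrolling idea, assuming every (i, j) pair can be computed independently of the others. The values of rangei and rangej and the loop body (10 * i + j) are hypothetical placeholders, not taken from the question. The single parfor runs over ni * nj linear indices, and ind2sub recovers the two subscripts from each index.

```matlab
% Hypothetical ranges, for illustration only.
rangei = 1:4;
rangej = 1:3;
ni = numel(rangei);
nj = numel(rangej);

% One output slot per (i, j) pair; indexed by the loop variable so that
% parfor can treat it as a sliced variable.
result = zeros(1, ni * nj);

parfor k = 1:(ni * nj)
    % Recover the two subscripts from the linear index k (column-major).
    [a, b] = ind2sub([ni, nj], k);
    i = rangei(a);
    j = rangej(b);
    result(k) = 10 * i + j;   % placeholder for the real loop body
end

% result(a, b) now corresponds to rangei(a) and rangej(b).
result = reshape(result, ni, nj);
```

This gives parfor ni * nj iterations to distribute instead of ni, which may help when rangei is small compared with the number of workers. If part of the work depends only on i, it would be recomputed for every j in this form, so it may be worth precomputing it before the loop.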