Question
I tried to measure the performance difference between different matrix-vector multiplication schemes in Fortran. I wrote the following code: http://pastebin.com/dmKXdnX6
The 'optimized version' is meant to respect the memory layout of the matrix by swapping the loops that access the matrix elements. The provided code should compile with gfortran, and it runs with the following rather surprising results:
Vectors match! Calculations are OK.
Optimized time: 0.34133333333333332
Naive time: 1.4133333333333331E-002
Ratio (t_optimized/t_naive): 24.150943396226417
I've probably made an embarrassing mistake, but I'm unable to spot it. I hope someone else can help me.
I know that Fortran provides optimized versions, but I'm measuring this just out of curiosity.
Thanks in advance.
Answer 1:
Well, it's a simple matter of parentheses:
t_optimized = t2-t1/iterations
is most certainly wrong: division binds tighter than subtraction, so only t1 gets divided by iterations. You probably mean
t_optimized = (t2-t1)/iterations
With that I get a speed-up of ~2.
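To make the precedence issue concrete, here is a minimal sketch. The timestamps and the iteration count are made-up values chosen only for illustration; they are not taken from the original code.

```fortran
program timing_precedence
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: iterations = 100   ! hypothetical iteration count
  real(dp) :: t1, t2, wrong, right

  t1 = 10.0_dp   ! hypothetical start timestamp in seconds
  t2 = 12.0_dp   ! hypothetical end timestamp in seconds

  ! Parsed as t2 - (t1/iterations): roughly 11.9, not an average at all
  wrong = t2 - t1/iterations
  ! Elapsed time divided by the number of iterations: roughly 0.02
  right = (t2 - t1)/iterations

  print *, 'without parentheses:', wrong
  print *, 'with parentheses:   ', right

  ! Sanity check on the two expressions
  if (abs(wrong - 11.9_dp) > 1.0e-12_dp) error stop 'unexpected value for wrong'
  if (abs(right - 0.02_dp) > 1.0e-12_dp) error stop 'unexpected value for right'
end program timing_precedence
```

Without the parentheses the "time" is dominated by the absolute value of t2, which is why the reported ratio was so far off.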
A couple of other things I needed to correct or adjust:
- The first loop is wrong: you are trying to set elements outside their bounds. It should read:
A(j,i) = (-1.0)**(i-j)
- Modern compilers are quite intelligent. They can notice that the input of your function call does not change within the loop body, and may then optimize away the whole loop! To prevent that, I made each iteration depend on the previous result by inserting the line x(1:n) = y1:
do i = 1,iterations
call optimized(A, m, n, x, y1)
x(1:n) = y1
end do
(and the same for y2). Don't forget to re-initialize x at the beginning of each benchmark.
- Don't use ; that much. It is not required unless you want to put multiple statements on one line.
- Don't use tabs in Fortran. Some compilers don't like them; use spaces instead.
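Putting the corrections above together, a benchmark could look roughly like the sketch below. The pastebin code is not reproduced here, so the bodies of optimized and naive, the helper init_x, the problem size, the iteration count, and the 1/n scaling of A (added only to keep the fed-back vector bounded) are all assumptions. Only the call signature, the A(j,i) indexing, the x(1:n) = y1 feedback, and the parenthesized timing come from the answer.

```fortran
program matvec_benchmark
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 500, iterations = 200   ! hypothetical sizes
  real(dp), allocatable :: A(:,:), x(:), y1(:), y2(:)
  real(dp) :: t1, t2, t_optimized, t_naive
  integer :: i, j

  allocate(A(n,n), x(n), y1(n), y2(n))

  ! First index varies fastest, matching Fortran's column-major layout.
  ! The division by n is an assumption that keeps repeated products bounded.
  do i = 1, n
    do j = 1, n
      A(j,i) = (-1.0_dp)**(i-j) / n
    end do
  end do

  call init_x(x)                      ! re-initialize x before each benchmark
  call cpu_time(t1)
  do i = 1, iterations
    call optimized(A, n, n, x, y1)
    x(1:n) = y1                       ! feed the result back so the loop cannot be elided
  end do
  call cpu_time(t2)
  t_optimized = (t2 - t1)/iterations  ! note the parentheses

  call init_x(x)
  call cpu_time(t1)
  do i = 1, iterations
    call naive(A, n, n, x, y2)
    x(1:n) = y2
  end do
  call cpu_time(t2)
  t_naive = (t2 - t1)/iterations

  if (maxval(abs(y1 - y2)) > 1.0e-12_dp) error stop 'results differ'
  print *, 'Vectors match! Calculations are OK.'
  print *, 'Optimized time:', t_optimized
  print *, 'Naive time:    ', t_naive

contains

  ! Hypothetical helper: fills v with k/size(v)
  subroutine init_x(v)
    real(dp), intent(out) :: v(:)
    integer :: k
    do k = 1, size(v)
      v(k) = real(k, dp) / size(v)
    end do
  end subroutine init_x

  ! Inner loop runs down a column of A (contiguous in memory)
  subroutine optimized(A, m, n, x, y)
    integer, intent(in) :: m, n
    real(dp), intent(in) :: A(m,n), x(n)
    real(dp), intent(out) :: y(m)
    integer :: i, j
    y = 0.0_dp
    do i = 1, n
      do j = 1, m
        y(j) = y(j) + A(j,i)*x(i)
      end do
    end do
  end subroutine optimized

  ! Inner loop runs along a row of A (stride m in memory)
  subroutine naive(A, m, n, x, y)
    integer, intent(in) :: m, n
    real(dp), intent(in) :: A(m,n), x(n)
    real(dp), intent(out) :: y(m)
    integer :: i, j
    y = 0.0_dp
    do j = 1, m
      do i = 1, n
        y(j) = y(j) + A(j,i)*x(i)
      end do
    end do
  end subroutine naive

end program matvec_benchmark
```

Because the result is fed back into x, the matrix has to be square here (m equal to n), which is also what x(1:n) = y1 implies. The measured times depend on the compiler, its flags, and the machine, so no particular ratio should be expected from this sketch.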
Source: https://stackoverflow.com/questions/20904070/fortran-matrix-vector-multiplication-optimization