I am a newbie in programming with OpenMP. I wrote a simple C program to multiply a matrix with a vector. Unfortunately, by comparing execution times I found that the OpenMP is much
Because when OpenMP distributes the work among threads, there is a lot of administration and synchronisation going on to ensure that the values in your shared matrix and vector are not corrupted. They are in fact read-only, but while a human sees that easily, your compiler may not.
Things to try out for pedagogic reasons:
0) What happens if matrix and vector are not shared?
1) Parallelize the inner "j-loop" first, keep the outer "i-loop" serial. See what happens.
2) Do not collect the sum in result[i], but in a variable temp, and assign its contents to result[i] only after the inner loop is finished, to avoid repeated index lookups. Don't forget to init temp to 0 before the inner loop starts.
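Since the original code is not shown here, the following is only a rough sketch of suggestions 1) and 2). It assumes the matrix is stored row-major in a flat array of doubles; the function names matvec_outer and matvec_inner and their parameter lists are made up for illustration. The inner-loop variant needs a reduction clause on temp, otherwise the threads would race on it.

```c
#include <stddef.h>

/* Hypothetical helper: outer i-loop parallel, sum collected in temp
   (suggestion 2). Assumes a row-major n x m matrix in a flat array. */
void matvec_outer(size_t n, size_t m, const double *matrix,
                  const double *vector, double *result)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        double temp = 0.0;  /* declared inside the loop, so private per thread */
        for (size_t j = 0; j < m; j++)
            temp += matrix[(size_t)i * m + j] * vector[j];
        result[i] = temp;   /* one write to result[i] per row */
    }
}

/* Hypothetical helper: outer i-loop serial, inner j-loop parallel
   (suggestion 1). A parallel region is started for every row. */
void matvec_inner(size_t n, size_t m, const double *matrix,
                  const double *vector, double *result)
{
    for (size_t i = 0; i < n; i++) {
        double temp = 0.0;
        /* reduction gives each thread its own partial sum and adds them up */
        #pragma omp parallel for reduction(+:temp)
        for (long j = 0; j < (long)m; j++)
            temp += matrix[i * m + (size_t)j] * vector[j];
        result[i] = temp;
    }
}
```

Compile with OpenMP enabled (for example -fopenmp with GCC) and time both variants against the serial version. For small matrices I would expect the inner-loop variant in particular to be slow, since it pays the thread start-up cost once per row, but measure it on your own machine.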