OpenMP parallel numerical integration (summation) performance

Question

I recently started studying parallel programming and am still at the beginning, so I wanted to try something very simple. Since I am interested in parallel numerical integration, I started with a simple summation code in Fortran:

program par_hello_world

  use omp_lib

  implicit none

  integer, parameter :: bign = 1000000000
  integer :: i
  double precision :: start, finish, start1, finish1, a

  a = 0

  call cpu_time(start)

  !$OMP PARALLEL num_threads(8)
  !$OMP DO REDUCTION(+:a)
  do i = 1, bign
    a = a + sqrt(1.0**5)
  end do
  !$OMP END DO
  !$OMP END PARALLEL

  call cpu_time(finish)

  print*, 'parallel result:'
  print*, a
  print*, (finish-start)

  a = 0

  call cpu_time(start1)

  do i = 1, bign
    a = a + sqrt(1.0**5)
  end do

  call cpu_time(finish1)

  print*, 'sequential result:'
  print*, a
  print*, (finish1-start1)

end program

The code basically simulates a summation. I used the odd expression sqrt(1.0**5) to get a measurable computation time; with just 1, the run time was so small that I could not compare the sequential code with the parallel one. I tried to avoid a race condition by using the REDUCTION clause.

However, I'm getting very strange timing results:

If I raise the number of threads from 2 to 16, the computation time does not decrease; if anything, it increases.
Incredibly, the sequential code also seems to be affected by the number of threads (I really don't understand why!): its time goes up as I raise the number of threads.
I get the correct result for the variable a.

I think I'm doing something very wrong somewhere, but I'm clueless about it...
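
One thing that may be worth checking is the timer itself. The Fortran standard leaves the meaning of cpu_time processor-dependent, and with many compilers it reports CPU time summed over all threads, so it can grow with the thread count even when the elapsed time shrinks. omp_get_wtime, from the same omp_lib module used above, returns elapsed wall-clock time instead. The following is only a minimal sketch for comparing the two timers, not a confirmed diagnosis. It assumes the timer is the culprit, and the program name timing_sketch, the smaller bign, and the loop body sqrt(dble(i)) are illustrative choices that are not in the original code (the loop body is changed so that the compiler cannot reduce it to a constant).

```fortran
program timing_sketch
  use omp_lib
  implicit none

  ! Illustrative size, smaller than in the original code
  integer, parameter :: bign = 100000000
  integer :: i
  double precision :: a, cpu0, cpu1, wall0, wall1

  a = 0.0d0

  call cpu_time(cpu0)        ! processor time; often summed over all threads
  wall0 = omp_get_wtime()    ! elapsed wall-clock time in seconds

  !$OMP PARALLEL DO REDUCTION(+:a)
  do i = 1, bign
    ! Loop-dependent work (an assumption, not the original expression)
    a = a + sqrt(dble(i))
  end do
  !$OMP END PARALLEL DO

  wall1 = omp_get_wtime()
  call cpu_time(cpu1)

  print *, 'max threads:', omp_get_max_threads()
  print *, 'sum        :', a
  print *, 'cpu_time   :', cpu1 - cpu0
  print *, 'wall time  :', wall1 - wall0
end program timing_sketch
```

Compiled with OpenMP enabled (for example, gfortran -fopenmp) and run with different values of OMP_NUM_THREADS, the two printed times can be compared directly; if the wall time drops while cpu_time stays flat or rises, the measurements above would be explained by the choice of timer rather than by the reduction.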

Source: https://stackoverflow.com/questions/27332619/openmp-parallel-numerical-integration-summation-performance

Tags

multithreading

performance

parallel-processing

fortran

numeric