A Fortran timing issue I cannot understand

℡╲_俬逩灬. submitted on 2020-02-12 05:01:28

Question


I wrote (for my class in Numerical Methods for Theoretical Physics) a very simple program for a two-dimensional random walk. Here it is:

program random_walk

implicit none

integer, parameter :: Nwalker = 1000000
integer, parameter :: Nstep   = 100
integer, parameter :: Nmeas   = 10

integer :: posx, posy, move

integer :: is, im, iw
real    :: start_time, stop_time

double precision, dimension(Nmeas) :: dist, r2
real :: rnd

do im = 1, Nmeas
    dist(im) = im*Nstep
    r2(im)   = 0.0
end do

call cpu_time(start_time)
do iw = 1, Nwalker
    posx = 0
    posy = 0
    do im = 1, Nmeas
        do is = 1, Nstep
            call random_number(rnd)
            move = 4*rnd
            if (move == 0) posx = posx + 1
            if (move == 1) posy = posy + 1
            if (move == 2) posx = posx - 1
            if (move == 3) posy = posy - 1
        end do
        r2(im) = r2(im) + posx**2 + posy**2
    end do
end do
r2 = r2 / Nwalker
call cpu_time(stop_time)
do im = 1, Nmeas
    print '(f8.6, "   ", f8.6)', log(dist(im)), log(r2(im))
end do
print '("Time = ", f6.3, " seconds")', stop_time - start_time
end program

In the end it should print 10 rows and 2 columns: the first column is the logarithm of "time" (number of steps), the second column is the logarithm of the average squared distance from the origin. The second column should "on average" be equal to the first. So far so good: the program works and the results are very reasonable. But here is the problem: on my MacBook Pro (2.7 GHz Intel Core i7, compiler gfortran 7.1.0, optimization -O2) it takes on average more than 20 seconds to run. But if I comment out these lines:

! do im = 1, Nmeas
!    print '(f8.6, "   ", f8.6)', log(dist(im)), log(r2(im))
! end do

which come after the stop_time measurement, the running time drops to... less than 6 seconds!?

How is it possible?


Answer 1:


This is quite a typical thing to observe. People hit this when they create artificial computations which only test performance and do not create a useful result. When the result is not printed, the compiler can recognize that it does not need the result for the program output and may completely omit the computation.

To examine it you can add the -fdump-tree-optimized flag, which dumps the compiler's intermediate representation (called GIMPLE) after optimization, and compare the output for the two variants of the source code. The dump is written to a file called yourfilename.f90.something.optimized. I can indeed see a big part missing: basically the whole r2 array and the operations on it are optimized out. You can also compare the generated assembly if you are more comfortable reading that.
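
As a minimal sketch of the effect described in this answer (not the original program): the hypothetical program below times a loop and then prints the accumulated value. The names keep_result_alive, n, acc, t0 and t1 are made up for illustration. Because acc appears in the output, the compiler has to keep the arithmetic inside the timed region; if the checksum line is commented out, an optimizing compiler is free to drop the accumulation, and the measured time may no longer reflect the work you meant to benchmark.

```fortran
program keep_result_alive
    implicit none

    ! Hypothetical loop count, chosen only for illustration.
    integer, parameter :: n = 10000000

    double precision :: acc
    real             :: rnd, t0, t1
    integer          :: i

    acc = 0.0d0

    call cpu_time(t0)
    do i = 1, n
        call random_number(rnd)
        ! int(4*rnd) is 0, 1, 2 or 3; accumulate its square.
        acc = acc + dble(int(4*rnd))**2
    end do
    call cpu_time(t1)

    print '("Time = ", f6.3, " seconds")', t1 - t0

    ! Printing acc makes it part of the observable output, so the
    ! accumulation above cannot be treated as dead code.
    ! Comment this line out, rebuild with
    !     gfortran -O2 -fdump-tree-optimized keep_result_alive.f90
    ! and compare the two *.optimized dumps to see what disappears.
    print '("checksum = ", es12.5)', acc
end program keep_result_alive
```

The accumulation here is cheap, so the timing difference between the two variants may be small; the point is only that a result which is actually used stays in the timed code, whichever compiler and flags you pick.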



Source: https://stackoverflow.com/questions/47583729/a-fortran-timing-issue-i-cannot-understand
