I\'ve got a numpy
script that spends about 50% of its runtime in the following code:
s = numpy.dot(v1, v1)
where
v1 = v[1
Your arrays are not very big, so ATLAS probably isn't doing much. What are your timings for the following Fortran program? Assuming ATLAS isn't doing much, this should give you a sense of how fast dot() could be if there was not any python overhead. With gfortran -O3 I get speeds of 5 +/- 0.5 us.
program test
real*8 :: x(4000), start, finish, s
integer :: i, j
integer,parameter :: jmax = 100000
x(:) = 4.65
s = 0.
call cpu_time(start)
do j=1,jmax
s = s + dot_product(x, x)
enddo
call cpu_time(finish)
print *, (finish-start)/jmax * 1.e6, s
end program test