I recently updated my .f90 code to .f03, and I was expecting to see speedup because my older version involved many allocating and deallocating (7 3D arrays--45x45x45) at eac
There are two questions here:
An important thing to say about the first question is, you may be best testing things for yourself: there are a lot of specific aspects to this.
However, I have quickly knocked something up which may guide you.
module test
implicit none
type t1
real, allocatable :: x(:)
end type t1
contains
function div_fun(f,g) result(q)
type(t1), intent(in) :: f, g
type(t1) q
q%x = f%x/g%x
end function div_fun
subroutine div_sub1(f, g, q)
type(t1), intent(in) :: f, g
type(t1), intent(out) :: q
q%x = f%x/g%x
end subroutine div_sub1
subroutine div_sub2(f, g, q)
type(t1), intent(in) :: f, g
type(t1), intent(inout) :: q
q%x(:) = f%x/g%x
end subroutine div_sub2
end module test
With this, I observed that sometimes there was no significant difference between using a function and a subroutine, and sometimes there was. That is, it depends on compilers, flags, and so on.
However, it is important to note what is happening.
For the function the result requires an allocation, and for the subroutine div_sub1
the intent(out)
argument requires an allocation. [Assigning the function result adds to things - see later.]
In div_sub2
the allocation is re-used (the "result" argument is intent(inout)
) and we suppress an automatic re-allocation by using q%x(:)
. It's this latter part that is important: compilers often suffer overheads for doing the checking about whether resizing is required. This latter part can be tested by changing the intent of q
in div_sub1
to inout
.
[Note that, there is an assumption for this div_sub2
approach that sizes aren't changing; this seems supported by your text.]
To conclude for the first question: check for yourself, but wonder whether you are merely "hiding" allocations by using derived types rather than removing them. And you may get very different answers using parameterized derived types.
Coming to the second question, why are functions commonly used? You'll note that I've looked at very specific cases:
q = div_fun(f,g)
call div_sub2(f,g,q) ! Could be much faster
From the question text and the links (and previous questions you've asked) I'll assume that you have something which overloads the /
operator
interface operator (/)
module procedure div_fun
end interface
allowing
q = f/g ! Could be slower, but looks good.
call div_sub2(f,g,q)
as we note that to be used as a binary operator (see Fortran 2008 7.1.5, 7.1.6) the procedure must be a function. In response to your comment to a previous revision of this answer
aren't the div_sub1 and div_sub2 binary operators just like the div_fun?
the answer is "no", at least in terms of what Fortran defines as being binary operators (link as above). [Also, div_fun
isn't itself a binary operator, it's the combination of the function and the generic interface that form the operation.]
Making the function approach attractive is that a binary operation can be part of an expression:
q = q + alpha*(f/g) ! Very neat
call div_sub2(f,g,temp1)
call mult_sub(alpha, temp1, temp2)
call add_sub(q, temp2, temp3)
call assign_sub(q, temp3)
Using subroutines can get a little messy. The example above could be tidied slightly by handling "in-place" aspects (or specialist subroutines), but this brings me to a final point. Because a function result is evaluated entirely before its later use (including assignment) we have situations like
f = f/g ! or f=div_fun(f,g)
call div_sub2(f,g,f) ! Beware aliasing
To conclude for the second question: performance isn't everything.
[Finally, if you mean you are using .f90
and .f03
file suffixes to denote/govern standard compliance, then you may want to see what people think about that.]