So I have been having a look at some of the magic that is O3
in GCC (well actually I'm compiling using Clang but it's the same with GCC and I'm guessing a large
If you compile with gcc -O3 -fdump-tree-all, you can see that the first dump in which the recursion has been turned into a loop is foo.c.035t.tailr1. This means the same optimisation that handles other tail calls also handles this slightly extended case. Recursion in the form of n * foo(...) or n + foo(...) is not that hard to handle manually (see below), and since it's possible to describe exactly how, the compiler can perform that optimisation automatically.
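To make the n + foo(...) case concrete, here is a minimal sketch. The names sum_rec and sum_acc are made up for illustration and do not come from the question. The pending addition is moved into an accumulator parameter, after which the recursive call is a true tail call.

```c
/* Hypothetical example: sum of the integers 1..n (n >= 0 assumed). */

/* Not a tail call: the addition happens after the recursive call returns. */
int sum_rec(int n)
{
    if (n == 0) return 0;
    return n + sum_rec(n - 1);
}

/* Same result, with the pending addition carried in an accumulator,
   so the recursive call is the last thing the function does. */
int sum_acc(int n, int acc)
{
    if (n == 0) return acc;
    return sum_acc(n - 1, acc + n);
}
```

Calling sum_acc(n, 0) gives the same value as sum_rec(n); the starting accumulator is 0 because that is the identity for addition, just as 1 is for multiplication.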
The optimisation of main is much simpler: inlining can turn this into 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1 * 1, and if all the operands of a multiplication are constants, then the multiplication can be performed at compile time.
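As a rough illustration of that folding, here is a sketch. The body of foo is my assumption about what the question's function looks like (it isn't reproduced here), and foo_10_inlined is a made-up name for what the call foo(10) amounts to after inlining.

```c
/* Assumed shape of the function from the question. */
int foo(int n)
{
    if (n == 0) return 1;
    return n * foo(n - 1);
}

/* What foo(10) boils down to once every call has been inlined:
   only constants remain, so the product can be computed at compile time. */
int foo_10_inlined(void)
{
    return 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1 * 1;
}
```

Both foo(10) and foo_10_inlined() evaluate to 3628800, which is the constant you would expect to find in the optimised main.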
Update: Here's how you can manually remove the recursion from foo, using only steps that can be done automatically. I'm not saying this is the method used by GCC, but it's one realistic possibility.
First, create a helper function. It behaves exactly as foo(n), except that its result is multiplied by an extra parameter f.
int foo_helper(int n, int f);  /* declared up front so foo can call it */

int foo(int n)
{
    return foo_helper(n, 1);
}

/* Returns f * foo(n). */
int foo_helper(int n, int f)
{
    if (n == 0) return f * 1;
    return f * n * foo(n-1);
}
Then, turn recursive calls of foo into recursive calls of foo_helper, and rely on the factor parameter to get rid of the multiplication.
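int foo_helper(int n, int f);  /* declared up front so foo can call it */

int foo(int n)
{
    return foo_helper(n, 1);
}

/* The recursive call is now a tail call: nothing is left to do after it. */
int foo_helper(int n, int f)
{
    if (n == 0) return f;
    return foo_helper(n-1, f * n);
}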
Turn this into a loop:
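int foo_helper(int n, int f);  /* declared up front so foo can call it */

int foo(int n)
{
    return foo_helper(n, 1);
}

int foo_helper(int n, int f)
{
restart:
    if (n == 0) return f;
    {
        /* Compute the new arguments, then jump instead of calling. */
        int newn = n-1;
        int newf = f * n;
        n = newn;
        f = newf;
        goto restart;
    }
}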
Finally, inline foo_helper:
int foo(int n)
{
    int f = 1;
restart:
    if (n == 0) return f;
    {
        int newn = n-1;
        int newf = f * n;
        n = newn;
        f = newf;
        goto restart;
    }
}
(Naturally, this is not the most sensible way to manually write the function.)
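For comparison, a more conventional hand-written version might look like the sketch below. The name foo_loop is made up here, and the loop condition n > 0 is my own choice rather than something taken from the steps above.

```c
/* Hypothetical hand-written equivalent of the goto version. */
int foo_loop(int n)
{
    int f = 1;          /* running product, starts at the identity */
    while (n > 0) {
        f *= n;         /* same update as newf = f * n */
        n--;            /* same update as newn = n - 1 */
    }
    return f;
}
```

For n >= 0 this computes the same value as the goto version. One difference: with n > 0 as the condition it returns 1 for negative n, whereas the versions above would keep going. As with the other versions, the result overflows int for fairly small n (13 and up with a 32-bit int).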