I've lately encountered a lot of functions where gcc generates really bad code on x86. They all fit a pattern of:
if (some_condition) {
/* do something real
I would probably refactor the code to encourage inlining of the simple case. That said, you can use -finline-limit=n to make gcc consider inlining larger functions, or -fomit-frame-pointer -fno-exceptions to minimize the stack frame. (Note that the latter may break debugging and cause C++ exceptions to misbehave badly.)
You probably won't get much out of tweaking compiler options, though, and will have to refactor.
Update
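To explicitly suppress inlining for a single function in gcc, put the attribute before the declarator (gcc rejects attributes placed after the parameter list of a function definition):
__attribute__ ((noinline)) void foo()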
{
...
}
See also How can I tell gcc not to inline a function?
Functions like this will regularly be inlined automatically unless compiled with -O0 (optimization disabled).
In C++ you can hint the compiler using the inline keyword.
If the compiler won't take your hint, you are probably using too many registers/branches inside the function. The situation is almost certainly resolved by extracting the 'complicated' block into its own function.
Update: I noticed you added the fact that they are extern symbols. (Please update the question with that crucial info.) Well, in a sense, with external functions, all bets are off. I cannot really believe that gcc will always inline all of a complex function into a tiny caller simply because it is only called from there. Perhaps you can give some sample code that demonstrates the behaviour, so we can find the proper optimization flags to remedy that?
Also, is this C or C++? In C++ I know it is commonplace to define the trivial decision functions inline (mostly as members defined in the class declaration). This won't give a linkage conflict like it would with simple (extern) C functions.
Also you can have template functions defined that will inline perfectly in all compilation modules without resulting in link conflicts.
I hope you are using C++ because it will give you a ton of options here.
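If it turns out to be plain C, something similar is available with static inline (C99 or GNU C): every translation unit that includes the header gets its own private copy of the small function, so there is no link conflict. A minimal sketch, where counter_t, counter_add and counter_add_slow are names I made up for illustration:

```c
#include <limits.h>

/* Hypothetical type and names, for illustration only. */
typedef struct {
    unsigned value;      /* running total, wraps on overflow */
    unsigned overflows;  /* how many times the total wrapped */
} counter_t;

/* Slow path: in a real project this definition would live in one .c file
   and only its declaration would appear in the header. */
void counter_add_slow(counter_t *c, unsigned n)
{
    c->overflows++;
    c->value += n;       /* unsigned arithmetic wraps modulo UINT_MAX + 1 */
}

/* Header part: the trivial decision is static inline, so each includer
   gets a private copy and the linker never sees a duplicate symbol. */
static inline void counter_add(counter_t *c, unsigned n)
{
    if (n <= UINT_MAX - c->value) {  /* simple case: no overflow */
        c->value += n;
        return;
    }
    counter_add_slow(c, n);          /* complicated case: out-of-line call */
}
```

Callers only ever see the small wrapper; the out-of-line function is reached only in the rare case.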
Seeing as these are external calls, it might be that gcc is treating them as unsafe and preserving registers for the function call (hard to know without seeing the registers that it preserves, including the ones you say 'aren't used'). Out of curiosity, does this excessive register spilling still occur with all optimizations disabled?
Perhaps upgrade your version of gcc? 4.6 has just been released. As far as I understand, it supports "partial inlining". That is, an outer part of a function that is easy to inline gets inlined, and the expensive part is turned into a call. But I have to admit that I haven't tried it myself yet.
Edit: The statement I was referring to from the ChangeLog:
Partial inlining is now supported and enabled by default at -O2 and greater. The feature can be controlled via -fpartial-inlining.
Partial inlining splits functions with short hot path to return. This allows more aggressive inlining of the hot path leading to better performance and often to code size reductions (because cold parts of functions are not duplicated).
...
Inlining when optimizing for size (either in cold regions of a program or when compiling with -Os) was improved to better handle C++ programs with larger abstraction penalty, leading to smaller and faster code.
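For illustration, the kind of function that description seems to target looks roughly like the sketch below: a cheap test that returns early, followed by a bulky cold part, all in one function. The names (lookup_square, cache, cached) are made up, and whether gcc actually splits a given function is up to its heuristics; I'm only inferring the shape from the ChangeLog text.

```c
#include <stddef.h>

#define CACHE_SIZE 64

/* Hypothetical cache, for illustration only. */
static unsigned long cache[CACHE_SIZE];
static unsigned char cached[CACHE_SIZE];

unsigned long lookup_square(size_t i)
{
    /* Short hot path straight to return: the part worth inlining. */
    if (i < CACHE_SIZE && cached[i])
        return cache[i];

    /* Cold part: deliberately clumsy, stands in for "expensive" work. */
    unsigned long result = 0;
    for (size_t k = 0; k < i; k++)
        result += i;                 /* i * i by repeated addition */

    if (i < CACHE_SIZE) {            /* remember the answer for next time */
        cache[i] = result;
        cached[i] = 1;
    }
    return result;
}
```

To see whether it makes a difference for your code, compare the assembly at -O2 with and without -fno-partial-inlining.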
I would do it like this:
static void complex_function() {}
void foo()
{
if(simple_case) {
// do whatever
return;
} else {
complex_function();
}
}
The compiler may insist on inlining complex_function(), in which case you can use the noinline attribute on it.
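A slightly fuller sketch of the same idea, in case it helps. The buffer type and the names buf_push and buf_grow_and_push are invented for the example; the only gcc-specific bit is the noinline attribute, written before the declarator.

```c
#include <stdlib.h>

/* Hypothetical growable int buffer, for illustration only. */
typedef struct {
    int *data;
    size_t len;
    size_t cap;
} buf_t;

/* Complicated case: reallocates, so it needs more registers and stack.
   noinline keeps it out of the caller. Returns 0 on success, -1 on failure. */
__attribute__((noinline))
static int buf_grow_and_push(buf_t *b, int v)
{
    size_t new_cap = b->cap ? b->cap * 2 : 8;
    int *p = realloc(b->data, new_cap * sizeof *p);
    if (p == NULL)
        return -1;                   /* buffer is left unchanged */
    b->data = p;
    b->cap = new_cap;
    b->data[b->len++] = v;
    return 0;
}

/* Simple case: one compare and one store, then return. */
int buf_push(buf_t *b, int v)
{
    if (b->len < b->cap) {
        b->data[b->len++] = v;
        return 0;
    }
    return buf_grow_and_push(b, v);
}
```

Start from a zeroed buf_t (data == NULL, len == cap == 0) and free(data) when done. With this split the common path in buf_push has very little to save or restore, which is the whole point.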