I searched Stack Overflow for the pros and cons of function-like macros vs. inline functions.
I found the following discussion: Pros and Cons of Different macro function
It's the calls to pow() you want to eliminate. This function takes general floating point exponents and is inefficient for raising to integral exponents. Replacing these calls with e.g.
inline double cube(double x)
{
    return x * x * x;
}
is the only thing which will make a significant difference to your performance here.
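For instance (a minimal, self-contained sketch; the 1.5 test value and the printed comparison are just for illustration), a call site that currently uses pow(x, 3.0) simply becomes a call to cube(x):

#include <cmath>
#include <cstdio>

inline double cube(double x)
{
    return x * x * x;   // three multiplications, no exp/log machinery
}

int main()
{
    double x = 1.5;
    double slow = std::pow(x, 3.0);   // general-purpose exponentiation
    double fast = cube(x);            // the inline replacement
    std::printf("%f %f\n", slow, fast);
    return 0;
}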
The best way to answer your question is to benchmark both approaches to see which is actually faster in your application, using your test data. Predictions about performance are notoriously unreliable except at the coarsest levels.
That said, I would expect there to be no significant difference between a macro and a truly inlined function call. In both cases, you should end up with the same assembly code under the hood.
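A minimal benchmarking harness along these lines (the SQUARE_MACRO and square_fn names and the loop count are illustrative assumptions, not anyone's actual code) is usually enough to see whether there is any measurable difference on your own test data:

#include <chrono>
#include <cstdio>

#define SQUARE_MACRO(x) ((x)*(x))
inline double square_fn(double x) { return x * x; }

// Time the same loop once with the macro and once with the inline
// function; with optimization enabled, expect near-identical numbers.
template <typename F>
double time_loop(F f)
{
    auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (int i = 0; i < 10000000; ++i)
        sum += f(static_cast<double>(i));
    auto stop = std::chrono::steady_clock::now();
    std::printf("(sum=%g) ", sum);   // keep the result observable so the loop isn't optimized away
    return std::chrono::duration<double>(stop - start).count();
}

int main()
{
    std::printf("macro:  %f s\n", time_loop([](double x) { return SQUARE_MACRO(x); }));
    std::printf("inline: %f s\n", time_loop([](double x) { return square_fn(x); }));
    return 0;
}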
Macros, including function-like macros, are simple text substitutions, and as such can bite you in the ass if you're not really careful with your parameters. For example, the ever-so-popular SQUARE macro:
#define SQUARE(x) ((x)*(x))
can be a disaster waiting to happen if you call it as SQUARE(i++).
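A minimal sketch of what goes wrong: after preprocessing, the argument is evaluated twice, so the increment happens twice in one expression, and that is undefined behavior.

#define SQUARE(x) ((x)*(x))

int main()
{
    int i = 2;
    int s = SQUARE(i++);   // expands to ((i++)*(i++)): i is modified twice
                           // with no sequencing, which is undefined behavior
    (void)s;
    return 0;
}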
Also, function-like macros have no concept of scope, and don't support local variables; the most popular hack is something like
#define MACRO(S,R,E,C) \
do \
{ \
    double AttractiveTerm = pow((S)/(R),3); \
    double RepulsiveTerm = AttractiveTerm * AttractiveTerm; \
    (C) = 4 * (E) * (RepulsiveTerm - AttractiveTerm); \
} while(0)
which, of course, makes it impossible to assign the result directly: you cannot write something like x = MACRO(a,b,e,c); the value has to come back through the C parameter.
The best bet from a correctness and maintainability standpoint is to make it a function and specify inline. Macros are not functions, and should not be confused with them.
Once you've done that, measure the performance and find where any actual bottleneck is before hacking at it (the call to pow would certainly be a candidate for streamlining).
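A minimal sketch of what the inline-function version of the macro above might look like (the lj_term name is just illustrative; S, R, E map to sigma, r, epsilon, and the result is returned instead of being written into C):

inline double lj_term(double sigma, double r, double epsilon)
{
    double ratio = sigma / r;
    double attractiveTerm = ratio * ratio * ratio;            // (S/R)^3 without calling pow()
    double repulsiveTerm  = attractiveTerm * attractiveTerm;  // (S/R)^6
    return 4.0 * epsilon * (repulsiveTerm - attractiveTerm);
}

A call site then reads c = lj_term(s, r, e); instead of MACRO(s, r, e, c);, and the compiler is still free to inline it.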
As I understand it from some people who write compilers, once the function you want inlined itself calls another function, it is much less likely to be inlined anyway. But that is exactly why you should not use a macro: macros remove information and leave the compiler with far fewer options to optimize. With multi-pass compilers and whole-program optimization, the compiler can tell whether inlining your code would cause a failed branch prediction, a cache miss, or whatever other black magic modern CPUs use to go fast. I think everyone is right to point out that the code above is not optimal anyway, so that is where the focus should be.
If you random-pause this, what you're probably going to see is that 100% (minus epsilon) of the time is inside the pow function, so how it got there makes basically no difference.
Assuming you find that, the first thing to do is get rid of the calls to pow that you found on the stack.
(In general, what pow does is take the log of the first argument, multiply it by the second argument, and take the exp of that, or something that amounts to the same thing. The log and exp could well be done by some kind of series involving a lot of arithmetic. It looks for special cases, of course, but it's still going to take longer than a handful of multiplications would.)
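Roughly, and ignoring the special-case handling, the general path is equivalent to something like this sketch (valid only for positive x; real library implementations are far more careful about accuracy):

#include <cmath>
#include <cstdio>

// What a general pow(x, y) conceptually has to do for an arbitrary
// exponent: take the log, scale it, exponentiate. That is far more
// work than the three multiplications of x * x * x.
double pow_sketch(double x, double y)
{
    return std::exp(y * std::log(x));
}

int main()
{
    std::printf("%f %f\n", pow_sketch(1.5, 3.0), std::pow(1.5, 3.0));
    return 0;
}

Replacing those calls with plain multiplications skips all of that work.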
That alone should give you around an order of magnitude speedup.
Then do the random-pausing again. Now you're going to see something else taking a lot of the time. I can't guess what it will be, and neither can anyone else, but you can probably reduce that too. Just keep doing it until you can't any more.
It may happen along the way that you choose to use a macro, and it might be slightly faster than an inline function. That's for you to judge when you get there.
A macro is not really a function. Whatever you define as a macro gets pasted verbatim into your code by the preprocessor, before the compiler ever sees it. The preprocessor is just a software engineer's tool that enables various abstractions to better structure your code.
A function, inline or otherwise, is something the compiler does know about, and it can make decisions about what to do with it. A user-supplied inline keyword is just a suggestion, and the compiler may override it. It is this overriding that in most cases results in better code.
Another side effect of the compiler being aware of functions is that you can force it to take certain decisions (for example, disabling inlining of your code), which lets you better debug or profile your code. There are probably many other use cases that inline functions enable vs. macros.
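For example (a compiler-specific sketch: the attribute below is GCC/Clang syntax, and MSVC has an equivalent __declspec(noinline)), you can keep a function out-of-line so it shows up as its own frame while profiling or debugging:

#include <cstdio>

// GCC/Clang: forbid inlining of this function so a profiler or debugger
// sees it as a distinct call.
__attribute__((noinline))
double hot_term(double x)
{
    return x * x * x;
}

int main()
{
    std::printf("%f\n", hot_term(1.5));
    return 0;
}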
Macros are extremely powerful though, and to back this up I would cite Google Test and Google Mock. There are many reasons to use macros :D.
Simple mathematical operations that are chained together using functions are often inlined by the compiler, especially if the function is only called once in the translation unit. So I wouldn't be surprised if the compiler made inlining decisions for you, regardless of whether the keyword is supplied or not.
However, if the compiler doesn't, you can manually flatten out segments of your code. If you do flatten it out, perhaps macros will serve as a good abstraction; after all, they present similar semantics to a "real" function.
The Crux
So, do you want the compiler to be aware of certain logical boundaries so it can produce better physical code, or do you want to force decisions on the compiler by flattening the code out manually or by using macros? The industry leans towards the former.
I would lean towards using macros in this case, just because it's quick and dirty, without having to learn much more. However, as macros are a software engineering abstraction, and because you are concerned with the code the compiler generates, if the problem were to become slightly more advanced I would use C++ templates, as they were designed for the concerns you are pondering.
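For instance, a small compile-time power template along these lines (a sketch, with the ipow name chosen purely for illustration) keeps the call sites readable while letting the compiler expand the multiplications with no call to pow:

// A compile-time integer power: ipow<3>(x) instantiates, via the recursion
// below, into what is effectively x * x * x once everything is inlined.
template <unsigned N>
inline double ipow(double x)
{
    return x * ipow<N - 1>(x);
}

template <>
inline double ipow<0>(double)
{
    return 1.0;   // base case: anything to the power 0 is 1
}

int main()
{
    double sigma = 1.0, r = 1.5;
    double attractiveTerm = ipow<3>(sigma / r);   // replaces pow(sigma / r, 3)
    return attractiveTerm > 0.0 ? 0 : 1;
}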