Say I have a virtual function call foo() on an abstract base class pointer, mypointer->foo(). When my app starts up, based on the contents of a file, it chooses to instantiate a
So, what you basically want to do is convert runtime polymorphism into compile time polymorphism. Now you still need to build your app so that it can handle multiple "cases", but once it's decided which case is applicable to a run, that's it for the duration.
Here's a model of the runtime polymorphism case:
struct Base {
    virtual void doit(int&) = 0;
};
struct Foo : public Base {
    virtual void doit(int& n) { --n; }
};
struct Bar : public Base {
    virtual void doit(int& n) { ++n; }
};
void work(Base* it, int& n) {
    for (unsigned int i = 0; i < 4000000000u; i++) it->doit(n);
}
int main(int argc, char**) {
    int n = 0;
    if (argc > 1)
        work(new Foo, n);
    else
        work(new Bar, n);
    return n;
}
This takes ~14s to execute on my Core2, compiled with gcc 4.3.2 (32 bit Debian) using the -O3 option.
Now suppose we replace work with a templated version (templated on the concrete type it's going to be working on):
template <typename T> void work(T* it, int& n) {
    for (unsigned int i = 0; i < 4000000000u; i++) it->T::doit(n);
}
main doesn't actually need to be updated, but note that the two calls to work now trigger instantiations of, and calls to, two different type-specific functions (cf. the single polymorphic function previously).
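For reference, a self-contained sketch of the templated variant might look like the following. It reuses Base, Foo and Bar from the model above; the virtual destructor, the null check, the kIterations constant (deliberately far smaller than 4000000000u so that it finishes instantly even without optimisation) and the run helper standing in for main are additions made purely for illustration.

```cpp
#include <cassert>

struct Base {
    virtual void doit(int&) = 0;
    virtual ~Base() {}  // illustrative addition; not part of the model above
};
struct Foo : public Base {
    virtual void doit(int& n) { --n; }
};
struct Bar : public Base {
    virtual void doit(int& n) { ++n; }
};

// Assumption: a small illustrative loop count instead of 4000000000u.
const unsigned int kIterations = 1000u;

template <typename T>
void work(T* it, int& n) {
    assert(it != 0);
    // The qualified name T::doit is bound at compile time, so there is no
    // vtable lookup and the compiler is free to inline the body.
    for (unsigned int i = 0; i < kIterations; i++) it->T::doit(n);
}

// Hypothetical stand-in for main: the Foo/Bar decision is made once,
// and each branch calls a different instantiation of work.
int run(bool use_foo) {
    int n = 0;
    if (use_foo) {
        Foo foo;
        work(&foo, n);  // T deduced as Foo, so this calls work<Foo>
    } else {
        Bar bar;
        work(&bar, n);  // T deduced as Bar, so this calls work<Bar>
    }
    return n;
}
```

The runtime decision still exists (the if in run), but it is taken once, outside the hot loop, rather than on every iteration.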
Hey presto, it runs in 0.001s. Not a bad speed-up factor for a two-line change! However, note that the massive speed-up is entirely due to the compiler: once the possibility of runtime polymorphism in the work function is eliminated, it simply optimizes away the loop and compiles the result directly into the code. But that actually makes an important point: in my experience the main gains from using this sort of trick come from the opportunities for improved inlining and optimisation they allow the compiler when a less-polymorphic, more specific function is generated, not from the mere removal of vtable indirection (which really is very cheap).
But I really don't recommend doing stuff like this unless profiling clearly shows that runtime polymorphism is hurting your performance. It'll also bite you as soon as someone subclasses Foo or Bar and tries to pass that into a function actually intended for its base: the qualified it->T::doit call bypasses any override in the subclass.
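The subclassing pitfall can be sketched as follows. FooTwice, work_virtual, work_static, via_virtual and via_static are hypothetical names invented for this illustration, and the loop count of 10 is arbitrary.

```cpp
#include <cassert>

struct Base {
    virtual void doit(int&) = 0;
    virtual ~Base() {}
};
struct Foo : public Base {
    virtual void doit(int& n) { --n; }
};

// Hypothetical subclass added later by someone else.
struct FooTwice : public Foo {
    virtual void doit(int& n) { n -= 2; }
};

// Ordinary virtual dispatch: the dynamic type decides which doit runs.
void work_virtual(Base* it, int& n) {
    assert(it != 0);
    for (unsigned int i = 0; i < 10u; i++) it->doit(n);
}

// Templated version: the static type T decides which doit runs.
template <typename T>
void work_static(T* it, int& n) {
    assert(it != 0);
    for (unsigned int i = 0; i < 10u; i++) it->T::doit(n);
}

int via_virtual() {
    FooTwice obj;
    Foo* p = &obj;       // static type Foo, dynamic type FooTwice
    int n = 0;
    work_virtual(p, n);  // calls FooTwice::doit ten times
    return n;
}

int via_static() {
    FooTwice obj;
    Foo* p = &obj;       // same object, same pointer type
    int n = 0;
    work_static(p, n);   // T deduced as Foo, so Foo::doit runs instead
    return n;
}
```

Here via_virtual() returns -20 while via_static() returns -10: the templated version silently ignores the override, and nothing in the call site hints at the difference.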
You might find this related question interesting too.