Well-known solution for avoiding the slowness of dynamic_cast?

问题

I needed run-time polymorphism, so I used dynamic_cast.
But now I had two problems -- dynamic_cast was extremely slow! (Scroll down for benchmark.)

Long story short, I ended up solving the problem this way, using static_cast:

struct Base
{
    virtual ~Base() { }
    virtual int type_id() const = 0;

    template<class T>
    T *as()
    { return this->type_id() == T::ID ? static_cast<T *>(this) : 0; }

    template<class T>
    T const *as() const
    { return this->type_id() == T::ID ? static_cast<T const *>(this) : 0; }
};

struct Derived : public Base
{
    enum { ID = __COUNTER__ };  // warning: can cause ABI incompatibility
    int type_id() const { return ID; }
};

int main()
{
    Base const &value = Derived();
    Derived const *p = value.as<Derived>();  // "static" dynamic_cast
}

But I'm surely not the first person to run into this problem, so I thought it might be worth asking:

Instead of coming up with a home-baked solution like this, is there a well-known pattern/library I can use for solving this problem in the future?

Sample benchmark

To get a feel for what I'm talking about, try the code below -- dynamic_cast was approximately 15 times slower than a mere virtual call on my machine (110 ms. against 1620 ms. with the code below):

#include <cstdio>
#include <ctime>

struct Base { virtual unsigned vcalc(unsigned i) const { return i * i + 1; } };
struct Derived1 : public Base 
{ unsigned vcalc(unsigned i) const { return i * i + 2; } };
struct Derived2 : public Derived1
{ unsigned vcalc(unsigned i) const { return i * i + 3; } };

int main()
{
    Base const &foo = Derived2();
    size_t const COUNT = 50000000;
    {
        clock_t start = clock();
        unsigned n = 0;
        for (size_t i = 0; i < COUNT; i++)
            n = foo.vcalc(n);
        printf("virtual call: %d ms (result: %u)\n",
            (int)((clock() - start) * 1000 / CLOCKS_PER_SEC), n);
        fflush(stdout);
    }
    {
        clock_t start = clock();
        unsigned n = 0;
        for (size_t i = 0; i < COUNT; i++)
            n = dynamic_cast<Derived1 const &>(foo).vcalc(n);
        printf("virtual call after dynamic_cast: %d ms (result: %u)\n",
            (int)((clock() - start) * 1000 / CLOCKS_PER_SEC), n);
        fflush(stdout);
    }
    return 0;
}

When I simply remove the word virtual and change dynamic_cast to static_cast, I get a 79 ms running time -- so a virtual call is only slower than a static call by ~25%!

回答1:

Most uses of dynamic_cast can be replaced with double dispatch (aka. visitor pattern). That would amount to two virtual calls, which by your benchmark is still 7.5 times faster than dynamic_cast.

回答2:

You may be interested in this constant time implementation: http://www.stroustrup.com/isorc2008.pdf

Also note that many upcasts may be simplified under specific constraints -- for example, if you do not use multiple inheritance, use only shallow inheritance, or otherwise guarantee that the type will have no common ancestors, some evaluations can be short circuited and an exhaustive evaluation (as provided by dynamic_cast) need not be performed.

As usual, profile your implementation against your vendor's implementation for your given use case and actual class hierarchies.

来源：https://stackoverflow.com/questions/12344394/well-known-solution-for-avoiding-the-slowness-of-dynamic-cast

标签

c++

performance

dynamic-cast