What are the actual runtime performance costs of dynamic dispatch?

后端 未结 1 1019
盖世英雄少女心
盖世英雄少女心 2020-12-09 19:35

There\'s some background on this topic in the Rust book section on static and dynamic dispatch, but the tl;dr is that calling a method on a trait reference and a few other v

相关标签:
1条回答
  • 2020-12-09 20:10

    Disclaimer: the question is fairly open-ended, therefore this answer might be incomplete. Treat it with an even bigger grain of salt than you would normally.

    Rust uses a simple "virtual table" to implement dynamic dispatch. This strategy is also used in C++ for which you can see a study here. The study is a bit dated though.

    The cost of indirection

    Virtual dispatch induces indirection, this has a cost for multiple reasons:

    • indirection is opaque: this inhibits inlining and constant propagation which are key enablers for many compiler optimizations
    • indirection has a runtime cost: if incorrectly predicted, you are looking at pipeline stalls and expensive memory fetches

    Optimizing indirection

    Compilers, however, muddies the water by trying their best at optimizing indirection away.

    • devirtualization: sometimes the compiler can resolve the virtual table look-up at compile time (usually, because it knows the concrete type of the object); if so, it can therefore use a regular function call rather than an indirect one, and optimize away the indirection
    • probabilistic devirtualization: last year Honza Hubička introduced a new optimization in gcc (read the 5-part series as it is very instructive). The gist of the strategy is to build the inheritance graph to make an educated guess at the potential type(s), and then use a pattern like if v.hasType(A) { v.A::call() } elif v.hasType(B) { v.B::call() } else { v.virtual-call() }; special-casing the most likely types means regular calls in this case, and therefore inlined/constant-propagated/full-goodies calls.

    This latter strategy could be fairly interesting in Rust due to the coherence rules and privacy rules, as it should have more cases where the full "inheritance" graph is provably known.

    The cost of monomorphization

    In Rust, you can use compile-time polymorphism instead of run-time polymorphism; the compiler will emit one version of the function for each unique combination of compile-time parameters it takes. This, itself, has a cost:

    • compilation time cost: more code to be produced, more code to be optimized
    • binary size cost: the produced binary will end-up being larger, a typical size/speed trade-off
    • run-time cost: possibly, the larger code size might lead to cache misses at CPU level

    The compiler might be able to merge together specialized functions that end up having the same implementation (because of phantom types, for example), however it is still more than likely than the produced binaries (executables and libraries) will end up larger.

    As usual with performance, you have to measure in your case what is more beneficial.

    0 讨论(0)
提交回复
热议问题