Template trick to optimize out allocations

前端 未结 3 699
南旧
南旧 2021-02-03 11:20

I have:

struct DoubleVec {
  std::vector data;
};

DoubleVec operator+(const DoubleVec& lhs, const DoubleVec& rhs) {
  DoubleVec ans(lhs.si         


        
相关标签:
3条回答
  • 2021-02-03 11:37

    Have a look at Boost.Proto, which is a library for writing EDSL (embedded domain specific languages) directly in C++. There is even an example showing exactly what you need.

    0 讨论(0)
  • 2021-02-03 11:38

    http://codeidol.com/cpp/cpp-template-metaprogramming/Domain-Specific-Embedded-Languages/-10.5.-Blitz-and-Expression-Templates/

    If we had to boil the problem solved by Blitz++ down to a single sentence, we'd say, "A naive implementation of array math is horribly inefficient for any interesting computation." To see what we mean, take the boring statement

    x = a + b + c;
    

    The problem here is that the operator+ signature above is just too greedy: It tries to evaluate a + b just as soon as it can, rather than waiting until the whole expression, including the addition of c, is available.

    In the expression's parse tree, evaluation starts at the leaves and proceeds upwards to the root. What's needed here is some way of delaying evaluation until the library has all of the expression's parts: that is, until the assignment operator is executed. The stratagem taken by Blitz++ is to build a replica of the compiler's parse tree for the whole expression, allowing it to manage evaluation from the top down

    This can't be any ordinary parse tree, though: Since array expressions may involve other operations like multiplication, which require their own evaluation strategies, and since expressions can be arbitrarily large and nested, a parse tree built with nodes and pointers would have to be traversed at runtime by the Blitz++ evaluation engine to discover its structure, thereby limiting performance. Furthermore, Blitz++ would have to use some kind of runtime dispatching to handle the different combinations of operation types, again limiting performance.

    Instead, Blitz++ builds a compile-time parse tree out of expression templates. Here's how it works in a nutshell: Instead of returning a newly computed Array, operators just package up references to their arguments in an Expression instance, labeled with the operation:

    // operation tags
    struct plus; struct minus;
    
    // expression tree node
    template <class L, class OpTag, class R>
    struct Expression
    {
        Expression(L const& l, R const& r)
          : l(l), r(r) {}
    
        float operator[](unsigned index) const;
    
        L const& l;
        R const& r;
    };
    
    // addition operator
    template <class L, class R>
    Expression<L,plus,R> operator+(L const& l, R const& r)
    {
        return Expression<L,plus,R>(l, r);
    }
    

    Notice that when we write a + b, we still have all the information needed to do the computationit's encoded in the type Expressionand the data is accessible through the expression's stored references. When we write a + b + c, we get a result of type:

    Expression<Expression<Array,plus,Array>,plus,Array>
    
    0 讨论(0)
  • 2021-02-03 11:52

    Yep, that's exactly what expression templates (see http://www.drdobbs.com/184401627 or http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Expression-template for example) are for.

    The idea is to make operator+ return some kind of proxy object which represents the expression tree to be evaluated. Then operator= is written to take such an expression tree and evaluate it all at once, avoiding the creation of temporaries, and applying any other optimizations that may be applicable.

    0 讨论(0)
提交回复
热议问题