Are the days of passing const std::string & as a parameter over?


I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone.

13 answers
  • std::string is not Plain Old Data (POD), and its raw size is not the most relevant factor. For example, if you pass in a string that is longer than the SSO capacity and therefore allocated on the heap, I would expect the copy constructor not to copy the SSO storage.

    This is recommended because inval is constructed from the argument expression, and is therefore always moved or copied as appropriate. There is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.
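A minimal sketch of the pattern described above, assuming the function really does need to own its argument. The `Person` class, `set_name`, and `demo_set_name` are hypothetical names chosen for illustration; only the parameter name `inval` comes from the discussion.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical example type: the setter needs ownership of its argument.
class Person {
public:
    // Taken by value: an lvalue argument is copied into `inval`, an rvalue
    // argument is moved into it; either way it is then moved into the member.
    void set_name(std::string inval) { name_ = std::move(inval); }
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Usage sketch: the caller decides whether to copy or to move.
void demo_set_name() {
    Person p;
    std::string s = "a name long enough to live on the heap rather than in SSO storage";
    const std::string expected = s;

    p.set_name(s);              // copies: s is still needed afterwards
    assert(p.name() == expected);
    assert(s == expected);      // the caller's string is untouched

    p.set_name(std::move(s));   // moves: s is left valid but unspecified
    assert(p.name() == expected);
}
```

If the function only reads the string and never keeps it, this by-value form buys nothing, which is the caveat in the last sentence above.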

  • 2020-11-22 17:20

    Are the days of passing const std::string & as a parameter over?

    No. Many people take this advice (including Dave Abrahams) beyond the domain it applies to, and simplify it to apply to all std::string parameters. Always passing std::string by value is not a "best practice" for any and all arbitrary parameters and applications, because the optimizations these talks and articles focus on apply only to a restricted set of cases.

    If you're returning a value, mutating the parameter, or taking the value, then passing by value could save expensive copying and offer syntactical convenience.

    As ever, passing by const reference saves much copying when you don't need a copy.
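As a small illustration of the read-only case, here is a sketch of a function that only inspects its argument. The names `count_char` and `demo_count_char` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical read-only helper: it only inspects the string, so taking a
// const reference avoids copying the character data of an lvalue argument.
std::size_t count_char(const std::string& text, char wanted) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == wanted) ++n;
    }
    return n;
}

// Usage sketch.
void demo_count_char() {
    const std::string word = "banana";
    assert(count_char(word, 'a') == 3);   // no copy of `word` is made
    assert(count_char(word, 'z') == 0);
}
```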

    Now to the specific example:

    "However inval is still quite a lot larger than the size of a reference (which is usually implemented as a pointer). This is because a std::string has various components including a pointer into the heap and a member char[] for short string optimization. So it seems to me that passing by reference is still a good idea. Can anyone explain why Herb might have said this?"

    If stack size is a concern (and assuming this is not inlined/optimized), return_val + inval > return_val. In other words, peak stack usage can be reduced by passing by value here (note: this oversimplifies ABIs). Meanwhile, passing by const reference can disable the optimizations. The primary reason here is not to avoid stack growth, but to ensure the optimization can be performed where it is applicable.

    The days of passing by const reference aren't over; the rules are just more complicated than they once were. If performance is important, you'll be wise to consider how you pass these types, based on the details of your implementations.

  • 2020-11-22 17:26

    As @JDługosz points out in the comments, Herb gives other advice in another (later?) talk, see roughly from here: https://youtu.be/xnqTKD8uD64?t=54m50s.

    His advice boils down to only using value parameters for a function f that takes so-called sink arguments, assuming you will move construct from these sink arguments.

    This general approach only adds the overhead of a move constructor for both lvalue and rvalue arguments compared to an optimal implementation of f tailored to lvalue and rvalue arguments respectively. To see why this is the case, suppose f takes a value parameter, where T is some copy and move constructible type:

    void f(T x) {
      T y{std::move(x)};
    }
    

    Calling f with an lvalue argument will result in a copy constructor being called to construct x, and a move constructor being called to construct y. On the other hand, calling f with an rvalue argument will cause a move constructor to be called to construct x, and another move constructor to be called to construct y.

    In general, the optimal implementation of f for lvalue arguments is as follows:

    void f(const T& x) {
      T y{x};
    }
    

    In this case, only one copy constructor is called to construct y. The optimal implementation of f for rvalue arguments is, again in general, as follows:

    void f(T&& x) {
      T y{std::move(x)};
    }
    

    In this case, only one move constructor is called to construct y.

    So a sensible compromise is to take a value parameter and have one extra move constructor call for either lvalue or rvalue arguments with respect to the optimal implementation, which is also the advice given in Herb's talk.
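The constructor counts discussed above can be checked with an instrumented type. This is only a sketch: `Counted`, `f_by_value`, `f_overloaded`, and `demo_counts` are hypothetical names, and the rvalue cases use `std::move` on a named object, because a prvalue temporary passed to a by-value parameter typically has its first move elided.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type that counts how it is constructed.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Sink function taking its parameter by value, like f(T x) above.
void f_by_value(Counted x) { Counted y{std::move(x)}; (void)y; }

// The overload pair that is optimal for each value category.
void f_overloaded(const Counted& x) { Counted y{x}; (void)y; }
void f_overloaded(Counted&& x) { Counted y{std::move(x)}; (void)y; }

void demo_counts() {
    Counted c;

    Counted::reset();
    f_by_value(c);               // copy into x, then move into y
    assert(Counted::copies == 1 && Counted::moves == 1);

    Counted::reset();
    f_by_value(std::move(c));    // move into x, then move into y
    assert(Counted::copies == 0 && Counted::moves == 2);

    Counted::reset();
    f_overloaded(c);             // one copy, straight into y
    assert(Counted::copies == 1 && Counted::moves == 0);

    Counted::reset();
    f_overloaded(std::move(c));  // one move, straight into y
    assert(Counted::copies == 0 && Counted::moves == 1);
}
```

In both value categories the by-value version costs exactly one move more than the matching overload.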

    As @JDługosz pointed out in the comments, passing by value only makes sense for functions that will construct some object from the sink argument. When you have a function f that copies its argument, the pass-by-value approach will have more overhead than a general pass-by-const-reference approach. The pass-by-value approach for a function f that retains a copy of its parameter will have the form:

    void f(T x) {
      T y{...};
      ...
      y = std::move(x);
    }
    

    In this case, there is a copy construction and a move assignment for an lvalue argument, and a move construction and move assignment for an rvalue argument. The optimal case for an lvalue argument is:

    void f(const T& x) {
      T y{...};
      ...
      y = x;
    }
    

    This boils down to an assignment only, which is potentially much cheaper than the copy constructor plus move assignment required for the pass-by-value approach. The reason for this is that the assignment might reuse existing allocated memory in y, and therefore prevent (de)allocations, whereas the copy constructor will usually allocate memory.

    For an rvalue argument, the optimal implementation of f that retains a copy has the form:

    void f(T&& x) {
      T y{...};
      ...
      y = std::move(x);
    }
    

    So, only a move assignment in this case. Passing an rvalue to the version of f that takes a const reference costs only a copy assignment instead of a move assignment. So, relatively speaking, the const-reference version of f is the preferable general implementation in this case.
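The operation counts for the "retain a copy" case can be sketched the same way. `Tracked`, `Holder`, and the `set_*` functions are hypothetical names. The counters only show which operations run; they cannot show the buffer reuse that makes copy assignment cheap for a real std::string.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type counting constructions and assignments.
struct Tracked {
    static int copy_ctor, move_ctor, copy_assign, move_assign;
    Tracked() = default;
    Tracked(const Tracked&) { ++copy_ctor; }
    Tracked(Tracked&&) noexcept { ++move_ctor; }
    Tracked& operator=(const Tracked&) { ++copy_assign; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++move_assign; return *this; }
    static void reset() { copy_ctor = move_ctor = copy_assign = move_assign = 0; }
};
int Tracked::copy_ctor = 0;
int Tracked::move_ctor = 0;
int Tracked::copy_assign = 0;
int Tracked::move_assign = 0;

// Hypothetical holder that retains a copy in an already constructed member.
struct Holder {
    Tracked y;
    void set_value(Tracked x) { y = std::move(x); }   // pass by value
    void set_cref(const Tracked& x) { y = x; }        // pass by const reference
    void set_rref(Tracked&& x) { y = std::move(x); }  // rvalue overload
};

void demo_assign() {
    Holder h;
    Tracked t;

    Tracked::reset();
    h.set_value(t);             // lvalue: copy construction + move assignment
    assert(Tracked::copy_ctor == 1 && Tracked::move_assign == 1);

    Tracked::reset();
    h.set_cref(t);              // lvalue: a single copy assignment
    assert(Tracked::copy_assign == 1 && Tracked::copy_ctor == 0);

    Tracked::reset();
    h.set_cref(std::move(t));   // rvalue binds to const&: still a copy assignment
    assert(Tracked::copy_assign == 1 && Tracked::move_assign == 0);

    Tracked::reset();
    h.set_rref(std::move(t));   // rvalue: a single move assignment
    assert(Tracked::move_assign == 1 && Tracked::move_ctor == 0);
}
```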

    So in general, for the optimal implementation, you will need to overload or do some kind of perfect forwarding as shown in the talk. The drawback of overloading on the value category of each argument is a combinatorial explosion in the number of overloads as the number of parameters of f grows. Perfect forwarding has the drawback that f becomes a function template, which prevents making it virtual, and results in significantly more complex code if you want to get it 100% right (see the talk for the gory details).
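A rough sketch of the perfect-forwarding alternative follows. It is not the code from the talk: `Label`, `set_text`, and `demo_forwarding` are hypothetical names, and the `enable_if` constraint is a deliberately loose one that accepts anything assignable to a std::string.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class with a forwarding setter.
class Label {
public:
    // Loose constraint: accept any argument that can be assigned to a std::string.
    template <class S,
              class = typename std::enable_if<
                  std::is_assignable<std::string&, S&&>::value>::type>
    void set_text(S&& s) {
        text_ = std::forward<S>(s);   // copy-assigns lvalues, move-assigns rvalues
    }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

void demo_forwarding() {
    Label label;
    std::string s = "kept by the caller";

    label.set_text(s);                          // lvalue: copy assignment
    assert(label.text() == s);

    label.set_text(std::string("temporary"));   // rvalue: move assignment
    assert(label.text() == "temporary");

    label.set_text("literal");                  // assigned directly from the char array
    assert(label.text() == "literal");
}
```

Because `set_text` is a member template it cannot be declared virtual, which is the drawback mentioned above.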

  • 2020-11-22 17:27

    Herb Sutter is still on record, along with Bjarne Stroustrup, in recommending const std::string& as a parameter type; see https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-in .

    There is a pitfall not mentioned in any of the other answers here: if you pass a string literal to a const std::string& parameter, the reference binds to a temporary string, created on the fly to hold the characters of the literal. If you then save that reference, it will dangle once the temporary string is destroyed. To be safe, you must save a copy, not the reference. The problem stems from the fact that string literals have type const char[N] and must be converted to std::string.

    The code below illustrates the pitfall and the workaround, along with a minor efficiency option: overloading with a const char* constructor, as described in the question "Is there a way to pass a string literal as reference in C++".

    (Note: Sutter & Stroustrup advise that if you keep a copy of the string, you should also provide an overloaded function with a && parameter and std::move() it.)

    #include <string>
    #include <iostream>
    class WidgetBadRef {
    public:
        WidgetBadRef(const std::string& s) : myStrRef(s)  // copy the reference...
        {}
    
        const std::string& myStrRef;    // might be a reference to a temporary (oops!)
    };
    
    class WidgetSafeCopy {
    public:
        WidgetSafeCopy(const std::string& s) : myStrCopy(s)
                // constructor for string references; copy the string
        {std::cout << "const std::string& constructor\n";}
    
        WidgetSafeCopy(const char* cs) : myStrCopy(cs)
                // constructor for string literals (and char arrays);
                // for minor efficiency only;
                // create the std::string directly from the chars
        {std::cout << "const char * constructor\n";}
    
        const std::string myStrCopy;    // save a copy, not a reference!
    };
    
    int main() {
        WidgetBadRef w1("First string");
        WidgetSafeCopy w2("Second string"); // uses the const char* constructor, no temp string
        WidgetSafeCopy w3(w2.myStrCopy);    // uses the const std::string& constructor
        std::cout << w1.myStrRef << "\n";   // undefined behavior: dangling reference (garbage out)
        std::cout << w2.myStrCopy << "\n";  // OK
        std::cout << w3.myStrCopy << "\n";  // OK
    }
    

    OUTPUT:

    const char * constructor
    const std::string& constructor
    
    Second string
    Second string
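The note about adding a && overload when a copy is kept could look roughly like this. `WidgetMoveCopy` and `demo_widget` are hypothetical names, modeled on WidgetSafeCopy above, and this sketch is not taken from the guidelines themselves.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical variant of WidgetSafeCopy with an rvalue overload.
class WidgetMoveCopy {
public:
    // Lvalue arguments: copy the caller's string.
    explicit WidgetMoveCopy(const std::string& s) : myStrCopy(s) {}

    // Rvalue arguments (including the temporary made from a literal):
    // move the string's contents instead of copying them.
    explicit WidgetMoveCopy(std::string&& s) : myStrCopy(std::move(s)) {}

    const std::string myStrCopy;    // owns its characters, so it cannot dangle
};

void demo_widget() {
    std::string s = "kept by the caller";

    WidgetMoveCopy a(s);                         // const& overload: s is copied
    assert(a.myStrCopy == s);

    WidgetMoveCopy b{std::string("temporary")};  // && overload: moved
    assert(b.myStrCopy == "temporary");

    WidgetMoveCopy c("literal");                 // temporary std::string, then moved
    assert(c.myStrCopy == "literal");
}
```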
    
  • 2020-11-22 17:29

    IMO, passing std::string by const reference is a quick local optimization, while passing by value may (or may not) be a better global optimization.

    So the answer is: it depends on circumstances:

    1. If you write all the code, from the outermost to the innermost functions, and know what it does, you can use const std::string &.
    2. If you write library code, or rely heavily on library code that passes strings around, you likely gain more in a global sense by trusting std::string's copy constructor behavior.
  • 2020-11-22 17:31

    The problem is that "const" is a non-granular qualifier. What is usually meant by "const string ref" is "don't modify this string", not "don't modify the reference count". There is simply no way, in C++, to say which members are "const". They either all are, or none of them are.

    In order to hack around this language issue, STL could allow "C()" in your example to make a move-semantic copy anyway, and dutifully ignore the "const" with regard to the reference count (mutable). As long as it was well-specified, this would be fine.

    Since STL doesn't, I have a version of a string that const_casts away the constness of the reference counter (there is no way to retroactively make something mutable in a class hierarchy), and, lo and behold, you can freely pass cmstrings as const references and make copies of them in deep functions, all day long, with no leaks or issues.

    Since C++ offers no "derived class const granularity" here, writing up a good specification and making a shiny new "const movable string" (cmstring) object is the best solution I've seen.
