I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector
and std::string
by const &
are largely gon
See “Herb Sutter "Back to the Basics! Essentials of Modern C++ Style”. Among other topics, he reviews the parameter passing advice that’s been given in the past, and new ideas that come in with C++11 and specifically looks at the idea of passing strings by value.
The benchmarks show that passing std::string
s by value, in cases where the function will copy it in anyway, can be significantly slower!
This is because you are forcing it to always make a full copy (and then move into place), while the const&
version will update the old string which may reuse the already-allocated buffer.
See his slide 27: For “set” functions, option 1 is the same as it always was. Option 2 adds an overload for rvalue reference, but this gives a combinatorial explosion if there are multiple parameters.
It is only for “sink” parameters where a string must be created (not have its existing value changed) that the pass-by-value trick is valid. That is, constructors in which the parameter directly initializes the member of the matching type.
If you want to see how deep you can go in worrying about this, watch Nicolai Josuttis’s presentation and good luck with that (“Perfect — Done!” n times after finding fault with the previous version. Ever been there?)
This is also summarized as ⧺F.15 in the Standard Guidelines.
I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.
Here is code to measure what is being asked:
#include <iostream>
struct string
{
string() {}
string(const string&) {std::cout << "string(const string&)\n";}
string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
string(string&&) {std::cout << "string(string&&)\n";}
string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif
};
#if PROCESS == 1
string
do_something(string inval)
{
// do stuff
return inval;
}
#elif PROCESS == 2
string
do_something(const string& inval)
{
string return_val = inval;
// do stuff
return return_val;
}
#if (__has_feature(cxx_rvalue_references))
string
do_something(string&& inval)
{
// do stuff
return std::move(inval);
}
#endif
#endif
string source() {return string();}
int main()
{
std::cout << "do_something with lvalue:\n\n";
string x;
string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
std::cout << "\ndo_something with xvalue:\n\n";
string u = do_something(std::move(x));
#endif
std::cout << "\ndo_something with prvalue:\n\n";
string v = do_something(source());
}
For me this outputs:
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
string(string&&)
do_something with xvalue:
string(string&&)
string(string&&)
do_something with prvalue:
string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:
string(const string&)
do_something with xvalue:
string(string&&)
do_something with prvalue:
string(string&&)
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
| | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
| p2 | 1/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.
The reason Herb said what he said is because of cases like this.
Let's say I have function A
which calls function B
, which calls function C
. And A
passes a string through B
and into C
. A
does not know or care about C
; all A
knows about is B
. That is, C
is an implementation detail of B
.
Let's say that A is defined as follows:
void A()
{
B("value");
}
If B and C take the string by const&
, then it looks something like this:
void B(const std::string &str)
{
C(str);
}
void C(const std::string &str)
{
//Do something with `str`. Does not store it.
}
All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C
takes a const&
because it doesn't store the string. It simply uses it.
Now, I want to make one simple change: C
needs to store the string somewhere.
void C(const std::string &str)
{
//Do something with `str`.
m_str = str;
}
Hello, copy constructor and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copy-constructing, right? And A
passes a temporary; there's no reason why C
should have to copy the data. It should just abscond with what was given to it.
Except it can't. Because it takes a const&
.
If I change C
to take its parameter by value, that just causes B
to do the copy into that parameter; I gain nothing.
So if I had just passed str
by value through all of the functions, relying on std::move
to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?
It depends on your use case. How much do you hate memory allocations?
This highly depends on the compiler's implementation.
However, it also depends on what you use.
Lets consider next functions :
bool foo1( const std::string v )
{
return v.empty();
}
bool foo2( const std::string & v )
{
return v.empty();
}
These functions are implemented in a separate compilation unit in order to avoid inlining. Then :
1. If you pass a literal to these two functions, you will not see much difference in performances. In both cases, a string object has to be created
2. If you pass another std::string object, foo2
will outperform foo1
, because foo1
will do a deep copy.
On my PC, using g++ 4.6.1, I got these results :
Unless you actually need a copy it's still reasonable to take const &
. For example:
bool isprint(std::string const &s) {
return all_of(begin(s),end(s),(bool(*)(char))isprint);
}
If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is copy/move likely more expensive, but it also introduces a new potential failure; the copy/move could throw an exception (e.g., allocation during copy could fail) whereas taking a reference to an existing value can't.
If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that extra copies actually causes a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism and insistence that you have to check your table of compiler support for RVO is mostly obsolete nowadays.
In short, C++11 doesn't really change anything in this regard except for people that didn't trust copy elision.
Short answer: NO! Long answer:
const ref&
.const ref&
obviously needs to stay within scope while the function that uses it executes)value
, don't copy the const ref&
inside your function body.There was a post on cpp-next.com called "Want speed, pass by value!". The TL;DR:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
TRANSLATION of ^
Don’t copy your function arguments --- means: if you plan to modify the argument value by copying it to an internal variable, just use a value argument instead.
So, don't do this:
std::string function(const std::string& aString){
auto vString(aString);
vString.clear();
return vString;
}
do this:
std::string function(std::string aString){
aString.clear();
return aString;
}
When you need to modify the argument value in your function body.
You just need to be aware how you plan to use the argument in the function body. Read-only or NOT... and if it sticks within scope.