I have a function which modifies std::string& lvalue references in-place, returning a reference to the input parameter:
std::string& tra
This leads me to believe a copy would happen from the std::string& return value of the lvalue reference version of transform(...) into the std::string return value.
Is that correct?
The reference-returning version never copies the std::string, but the value-returning version will copy it unless the compiler applies RVO. RVO has its limitations, however, so C++11 added rvalue references, move constructors, move assignment, and std::move to help in such cases. And yes, RVO is more efficient than move semantics: a move is cheaper than a copy but more expensive than RVO.
Is it better to keep my std::string&& transform(...) version?
This is somewhat interesting and strange. As Potatoswatter answered,
std::string transform(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
You should call std::move manually.
For more detail, see the developerWorks article "RVO V.S. std::move", which explains your problem clearly.
If your question is purely about optimization, it's best not to worry about how to pass or return an argument. The compiler is smart enough to turn your code into pure reference passing, copy elision, function inlining, or even move semantics, whichever is fastest.
Basically, move semantics can benefit you in some esoteric cases. Let's say I have a Matrix object that holds a double** as a member variable, and this pointer points to a two-dimensional array of double. Now let's say I have this expression:
Matrix a = b+c;
The copy constructor (this is an initialization, so the assignment operator is not involved) will get the sum of b and c as a temporary, take it by const reference, allocate m*n doubles for a's inner pointer, and then walk the b+c sum array and copy its values one by one.
A quick calculation shows that this can take up to O(nm) steps (which generalizes to O(n^2) for square matrices). Move semantics will only re-wire that hidden double** out of the temporary into a's inner pointer, which takes O(1).
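The difference can be sketched roughly as follows. This Matrix is hypothetical: the answer describes a double** member, while this sketch uses a single flat buffer to keep ownership simple, and the member names and the at()/data() accessors are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical Matrix, not the answer's actual class.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(new double[rows * cols]()) {}

    // Copy: allocates a new buffer and copies every element, O(rows * cols).
    Matrix(const Matrix& other)
        : rows_(other.rows_), cols_(other.cols_),
          data_(new double[other.rows_ * other.cols_]) {
        std::copy(other.data_, other.data_ + rows_ * cols_, data_);
    }

    // Move: takes over the pointer and leaves the source empty, O(1).
    Matrix(Matrix&& other) noexcept
        : rows_(other.rows_), cols_(other.cols_), data_(other.data_) {
        other.rows_ = 0;
        other.cols_ = 0;
        other.data_ = nullptr;
    }

    // Copy-and-swap assignment: the by-value parameter is built by copy or move.
    Matrix& operator=(Matrix other) {
        std::swap(rows_, other.rows_);
        std::swap(cols_, other.cols_);
        std::swap(data_, other.data_);
        return *this;
    }

    ~Matrix() { delete[] data_; }

    double& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }
    const double* data() const { return data_; }

    // Element-wise sum; assumes both operands have the same dimensions.
    friend Matrix operator+(const Matrix& lhs, const Matrix& rhs) {
        Matrix sum(lhs.rows_, lhs.cols_);
        for (std::size_t i = 0; i < lhs.rows_ * lhs.cols_; ++i)
            sum.data_[i] = lhs.data_[i] + rhs.data_[i];
        return sum;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    double* data_;
};
```

With this in place, `Matrix a = b + c;` initializes a from the temporary without the element-by-element copy (by elision or by the move constructor), and `Matrix d = std::move(a);` hands a's buffer to d.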
Now let's think about std::string for a moment: passing it as a reference takes O(1) steps (take the memory address, pass it, dereference it, etc.; none of this is linear in the string's length).
Moving from it via an rvalue reference requires the program to pass it as a reference, re-wire the hidden underlying char* that holds the inner buffer, null the original one (or swap the two), copy size and capacity, and more. We can see that although we're still in the O(1) zone, there can actually be MORE steps than simply passing it as a regular reference.
Well, the truth is that I haven't benchmarked it, and the discussion here is purely theoretical. Nevertheless, my first paragraph still holds. We assume many things as developers, but unless we benchmark everything to death, the compiler simply knows better than us 99% of the time.
Taking this argument into account, I'd say keep it as pass-by-reference rather than move semantics, since it's backward compatible and much better understood by developers who haven't mastered C++11 yet.
There's no right answer, but returning by value is safer.
I have read several questions on SO relating to returning rvalue references, and have come to the conclusion that this is bad practice.
Returning a reference to a parameter foists a contract upon the caller that either
If the caller passes a temporary and tries to save the result, they get a dangling reference.
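A minimal sketch of that hazard, under some assumptions: transform here is a stand-in that toggles the first character (as in the benchmark further down), and transform_ref, transform_val, make, safe_use, also_safe and demo are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for the question's transform: toggles the first character.
std::string& transform(std::string& input) {
    if (!input.empty()) input[0] = (input[0] == 'A') ? 'B' : 'A';
    return input;
}

// Returns a reference to its own parameter: the caller must consume the
// result before the argument temporary is destroyed.
std::string&& transform_ref(std::string&& input) {
    return std::move(transform(input));
}

// Returns by value: the result owns its own buffer.
std::string transform_val(std::string&& input) {
    return std::move(transform(input));
}

std::string make() { return "ABC"; }

std::string safe_use() {
    // Fine: the temporary lives until the end of the full expression,
    // and s is move-constructed from it before that point.
    std::string s = transform_ref(make());
    return s;
}

std::string also_safe() {
    // Fine: r binds to the returned prvalue, whose lifetime is extended.
    const std::string& r = transform_val(make());
    return r;
}

// Dangling: r would refer to the temporary from make(), which is destroyed
// at the end of the declaration. Any later use is undefined behavior.
//   const std::string& r = transform_ref(make());

void demo() {
    assert(safe_use() == "BBC");
    assert(also_safe() == "BBC");
}
```

The by-value version costs one extra move construction, but saving its result in a reference stays valid.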
From what I have read, it seems the consensus is that since return values are rvalues, plus taking into account the RVO, just returning by value would be as efficient:
Returning by value adds a move-construction operation. The cost of this is usually proportional to the size of the object. Whereas returning by reference only requires the machine to ensure that one address is in a register, returning by value requires zeroing a couple pointers in the parameter std::string and putting their values in a new std::string to be returned.
It's cheap, but nonzero.
The direction currently taken by the standard library is, somewhat surprisingly, to be fast and unsafe and return the reference. (The only function I know that actually does this is std::get from <tuple>.) As it happens, I've presented a proposal to the C++ core language committee toward the resolution of this issue, a revision is in the works, and just today I've started investigating implementation. But it's complicated, and not a sure thing.
std::string transform(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
The compiler won't generate a move here. If input weren't a reference at all, and you did return input; it would, but it has no reason to believe that transform will return input just because it was a parameter, and it won't deduce ownership from rvalue reference type anyway. (See C++14 §12.8/31-32.)
You need to do:
return std::move( transform( input ) );
or equivalently
transform( input );
return std::move( input );
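One way to see the difference is to count constructor calls. The sketch below is an illustration only: Tracked, plain_return, moved_return, Counts, count_ops and demo are hypothetical names, and Tracked stands in for std::string so that copies and moves can be observed.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Lvalue-reference version: works in place and returns its argument.
Tracked& transform(Tracked& input) { return input; }

// Returns the lvalue directly: the return value is copy-constructed.
Tracked plain_return(Tracked&& input) { return transform(input); }

// Casts to an rvalue first: the return value is move-constructed.
Tracked moved_return(Tracked&& input) { return std::move(transform(input)); }

struct Counts {
    int copies;
    int moves;
};

// Resets the counters, calls f on a temporary, and reports what happened.
template <typename F>
Counts count_ops(F f) {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = f(Tracked{});
    (void)result;
    return {Tracked::copies, Tracked::moves};
}

void demo() {
    assert(count_ops(plain_return).copies == 1);  // copy, no move from the reference
    assert(count_ops(moved_return).copies == 0);  // no copy at all
    assert(count_ops(moved_return).moves >= 1);   // at least the move into the return value
}
```

The exact number of moves can vary with the language standard and elision settings, which is why the sketch only checks a lower bound for them.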
Some (non-representative) runtimes for the above versions of transform:
run on coliru
#include <iostream>
#include <string> // std::string is used throughout
#include <time.h>
#include <sys/time.h>
#include <unistd.h>
using namespace std;
double GetTicks()
{
struct timeval tv;
if(!gettimeofday (&tv, NULL))
return (tv.tv_sec*1000 + tv.tv_usec/1000);
else
return -1;
}
std::string& transform(std::string& input)
{
// transform the input string
// e.g toggle first character
if(!input.empty())
{
if(input[0]=='A')
input[0] = 'B';
else
input[0] = 'A';
}
return input;
}
std::string&& transformA(std::string&& input)
{
return std::move(transform(input));
}
std::string transformB(std::string&& input)
{
return transform(input); // calls the lvalue reference version
}
std::string transformC(std::string&& input)
{
return std::move( transform( input ) ); // calls the lvalue reference version
}
string getSomeString()
{
return string("ABC");
}
int main()
{
const int MAX_LOOPS = 5000000;
{
double start = GetTicks();
for(int i=0; i<MAX_LOOPS; ++i)
string s = transformA(getSomeString());
double end = GetTicks();
cout << "\nRuntime transformA: " << end - start << " ms" << endl;
}
{
double start = GetTicks();
for(int i=0; i<MAX_LOOPS; ++i)
string s = transformB(getSomeString());
double end = GetTicks();
cout << "\nRuntime transformB: " << end - start << " ms" << endl;
}
{
double start = GetTicks();
for(int i=0; i<MAX_LOOPS; ++i)
string s = transformC(getSomeString());
double end = GetTicks();
cout << "\nRuntime transformC: " << end - start << " ms" << endl;
}
return 0;
}
output
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
Runtime transformA: 444 ms
Runtime transformB: 796 ms
Runtime transformC: 434 ms