Let's say I want to implement a function that processes an object and returns a new (possibly changed) object. I would like to do this as efficiently as possible in C++.
I like to measure, so I set up this Object:
#include <iostream>
#include <utility>
struct Object
{
Object() {}
Object(const Object&) {std::cout << "Object(const Object&)\n";}
Object(Object&&) {std::cout << "Object(Object&&)\n";}
Object& makeChanges() {return *this;}
};
I theorized that some solutions may give different answers for xvalues and prvalues (both of which are rvalues), so I decided to test both of them (in addition to lvalues):
Object source() {return Object();}
int main()
{
std::cout << "process lvalue:\n\n";
Object x;
Object t = process(x);
std::cout << "\nprocess xvalue:\n\n";
Object u = process(std::move(x));
std::cout << "\nprocess prvalue:\n\n";
Object v = process(source());
}
Now it is a simple matter of trying all of your possibilities, plus those contributed by others, plus one I threw in myself:
#if PROCESS == 1
Object
process(Object arg)
{
return arg.makeChanges();
}
#elif PROCESS == 2
Object
process(const Object& arg)
{
return Object(arg).makeChanges();
}
Object
process(Object&& arg)
{
return std::move(arg.makeChanges());
}
#elif PROCESS == 3
Object
process(const Object& arg)
{
Object retObj = arg;
retObj.makeChanges();
return retObj;
}
Object
process(Object&& arg)
{
return std::move(arg.makeChanges());
}
#elif PROCESS == 4
Object
process(Object arg)
{
return std::move(arg.makeChanges());
}
#elif PROCESS == 5
Object
process(Object arg)
{
arg.makeChanges();
return arg;
}
#endif
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |    legend: copies/moves
+----+--------+--------+---------+
| p1 |  2/0   |  1/1   |  1/0    |
+----+--------+--------+---------+
| p2 |  2/0   |  0/1   |  0/1    |
+----+--------+--------+---------+
| p3 |  1/0   |  0/1   |  0/1    |
+----+--------+--------+---------+
| p4 |  1/1   |  0/2   |  0/1    |
+----+--------+--------+---------+
| p5 |  1/1   |  0/2   |  0/1    |
+----+--------+--------+---------+
process3 looks like the best solution to me. However, it does require two overloads: one to process lvalues and one to process rvalues. If for some reason this is problematic, solutions 4 and 5 do the job with only one overload at the cost of 1 extra move construction for glvalues (lvalues and xvalues). It is a judgement call as to whether one wants to pay an extra move construction to save overloading (and there is no one right answer).
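To make the lvalue trade-off concrete, here is a minimal sketch that counts constructions instead of printing them. The names Counted, Tally, process3, process5, and the two lvalue_tally helpers are hypothetical and exist only for this illustration. The counts in the comments assume the compiler's default copy elision, as in the measurements above.

```cpp
#include <utility>

// Hypothetical counting variant of Object: tallies constructions.
struct Counted
{
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
    Counted& makeChanges() { return *this; }
    static void reset() { copies = 0; moves = 0; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// p3 style: one overload for lvalues, one for rvalues.
Counted process3(const Counted& arg)
{
    Counted retObj = arg;   // the one unavoidable copy
    retObj.makeChanges();
    return retObj;          // plain local name, so NRVO may apply
}
Counted process3(Counted&& arg)
{
    return std::move(arg.makeChanges());
}

// p5 style: a single by-value overload.
Counted process5(Counted arg)
{
    arg.makeChanges();
    return arg;             // implicit move out of the parameter
}

struct Tally { int copies; int moves; };

// Constructions needed to process an lvalue with the p3 style.
Tally lvalue_tally_p3()
{
    Counted x;
    Counted::reset();
    Counted t = process3(x);
    (void)t;
    return Tally{Counted::copies, Counted::moves};
}

// Constructions needed to process an lvalue with the p5 style.
Tally lvalue_tally_p5()
{
    Counted x;
    Counted::reset();
    Counted t = process5(x);
    (void)t;
    return Tally{Counted::copies, Counted::moves};
}
```

With elision enabled, lvalue_tally_p3() should report one copy and no moves, while lvalue_tally_p5() should report one copy and one move. That extra move is the price of the single overload.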
(answered) Why does RVO kick in the last option and not the second?
For RVO to kick in, the return statement needs to look like:
return arg;
If you complicate that with:
return std::move(arg);
or:
return arg.makeChanges();
then RVO gets inhibited.
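The three return forms can be compared side by side on a local variable. This is only a sketch: Traced, by_name, by_move, by_call, and count_for are hypothetical names, and the behaviour noted in the comments assumes the compiler's default copy elision.

```cpp
#include <utility>

// Hypothetical type that counts its copy and move constructions.
struct Traced
{
    static int copies;
    static int moves;
    Traced() {}
    Traced(const Traced&) { ++copies; }
    Traced(Traced&&) { ++moves; }
    Traced& makeChanges() { return *this; }
};
int Traced::copies = 0;
int Traced::moves = 0;

// Plain name of a local: eligible for NRVO.
Traced by_name()
{
    Traced local;
    return local;
}

// std::move on the return: no longer a plain name, so the
// move constructor has to run.
Traced by_move()
{
    Traced local;
    return std::move(local);
}

// Returning through makeChanges(): the expression is an lvalue
// reference, so the return value is copy constructed.
Traced by_call()
{
    Traced local;
    return local.makeChanges();
}

struct Counts { int copies; int moves; };

// Runs one of the functions above and reports what it cost.
Counts count_for(Traced (*fn)())
{
    Traced::copies = 0;
    Traced::moves = 0;
    Traced t = fn();
    (void)t;
    return Counts{Traced::copies, Traced::moves};
}
```

Calling count_for(by_name) typically reports no constructions at all, count_for(by_move) reports one move, and count_for(by_call) reports one copy.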
Is there a better way to do this?
My favorites are p3 and p5. My preference of p5 over p4 is merely stylistic. I shy away from putting move on the return statement when I know it will be applied automatically, for fear of accidentally inhibiting RVO. However, in p5 RVO is not an option anyway (arg is a by-value function parameter, and parameters are not eligible for RVO), even though the return statement does get an implicit move. So p5 and p4 really are equivalent. Pick your style.
Had we passed in a temporary, 2nd and 3rd options would call a move constructor while returning. Is it possible to eliminate that using (N)RVO?
The "prvalue" column vs "xvalue" column addresses this question. Some solutions add an extra move construction for xvalues and some don't.
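Where the extra xvalue move comes from can be seen by counting moves for a by-value parameter (as in p4/p5) against an rvalue-reference parameter (as in the rvalue overloads of p2/p3). The sketch below uses hypothetical names (Probe, by_value, by_rref, and the moves_* helpers) and assumes default copy elision.

```cpp
#include <utility>

// Hypothetical type that counts move constructions.
struct Probe
{
    static int moves;
    Probe() {}
    Probe(const Probe&) {}
    Probe(Probe&&) { ++moves; }
    Probe& makeChanges() { return *this; }
};
int Probe::moves = 0;

Probe source() { return Probe(); }

// By-value parameter: an xvalue must be moved into arg, whereas a
// prvalue can be constructed directly in arg.
Probe by_value(Probe arg)
{
    arg.makeChanges();
    return arg;   // one implicit move out in either case
}

// Rvalue-reference parameter: binding costs nothing, so only the
// move into the return value remains.
Probe by_rref(Probe&& arg)
{
    return std::move(arg.makeChanges());
}

int moves_by_value_xvalue()
{
    Probe x;
    Probe::moves = 0;
    Probe u = by_value(std::move(x));
    (void)u;
    return Probe::moves;
}

int moves_by_value_prvalue()
{
    Probe::moves = 0;
    Probe v = by_value(source());
    (void)v;
    return Probe::moves;
}

int moves_by_rref_xvalue()
{
    Probe x;
    Probe::moves = 0;
    Probe u = by_rref(std::move(x));
    (void)u;
    return Probe::moves;
}

int moves_by_rref_prvalue()
{
    Probe::moves = 0;
    Probe v = by_rref(source());
    (void)v;
    return Probe::moves;
}
```

With elision enabled, the by-value form should cost two moves for an xvalue and one for a prvalue, while the rvalue-reference form should cost one move in both cases, matching the xvalue and prvalue columns of the table.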