Let's say I want to implement a function that processes an object and returns a new (possibly changed) object. I would like to do this as efficiently as possible in C++.
The fastest way to do this is: if the argument is an lvalue, copy it and return that copy; if it is an rvalue, move it. The return value can always be moved or have RVO/NRVO applied. This is easily accomplished:
Object process1(Object arg) {
    return std::move(arg.makeChanges());
}
This is very similar to the canonical C++11 forms of many kinds of operator overloads.
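For readers who have not seen that canonical form, here is a minimal sketch of it. The type `Text` and the helper `concat_demo` are hypothetical names invented for illustration; they are not from the question. The left operand is taken by value, so it is copied from lvalues and moved from rvalues, which is the same idea as `process1`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical value type, used only for illustration.
struct Text {
    std::string data;

    // Compound assignment modifies the object in place.
    Text& operator+=(const Text& rhs) {
        data += rhs.data;
        return *this;
    }
};

// Left operand by value: copy-constructed from an lvalue argument,
// move-constructed from an rvalue argument.
Text operator+(Text lhs, const Text& rhs) {
    lhs += rhs;
    return lhs;  // a by-value parameter is implicitly moved on return
}

// Small demonstration of both call forms.
std::string concat_demo() {
    Text a{"foo"};
    Text b{"bar"};
    Text c = a + b;             // lvalue left operand: a is copied
    Text d = std::move(c) + b;  // rvalue left operand: c is moved from
    assert(a.data == "foo");    // a was copied, so it is unchanged
    return d.data;
}
```

Calling `concat_demo()` returns the concatenation built by the second expression; `a` is untouched because it was only ever copied.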
None of the functions you show will have any significant return value optimizations on their return values.
makeChanges returns an Object&. Therefore, it must be copied into a value, since you're returning it. So the first two will always make a copy of the value to be returned. In terms of the number of copies, the first one makes two copies (one for the parameter, one for the return value). The second one also makes two copies (one explicitly in the function, one for the return value).
The third one shouldn't even compile, since you can't implicitly convert an l-value reference into an r-value reference.
So really, don't do this. If you want to pass an object and modify it in place, then just do this:
Object &process1(Object &arg) { return arg.makeChanges(); }
This modifies the provided object. No copying or anything. Granted, one might wonder why process1 isn't a member function or something, but that doesn't matter.
I like to measure, so I set up this Object:
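#include <iostream>
#include <utility>  // std::move

struct Object
{
    Object() {}
    // Each constructor announces itself so copies and moves can be counted.
    Object(const Object&) {std::cout << "Object(const Object&)\n";}
    Object(Object&&) {std::cout << "Object(Object&&)\n";}
    Object& makeChanges() {return *this;}
};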
And I theorized that some solutions may give different answers for xvalues and prvalues (both of which are rvalues). And so I decided to test both of them (in addition to lvalues):
Object source() {return Object();}

int main()
{
    std::cout << "process lvalue:\n\n";
    Object x;
    Object t = process(x);
    std::cout << "\nprocess xvalue:\n\n";
    Object u = process(std::move(x));
    std::cout << "\nprocess prvalue:\n\n";
    Object v = process(source());
}
Now it is a simple matter of trying all of your possibilities and those contributed by others; I also threw one in myself:
#if PROCESS == 1

Object
process(Object arg)
{
    return arg.makeChanges();
}

#elif PROCESS == 2

Object
process(const Object& arg)
{
    return Object(arg).makeChanges();
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 3

Object
process(const Object& arg)
{
    Object retObj = arg;
    retObj.makeChanges();
    return retObj;
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 4

Object
process(Object arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 5

Object
process(Object arg)
{
    arg.makeChanges();
    return arg;
}

#endif
The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:
+----+--------+--------+---------+
| | lvalue | xvalue | prvalue | legend: copies/moves
+----+--------+--------+---------+
| p1 | 2/0 | 1/1 | 1/0 |
+----+--------+--------+---------+
| p2 | 2/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
| p3 | 1/0 | 0/1 | 0/1 |
+----+--------+--------+---------+
| p4 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
| p5 | 1/1 | 0/2 | 0/1 |
+----+--------+--------+---------+
process3 looks like the best solution to me. However, it does require two overloads: one to process lvalues and one to process rvalues. If for some reason this is problematic, solutions 4 and 5 do the job with only one overload at the cost of 1 extra move construction for glvalues (lvalues and xvalues). It is a judgement call as to whether one wants to pay an extra move construction to save overloading (and there is no one right answer).
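The trade-off between the two-overload form and the single by-value form can also be checked with counters instead of printed output. The following is only a sketch: `Counted` is a hypothetical stand-in for Object, and `process3` / `process5` are illustrative names mirroring p3 and p5. Exact move counts depend on how much copy elision the compiler performs, so the checks only assert what holds with or without elision.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for Object that counts instead of printing.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
    Counted& makeChanges() { return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

struct Tally { int copies; int moves; };

// p3 style: one overload for lvalues, one for rvalues.
Counted process3(const Counted& arg) {
    Counted ret = arg;   // the one unavoidable copy
    ret.makeChanges();
    return ret;          // named local: eligible for NRVO
}
Counted process3(Counted&& arg) {
    return std::move(arg.makeChanges());
}

// p5 style: a single overload taking the argument by value.
Counted process5(Counted arg) {
    arg.makeChanges();
    return arg;          // parameter: implicitly moved, never elided
}

void reset() { Counted::copies = 0; Counted::moves = 0; }
Tally tally() { return Tally{Counted::copies, Counted::moves}; }

Tally p3_lvalue() { Counted x; reset(); Counted t = process3(x); (void)t; return tally(); }
Tally p5_lvalue() { Counted x; reset(); Counted t = process5(x); (void)t; return tally(); }
Tally p3_xvalue() { Counted x; reset(); Counted t = process3(std::move(x)); (void)t; return tally(); }
Tally p5_xvalue() { Counted x; reset(); Counted t = process5(std::move(x)); (void)t; return tally(); }

// Checks that do not depend on optional copy elision.
void check_tradeoff() {
    assert(p3_lvalue().copies == 1);                 // both forms copy an lvalue once
    assert(p5_lvalue().copies == 1);
    assert(p5_lvalue().moves >= 1);                  // by-value form pays a move on return
    assert(p5_xvalue().moves > p3_xvalue().moves);   // and an extra move for xvalues
}
```

Calling `check_tradeoff()` exercises the lvalue and xvalue cases; the prvalue case is left out because its count depends on eliding the temporary into the parameter.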
(answered) Why does RVO kick in for the last option and not the second?
For RVO to kick in, the return statement needs to look like:
return arg;
If you complicate that with:
return std::move(arg);
or:
return arg.makeChanges();
then RVO gets inhibited.
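A small sketch of the three return forms applied to a local variable, where NRVO is at least possible. `Counted`, `Tally`, `measure`, and the three function names are hypothetical helpers introduced only for this illustration. Since NRVO is optional, the plain form may show zero moves or fall back to a move; the other two forms cannot be elided. Note that some compilers warn about the `std::move` form precisely because it is a pessimization.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type; the counters exist only for illustration.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
    Counted& makeChanges() { return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Returns the plain name of a local: NRVO is permitted,
// and if it is not applied the local is moved, never copied.
Counted plain_return() {
    Counted local;
    local.makeChanges();
    return local;
}

// std::move turns the operand into a function-call expression,
// so elision is not permitted and a move construction is required.
Counted moved_return() {
    Counted local;
    return std::move(local);
}

// makeChanges() yields an lvalue reference, so the result must be copied.
Counted reference_return() {
    Counted local;
    return local.makeChanges();
}

struct Tally { int copies; int moves; };

// Runs f once and reports how many copies and moves it caused.
template <class F>
Tally measure(F f) {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted result = f();
    (void)result;
    return Tally{Counted::copies, Counted::moves};
}

// Facts that hold with or without optional elision.
void check_guarantees() {
    assert(measure(plain_return).copies == 0);
    assert(measure(moved_return).moves >= 1);
    assert(measure(reference_return).copies == 1);
}
```

For example, `measure(plain_return)` typically reports no moves on an optimizing compiler, while `measure(moved_return)` always reports at least one.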
Is there a better way to do this?
My favorites are p3 and p5. My preference of p5 over p4 is merely stylistic. I shy away from putting move on the return statement when I know it will be applied automatically, for fear of accidentally inhibiting RVO. However, in p5 RVO is not an option anyway, even though the return statement does get an implicit move. So p5 and p4 really are equivalent. Pick your style.
Had we passed in a temporary, the 2nd and 3rd options would call a move constructor while returning. Is it possible to eliminate that using (N)RVO?
The "prvalue" column vs. the "xvalue" column addresses this question. Some solutions add an extra move construction for xvalues and some don't.