What I understand is that this shouldn't be done, but I believe I've seen examples that do something like this (note code is not necessarily syntactically correct but the
I also agree with sftrabbit: the local object's lifetime does end and its stack area is released, but the compiler makes sure the data is handed back to the caller first, in registers or some other way.
A simple example for confirmation is given below (assembly taken from the MinGW compiler).
_func:
push ebp
mov ebp, esp
sub esp, 16
mov eax, DWORD PTR [ebp+8]
mov DWORD PTR [ebp-8], eax
mov eax, DWORD PTR [ebp+12]
mov DWORD PTR [ebp-4], eax
mov eax, DWORD PTR [ebp-8]
mov edx, DWORD PTR [ebp-4]
leave
ret
You can see that the value of b is returned through edx, while eax, the default return register, holds the value of a.
It's perfectly safe.
You're returning by value. What would lead to undefined behavior is if you were returning by reference.
//safe
mystruct func(int c, int d){
    mystruct retval;
    retval.a = c;
    retval.b = d;
    return retval;
}
//undefined behavior
mystruct& func(int c, int d){
    mystruct retval;
    retval.a = c;
    retval.b = d;
    return retval;
}
The behavior of your snippet is perfectly valid and defined. It doesn't vary by compiler. It's ok!
Personally I always either return a pointer to a malloc'ed struct
You shouldn't. You should avoid dynamically allocated memory when possible.
or just do a pass by reference to the function and modify the values there.
This option is perfectly valid. It's a matter of choice. In general, you do this if you want to return something else from the function, while modifying the original struct.
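As a small illustration of the pass-by-reference option, here is a sketch in which the function reports success through its return value and writes the result through a reference parameter. The mystruct layout and the set_if_valid and demo_set_if_valid names are assumptions made up for this example, not something from the question.

```cpp
#include <cassert>

// Assumed layout: two int members, matching how the struct is used above.
struct mystruct {
    int a;
    int b;
};

// Hypothetical helper: writes into `target` only when both inputs are
// non-negative, and uses the return value to report whether it did so.
bool set_if_valid(mystruct& target, int c, int d) {
    if (c < 0 || d < 0) {
        return false;  // target is left untouched
    }
    target.a = c;
    target.b = d;
    return true;
}

// Usage: the caller owns the struct, so nothing is returned by address.
void demo_set_if_valid() {
    mystruct s{0, 0};
    bool ok = set_if_valid(s, 3, 4);
    assert(ok && s.a == 3 && s.b == 4);
    ok = set_if_valid(s, -1, 9);
    assert(!ok && s.a == 3 && s.b == 4);
}
```

The return value is free to carry a status here, which is the situation where the reference parameter tends to pay off.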
Because my understanding is that once the scope of the function is over, whatever stack was used to allocate the structure can be overwritten
This is wrong. I mean, it's sort of correct, but you return a copy of the structure you create inside the function. Theoretically. In practice, RVO can and probably will occur. Read up on return value optimization. This means that although retval appears to go out of scope when the function ends, it might actually be built in the calling context, to prevent the extra copy. This is an optimization the compiler is free to implement.
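One way to see whether the copy really happens is to count copy constructions. The sketch below assumes a made-up Counted struct and helper names (make_counted, copies_made_by_return); the count it reports depends on the compiler, the language standard and the flags used, so no particular number is promised.

```cpp
#include <cassert>

// Hypothetical instrumented struct: counts how often it is copy-constructed.
struct Counted {
    int a;
    int b;
    static int copies;
    Counted() : a(0), b(0) {}
    Counted(const Counted& other) : a(other.a), b(other.b) { ++copies; }
};
int Counted::copies = 0;

Counted make_counted(int c, int d) {
    Counted retval;
    retval.a = c;
    retval.b = d;
    return retval;  // candidate for (named) return value optimization
}

// Returns how many copy constructions the call actually performed.
int copies_made_by_return() {
    Counted::copies = 0;
    Counted result = make_counted(1, 2);
    assert(result.a == 1 && result.b == 2);  // the data always arrives intact
    return Counted::copies;
}
```

When the optimization is applied the function reports zero copies; with every elision disabled it can report up to two (one into the return value, one into result). Either way the caller sees the right values.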
Note: this answer only applies to C++11 onward. There is no such thing as "C/C++"; they are different languages.
No, there is no danger in returning a local object by value, and it is recommended to do so. However, I think there is an important point that is missing from all answers here. Many others have said that the struct is being either copied or directly placed using RVO. However, this is not completely correct. I will try to explain exactly which things can happen when returning a local object.
Since C++11, we have had rvalue references, which can bind to temporary objects whose contents can safely be stolen. As an example, std::vector has a move constructor as well as a move assignment operator. The move constructor has constant complexity and simply copies the pointer to the data of the vector being moved from; move assignment transfers the buffer in the same way with the default allocator, after releasing whatever the target held before. I won't go into more detail about move semantics here.
Because a local object named in a return statement is about to go out of scope, the compiler treats it as an rvalue there, so from C++11 onward a returned local object is not copied as long as its type has a usable move constructor. The move constructor is called on the object being returned (or not, as explained later). This means that if you were to return an object with an expensive copy constructor but an inexpensive move constructor, like a big vector, only the ownership of the data is transferred from the local object to the returned object, which is cheap.
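A rough way to check the "only ownership is transferred" claim is to compare the address of the vector's heap buffer inside the function with the address the caller ends up with. This is only a sketch: make_big_vector and buffer_was_reused are hypothetical names, and the check relies on the behaviour of mainstream standard library implementations with the default allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a large vector and records, through
// `buffer_inside`, where its heap buffer lived inside the function.
std::vector<int> make_big_vector(std::size_t n, const int*& buffer_inside) {
    std::vector<int> v(n, 42);
    buffer_inside = v.data();
    return v;  // elided or moved; the elements themselves are not copied
}

// True if the caller's vector ended up owning the very same heap buffer.
bool buffer_was_reused(std::size_t n) {
    const int* inside = nullptr;
    std::vector<int> result = make_big_vector(n, inside);
    assert(result.size() == n);
    return result.data() == inside;
}
```

If the elements had been deep-copied, the caller's vector would own a different buffer and the comparison would fail.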
Note that in your specific example, there is no difference between copying and moving the object. The default move and copy constructors of your struct result in the same operations: copying two integers. However, this is at least as fast as any other solution because the whole struct fits in a 64-bit CPU register (correct me if I'm wrong, I don't know much about CPU registers).
RVO means Return Value Optimization and is one of the very few optimizations that compilers are allowed to do even though it can change observable behavior (constructor and destructor calls with side effects are skipped). Since C++17, RVO is required. When returning an unnamed object, it is constructed directly in place where the caller stores the returned value. Neither the copy constructor nor the move constructor is called. Without RVO, the unnamed object would first be constructed locally, then move-constructed into the return slot, and then the local unnamed object would be destructed.
Example where RVO is required (C++17) or likely (before C++17):
auto function(int a, int b) -> MyStruct {
    // ...
    return MyStruct{a, b};
}
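To observe this, the sketch below gives MyStruct counting copy and move constructors. The member layout, the counter and constructions_after_rvo are assumptions added for illustration. In C++17 and later the count must be zero; with older standards it is zero only because compilers perform the elision in practice, so treat that case as likely rather than guaranteed.

```cpp
#include <cassert>

// Assumed definition of MyStruct, instrumented to count copies and moves.
struct MyStruct {
    int a;
    int b;
    static int copies_or_moves;
    MyStruct(int a_, int b_) : a(a_), b(b_) {}
    MyStruct(const MyStruct& o) : a(o.a), b(o.b) { ++copies_or_moves; }
    MyStruct(MyStruct&& o) noexcept : a(o.a), b(o.b) { ++copies_or_moves; }
};
int MyStruct::copies_or_moves = 0;

auto function(int a, int b) -> MyStruct {
    return MyStruct{a, b};  // unnamed object: built directly in the caller's storage
}

// Returns the number of copy/move constructions caused by the call.
int constructions_after_rvo() {
    MyStruct::copies_or_moves = 0;
    MyStruct s = function(3, 4);
    assert(s.a == 3 && s.b == 4);
    return MyStruct::copies_or_moves;
}
```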
NRVO means Named Return Value Optimization and is the same thing as RVO except that it is done for a named object local to the called function. This is still not guaranteed by the standard (C++20), but many compilers do it anyway. Note that even when NRVO is not applied, a named local object is at worst moved when returned.
The only case where you should consider not returning by value is when you have a named, very large (as in its stack size) object. This is because NRVO is not yet guaranteed (as of C++20) and even moving the object would be slow. My recommendation, and the recommendation in the C++ Core Guidelines, is to always prefer returning objects by value (for multiple return values, use a struct or tuple), where the only exception is when the object is expensive to move. In that case, use a non-const reference parameter.
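The two styles can be sketched side by side. Everything named here (DivResult, divide, BigBuffer, fill_buffer, demo_return_styles) is hypothetical and only meant to illustrate the recommendation; the 4096-element size is an arbitrary stand-in for "large inline storage".

```cpp
#include <array>
#include <cassert>

// Multiple return values: bundle them in a small struct and return by value.
struct DivResult {
    int quotient;
    int remainder;
};

DivResult divide(int numerator, int denominator) {
    return DivResult{numerator / denominator, numerator % denominator};
}

// Expensive to move: std::array keeps its elements inline, so moving it
// still copies every element. A non-const reference parameter avoids that.
using BigBuffer = std::array<double, 4096>;

void fill_buffer(BigBuffer& out, double value) {
    out.fill(value);
}

void demo_return_styles() {
    DivResult r = divide(7, 2);
    assert(r.quotient == 3 && r.remainder == 1);

    BigBuffer buffer;
    fill_buffer(buffer, 1.5);
    assert(buffer.front() == 1.5 && buffer.back() == 1.5);
}
```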
It is NEVER a good idea to return a resource that has to be manually released from a function in C++. Never do that. At least use a std::unique_ptr, or make your own non-local or local struct with a destructor that releases its resource (RAII) and return an instance of that. It would then also be a good idea to define the move constructor and move assignment operator if the resource does not have its own move semantics (and delete the copy constructor and copy assignment operator).
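If the struct really has to live on the heap, a minimal sketch of the std::unique_ptr route could look like this. The mystruct layout and the function names are assumptions; plain new is used so the example also compiles as C++11, whereas from C++14 onward std::make_unique would be the usual choice.

```cpp
#include <cassert>
#include <memory>

// Assumed layout: two int members, as in the question's struct.
struct mystruct {
    int a;
    int b;
};

// Hypothetical factory: the caller receives ownership and never calls delete.
std::unique_ptr<mystruct> make_on_heap(int c, int d) {
    std::unique_ptr<mystruct> result(new mystruct{c, d});
    return result;  // ownership moves to the caller
}

void demo_make_on_heap() {
    std::unique_ptr<mystruct> p = make_on_heap(5, 6);
    assert(p != nullptr && p->a == 5 && p->b == 6);
}  // p's destructor deletes the struct here
```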
It's perfectly safe, and it's not wrong to do so. Also: it does not vary by compiler.
Usually, when your struct is not too big (as in your example), I would argue that this approach is even better than returning a malloc'ed structure (malloc is an expensive operation).
Not only is it safe to return a struct in C (or a class in C++, where structs are actually classes whose members are public: by default), but a lot of software does that.
Of course, when returning a class in C++, the language specifies that some destructor or move constructor would be called, but there are many cases where the compiler can optimize this away.
In addition, the Linux x86-64 ABI specifies that a struct with two scalar values (e.g. pointers or long) is returned through registers (%rax and %rdx), so it is very fast and efficient. So for that particular case it is probably faster to return such a two-scalar-field struct than to do anything else (e.g. storing the values through a pointer passed as an argument).
Returning such a two-scalar-field struct is then a lot faster than malloc-ing it and returning a pointer.
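A two-scalar struct of the kind described might look like the sketch below. SmallPair, ordered and demo_ordered are hypothetical names. The static_asserts only check properties that make the struct a plausible candidate for register return (trivially copyable, at most 16 bytes); whether %rax and %rdx are actually used depends on the target ABI and can only be confirmed by reading the generated assembly.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical two-scalar struct.
struct SmallPair {
    long smaller;
    long larger;
};

static_assert(std::is_trivially_copyable<SmallPair>::value,
              "SmallPair should be trivially copyable");
static_assert(sizeof(SmallPair) <= 16,
              "SmallPair should fit in two 8-byte registers");

// Returns both values by value, ordered; no allocation is involved.
SmallPair ordered(long x, long y) {
    return (x < y) ? SmallPair{x, y} : SmallPair{y, x};
}

void demo_ordered() {
    SmallPair p = ordered(9, 2);
    assert(p.smaller == 2 && p.larger == 9);
}
```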
Let's add a second part to the question: Does this vary by compiler?
Indeed it does, as I discovered to my pain: http://sourceforge.net/p/mingw-w64/mailman/message/33176880/
I was using gcc on win32 (MinGW) to call COM interfaces that returned structs. Turns out that MS does it differently to GNU and so my (gcc) program crashed with a smashed stack.
It could be that MS might have the higher ground here - but all I care about is ABI compatibility between MS and GNU for building on Windows.
If it does, then what is the behavior for the latest versions of compilers for desktops: gcc, g++ and Visual Studio
You can find some messages on a Wine mailing list about how MS seems to do it.