GCC/Clang x86_64 C++ ABI mismatch when returning a tuple?

Submitted by 时光毁灭记忆、已成空白 on 2019-12-03 09:22:34

Question


When trying to optimize return values on x86_64, I noticed a strange thing. Namely, given the code:

#include <cstdint>
#include <tuple>
#include <utility>

using namespace std;

constexpr uint64_t a = 1u;
constexpr uint64_t b = 2u;

pair<uint64_t, uint64_t> f() { return {a, b}; }
tuple<uint64_t, uint64_t> g() { return tuple<uint64_t, uint64_t>{a, b}; }

Clang 3.8 outputs this assembly code for f:

movl $1, %eax
movl $2, %edx
retq

and this for g:

movl $2, %eax
movl $1, %edx
retq

which look optimal. However, when compiled with GCC 6.1, while the generated assembly for f is identical to what Clang output, the assembly generated for g is:

movq %rdi, %rax
movq $2, (%rdi)
movq $1, 8(%rdi)
ret

It looks like the type of the return value is classified as MEMORY by GCC but as INTEGER by Clang. I can confirm that linking Clang-compiled code with GCC-compiled code can result in segmentation faults (Clang calling a GCC-compiled g(), which writes to wherever %rdi happens to point) and in an invalid value being returned (GCC calling a Clang-compiled g()). Which compiler is at fault?

Related:

  • G++ and clang++ incompatibility with standard library when building shared libraries?
  • [cxx-abi-dev] Non-trivial move constructor

See also

  • System V Application Binary Interface. AMD64 Architecture Processor Supplement. Draft Version 0.99.5

Answer 1:


As davmac's answer shows, the libstdc++ std::tuple is trivially copy constructible, but not trivially move constructible. The two compilers disagree on whether the move constructor should affect the argument passing conventions.

The C++ ABI thread you linked to seems to explain that disagreement: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-February/002891.html

In summary, Clang implements exactly what the ABI spec says, whereas G++ implements what the spec was supposed to say but was never updated to actually say.
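To make the disagreement concrete, here is a minimal sketch of the two readings as type-trait predicates. The struct names and predicate names are hypothetical, and the second predicate is only an assumed approximation of the move-aware rule, not the compilers' actual implementation.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical aggregate: every special member function is implicit and trivial.
struct PlainWords {
    std::uint64_t first;
    std::uint64_t second;
};

// Hypothetical type with the property described above: the copy constructor
// is trivial, but the move constructor is user-provided (hence non-trivial).
struct UserMoveWords {
    std::uint64_t first;
    std::uint64_t second;
    UserMoveWords(std::uint64_t x, std::uint64_t y) : first(x), second(y) {}
    UserMoveWords(const UserMoveWords&) = default;
    UserMoveWords(UserMoveWords&& other) noexcept
        : first(other.first), second(other.second) {}
};

// Reading 1: only the copy constructor and the destructor are considered.
template <typename T>
constexpr bool copy_dtor_rule_allows_registers() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// Reading 2 (assumed approximation): a non-trivial move constructor also
// forces the object into memory.
template <typename T>
constexpr bool move_aware_rule_allows_registers() {
    return copy_dtor_rule_allows_registers<T>() &&
           std::is_trivially_move_constructible<T>::value;
}

static_assert(copy_dtor_rule_allows_registers<PlainWords>(),
              "plain aggregate passes reading 1");
static_assert(move_aware_rule_allows_registers<PlainWords>(),
              "plain aggregate passes reading 2");
static_assert(copy_dtor_rule_allows_registers<UserMoveWords>(),
              "user-provided move is ignored by reading 1");
static_assert(!move_aware_rule_allows_registers<UserMoveWords>(),
              "user-provided move is rejected by reading 2");
```

The two predicates agree on PlainWords and disagree on UserMoveWords, which is the same split the thread reports between pair and the libstdc++ tuple.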




Answer 2:


The ABI states that parameter values are classified according to a specific algorithm. Relevant here is:

  3. If the size of the aggregate exceeds a single eightbyte, each is classified separately. Each eightbyte gets initialized to class NO_CLASS.

  4. Each field of an object is classified recursively so that always two fields are considered. The resulting class is calculated according to the classes of the fields in the eightbyte:

In this case, each field (for either the tuple or the pair) is of type uint64_t and so occupies an entire "eightbyte". The "two fields" to be considered in each eightbyte, then, are the initial NO_CLASS (as per 3) and the uint64_t field, which is classified as INTEGER. Merging NO_CLASS with INTEGER yields INTEGER, so both eightbytes are classified as INTEGER.
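The merge step described above can be sketched as a small function. This is a simplified model, assuming only four of the ABI's classes (the x87 classes and SSEUP are left out), and the names ArgClass and merge are hypothetical.

```cpp
// Simplified model: only the classes relevant to this discussion.
enum class ArgClass { NoClass, Integer, Sse, Memory };

// Combine the classes of two fields that share an eightbyte:
// equal classes stay as they are, NO_CLASS yields the other class,
// MEMORY dominates, then INTEGER dominates, otherwise the result is SSE.
constexpr ArgClass merge(ArgClass a, ArgClass b) {
    return a == b                 ? a
         : a == ArgClass::NoClass ? b
         : b == ArgClass::NoClass ? a
         : (a == ArgClass::Memory || b == ArgClass::Memory)   ? ArgClass::Memory
         : (a == ArgClass::Integer || b == ArgClass::Integer) ? ArgClass::Integer
                                                              : ArgClass::Sse;
}

// Each eightbyte of pair<uint64_t, uint64_t> or tuple<uint64_t, uint64_t>
// starts as NO_CLASS and is merged with a single INTEGER field.
constexpr ArgClass kEightbyteClass = merge(ArgClass::NoClass, ArgClass::Integer);
static_assert(kEightbyteClass == ArgClass::Integer,
              "a uint64_t eightbyte classifies as INTEGER");
```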

There is also, related to parameter passing:

If a C++ object has either a non-trivial copy constructor or a non-trivial destructor, it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER)

An object with a non-trivial copy constructor or destructor must have an address, and therefore needs to be in memory, which is why the above requirement exists. The same is true for return values, though this seems to have been omitted from the specification (probably by accident).

Finally, there is:

(c) If the size of the aggregate exceeds two eightbytes and the first eight-byte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.

That doesn't apply here, obviously; the size of the aggregate is exactly two eightbytes.
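The size condition is easy to check at compile time. The following is a minimal sketch; the helper names eightbyte_count and exceeds_two_eightbytes are hypothetical, and the expected count of 2 for the tuple assumes a standard library that adds no padding around the two elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>

using U64Pair = std::pair<std::uint64_t, std::uint64_t>;
using U64Tuple = std::tuple<std::uint64_t, std::uint64_t>;

// An eightbyte is 8 bytes on x86_64.
constexpr std::size_t kEightbyteSize = 8;

// Number of eightbytes a type occupies, rounded up.
template <typename T>
constexpr std::size_t eightbyte_count() {
    return (sizeof(T) + kEightbyteSize - 1) / kEightbyteSize;
}

// True when rule (c) quoted above could come into play.
template <typename T>
constexpr bool exceeds_two_eightbytes() {
    return eightbyte_count<T>() > 2;
}
```

Evaluating exceeds_two_eightbytes<U64Pair>() and exceeds_two_eightbytes<U64Tuple>() should both give false under those assumptions, while a type such as std::uint64_t[3] gives true.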

On returning of values, the text says:

  1. Classify the return type with the classification algorithm

Which means, as per above, that the tuple should be classified as INTEGER. Then:

  3. If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

This is quite clear.
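As a rough illustration of that rule, the sketch below hands out return registers for a list of eightbyte classes. It models only the INTEGER case quoted above; RetClass and return_registers are hypothetical names, and an empty result simply means "not covered by this sketch".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified: only the two classes that matter for this question.
enum class RetClass { Integer, Memory };

// Assign each INTEGER eightbyte the next register of the sequence %rax, %rdx.
// Returns an empty vector for anything this sketch does not model.
std::vector<std::string> return_registers(const std::vector<RetClass>& eightbytes) {
    static const char* const kIntegerRegs[] = {"%rax", "%rdx"};
    std::vector<std::string> assigned;
    std::size_t next = 0;
    for (RetClass c : eightbytes) {
        if (c != RetClass::Integer || next >= 2) {
            return {};  // MEMORY, or more eightbytes than registers
        }
        assigned.push_back(kIntegerRegs[next++]);
    }
    return assigned;
}
```

For two INTEGER eightbytes this yields %rax followed by %rdx, matching the Clang output in the question. That output shows 2 in %eax and 1 in %edx for g, which is consistent with libstdc++ storing the tuple's elements in reverse order; that is an observation about the library layout, not something the quoted spec text states.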

The only still-open question is whether the types are non-trivially-copy-constructible/destructible. As mentioned above, values of such type cannot be passed or returned in registers, even though the specification does not seem to recognize the problem for return values. However, we can easily show that the tuple and pair are both trivially-copy-constructible and trivially-destructible, using the following program:

Test program:

#include <utility>
#include <cstdint>
#include <tuple>
#include <iostream>

using namespace std;

int main(int argc, char **argv)
{
    cout << "pair is trivial? : " << is_trivial<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_copy_constructible? : " << is_trivially_copy_constructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is standard_layout? : " << is_standard_layout<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is pod? : " << is_pod<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_destructible? : " << is_trivially_destructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_move_constructible? : " << is_trivially_move_constructible<pair<uint64_t, uint64_t> >::value << endl;

    cout << "tuple is trivial? : " << is_trivial<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_copy_constructible? : " << is_trivially_copy_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is standard_layout? : " << is_standard_layout<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is pod? : " << is_pod<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_destructible? : " << is_trivially_destructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_move_constructible? : " << is_trivially_move_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    return 0;
}

Output when compiled with GCC or Clang:

pair is trivial? : 0
pair is trivially_copy_constructible? : 1
pair is standard_layout? : 1
pair is pod? : 0
pair is trivially_destructible? : 1
pair is trivially_move_constructible? : 1
tuple is trivial? : 0
tuple is trivially_copy_constructible? : 1
tuple is standard_layout? : 0
tuple is pod? : 0
tuple is trivially_destructible? : 1
tuple is trivially_move_constructible? : 0

This implies that GCC is getting it wrong. The return value should be passed in %rax,%rdx.

(The main noticeable differences between the types are that pair is standard layout and trivially move-constructible whereas tuple is neither, so it's possible that GCC always returns non-trivially-move-constructible values via a pointer, for example.)



Source: https://stackoverflow.com/questions/37457443/gcc-clang-x86-64-c-abi-mismatch-when-returning-a-tuple
