What kind of C11 data type is an array according to the AMD64 ABI?

迷失自我 2020-12-04 03:27

I was researching the x86_64 calling convention that's used on OSX and was reading the section called "Aggregates and Unions" in the System V x86-64 ABI standard. It

1 answer
  • 2020-12-04 03:29

    Bare arrays as function args in C and C++ always decay to pointers, just like in several other contexts.

    Arrays inside structs or unions don't, and are passed by value. This is why ABIs need to care about how they're passed, even though it doesn't happen in C for bare arrays.
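
    A minimal sketch of the difference, using C11 `_Generic` on the address of each parameter to inspect its real type. The names `wrapper`, `bare_param_is_pointer`, `member_is_array`, `poke_bare` and `poke_wrapped` are made up for illustration:

```c
#include <assert.h>

struct wrapper { int a[8]; };  /* hypothetical name: a struct holding an array */

/* The whole array lives inside the struct, so copying the struct copies it. */
static_assert(sizeof(struct wrapper) >= sizeof(int[8]), "array is stored inline");

/* Declared with array syntax, but the parameter type is adjusted to int *.
   &arr therefore has type int **, not int (*)[8]. */
int bare_param_is_pointer(int arr[8]) {
    return _Generic(&arr, int **: 1, int (*)[8]: 0);
}

/* Inside a struct, the member keeps its array type. */
int member_is_array(struct wrapper w) {
    return _Generic(&w.a, int (*)[8]: 1, int **: 0);
}

/* Writes through the pointer, so the caller's array changes. */
void poke_bare(int arr[8]) { arr[0] = 42; }

/* Writes only to the callee's private copy and returns the new value. */
int poke_wrapped(struct wrapper w) { w.a[0] = 42; return w.a[0]; }
```

    Calling `poke_bare` on a caller's array changes the caller's element 0, while `poke_wrapped` leaves the caller's struct untouched, because the callee received its own copy.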


    As Keith Thompson points out, the relevant part of the C standard is N1570 section 6.7.6.3 paragraph 7:

    A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to type", where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation ... (stuff about foo[static 10], see below)

    Note that multidimensional arrays work as arrays of array type, so only the outer-most level of "array-ness" is converted to a pointer to array type.
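
    As a sketch of that rule: a parameter written as `int m[3][4]` is adjusted to `int (*m)[4]`, so the outer bound is dropped while each row remains a real `int[4]`. The function names here are hypothetical, and the row count is passed separately because it is not part of the parameter's type:

```c
#include <assert.h>
#include <stddef.h>

/* Written as int m[3][4], adjusted to int (*m)[4]:
   &m is then int (**)[4], not a pointer to a whole int[3][4]. */
int param_is_pointer_to_row(int m[3][4]) {
    return _Generic(&m, int (**)[4]: 1, int (*)[3][4]: 0);
}

/* The outer bound 3 is not enforced by the type, so the caller supplies
   the number of rows. The inner bound 4 is still part of the type. */
int sum_rows(int m[3][4], size_t rows) {
    static_assert(sizeof m[0] == 4 * sizeof(int), "each row is a full int[4]");
    int total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 4; c++)
            total += m[r][c];
    return total;
}
```

    A caller can pass an `int[5][4]` just as well as an `int[3][4]`; only the inner dimension has to match.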


    Terminology: The x86-64 ABI doc uses the same terminology as ARM, where structs and arrays are "aggregates" (multiple elements at sequential addresses). So the phrase "aggregates and unions" comes up a lot, because unions are handled similarly by the language and the ABI.

    It's the recursive rule for handling composite types (struct/union/class) that brings the array-passing rules in the ABI into play. This is the only way you'll see asm that copies an array to the stack as part of a function arg, for C or C++:

    struct s { int a[8]; };
    void ext(struct s byval);
    
    void foo() { struct s tmp = {{0}}; ext(tmp); }
    

    gcc6.1 compiles it (for the AMD64 SysV ABI, with -O3) to the following:

        sub     rsp, 40    # align the stack and leave room for `tmp` even though it's never stored?
        push    0
        push    0
        push    0
        push    0
        call    ext
        add     rsp, 72
        ret
    

    In the x86-64 ABI, pass-by-value happens by actual copying (into registers or the stack), not by hidden pointers.
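
    The four `push 0` instructions above copy 4 × 8 = 32 bytes, which is the size of `struct s`. A rough sketch of why that struct lands on the stack, under the simplifying assumption of integer-only members (vector types and C++ classes with non-trivial copy constructors follow other rules): the ABI classifies an aggregate in 8-byte chunks ("eightbytes"), and an aggregate larger than two of them is passed in memory. `struct small`, `struct big` and `copied_to_stack` are hypothetical names for illustration, not anything the ABI defines:

```c
#include <assert.h>
#include <stddef.h>

struct small { int a[4]; };  /* 16 bytes: fits in two eightbytes */
struct big   { int a[8]; };  /* same layout as `struct s` above: 32 bytes */

/* Simplified rule, assuming plain integer members only: anything larger
   than two eightbytes (16 bytes) is classified MEMORY and copied to the
   stack; smaller aggregates can travel in integer registers if enough
   of them are still free. */
int copied_to_stack(size_t size) { return size > 16; }

/* These sizes assume a 4-byte int and no padding, as on x86-64. */
static_assert(sizeof(struct small) == 16, "assumes 4-byte int, no padding");
static_assert(sizeof(struct big) == 32, "assumes 4-byte int, no padding");
```

    Under this rule `copied_to_stack(sizeof(struct big))` is true, matching the stack copy in the gcc output, while a `struct small` argument would normally be passed in two integer registers.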

    Note that return-by-value does pass a pointer as a "hidden" first arg (in rdi) when the return value is too large to fit in the 128-bit concatenation of rdx:rax (and isn't a vector being returned in vector regs, etc.).

    It would be possible for the ABI to use a hidden pointer to pass-by-value objects above a certain size, and trust the called function not to modify the original, but that's not what the x86-64 ABI chooses to do. That would be better in some cases (especially for inefficient C++ that makes lots of wasted copies without ever modifying them), but worse in other cases.

    SysV ABI bonus reading: As the x86 tag wiki points out, the current version of the ABI standard doesn't fully document the behaviour that compilers rely on: clang/gcc sign/zero extend narrow args to 32bit.


    Note that to promise the called function a minimum array size, C99 and later lets you use the static keyword in a new way: inside the brackets of an array parameter. The caller must then pass a pointer to at least that many elements. (It's still passed as a pointer, of course. This doesn't change the ABI.)

    void bar(int arr[static 10]);
    

    This does not make sizeof(arr) report the array size inside the called function: arr is still a pointer, so sizeof(arr) is sizeof(int *). What it does allow is compiler warnings about going out of bounds or passing an array that is too small. It also potentially enables better optimization, because the compiler knows it's allowed to access elements that the C source doesn't. (See this blog post).
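
    A small sketch of the `static` form in use. The function names are hypothetical; the `_Generic` check only confirms that the parameter is still a pointer despite the array syntax:

```c
#include <assert.h>

/* `static 10` promises that callers pass a pointer to at least 10 ints,
   so reading arr[0..9] is always valid here. */
int sum_first_10(const int arr[static 10]) {
    int total = 0;
    for (int i = 0; i < 10; i++)
        total += arr[i];
    return total;
}

/* The parameter is still adjusted to const int *, so &arr is const int **. */
int static_param_is_pointer(const int arr[static 10]) {
    return _Generic(&arr, const int **: 1, const int (*)[10]: 0);
}

/* Hypothetical self-check; call it from main() to run the example. */
void check_static_param(void) {
    int data[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200};
    assert(sum_first_10(data) == 55);  /* only the first 10 elements are read */
    assert(static_param_is_pointer(data));
}
```

    Passing a longer array is fine, since the bound is a minimum. Passing a shorter one is undefined behaviour, which some compilers can diagnose at the call site.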

    The same keyword page for C++ indicates that ISO C++ does not support this usage of static; it's another one of those C-only features, along with C99 variable-length-arrays and a few other goodies that C++ doesn't have.

    In C++, you can use std::array<int,10> to make compile-time size information available to the callee. However, you have to manually pass it by reference if that's what you want, since it's of course just a class containing an int arr[10]. Unlike a C-style array, it doesn't decay to T* automatically.


    The ARM doc that you linked doesn't seem to actually call arrays an aggregate type: Section 4.3 Composite Types (which discusses alignment) distinguishes arrays from aggregate types, even though they appear to be a special case of its definition for aggregates.

    A Composite Type is a collection of one or more Fundamental Data Types that are handled as a single entity at the procedure call level. A Composite Type can be any of:

    • An aggregate, where the members are laid out sequentially in memory
    • A union, where each of the members has the same address
    • An array, which is a repeated sequence of some other type (its base type).

    The definitions are recursive; that is, each of the types may contain a Composite Type as a member

    "Composite" is an umbrella term that includes arrays, structs, and unions.
