Is it legal to index into a struct?

悲哀的现实 2020-11-30 01:12

Regardless of how 'bad' the code is, and assuming that alignment etc. are not an issue on the compiler/platform, is this undefined or broken behavior?

If I have a s

10 Answers
  • 2020-11-30 01:53

    If reading values is enough, and either efficiency is not a concern, you trust your compiler to optimize the copy away, or the struct is just those 3 bytes, you can safely do this:

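    #include <assert.h>
    #include <stddef.h>  /* offsetof, size_t */
    #include <string.h>  /* memcpy */

    /* Assumes struct data is defined elsewhere with c as its last member. */
    char index_data(const struct data *d, size_t index) {
      assert(sizeof(*d) == offsetof(struct data, c) + 1);  /* no trailing padding */
      assert(index < sizeof(*d));
      char buf[sizeof(*d)];
      memcpy(buf, d, sizeof(*d));  /* copy the object representation */
      return buf[index];
    }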
    

    For a C++-only version, you would probably want to use static_assert to verify that struct data has standard layout, and perhaps throw an exception on an invalid index instead.
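
    A minimal sketch of what such a C++ version might look like. The three-char layout of data is an assumption (the question's struct is not shown here), and the choice of std::out_of_range as the exception type is mine:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>

// Assumed layout: three chars. The original struct definition is not shown.
struct data {
    char a, b, c;
};

static_assert(std::is_standard_layout<data>::value,
              "data must have standard layout");
static_assert(offsetof(data, b) == 1 && offsetof(data, c) == 2,
              "members must be contiguous");
static_assert(sizeof(data) == offsetof(data, c) + 1,
              "data must not have trailing padding");

// Returns the byte at position `index` of the object representation.
char index_data(const data& d, std::size_t index) {
    if (index >= sizeof(data)) {
        throw std::out_of_range("index_data: bad index");
    }
    char buf[sizeof(data)];
    std::memcpy(buf, &d, sizeof(data));  // copy the bytes, then index the copy
    return buf[index];
}

void example_usage() {
    data d{'x', 'y', 'z'};
    assert(index_data(d, 1) == 'y');
}
```

    The layout checks move to compile time, so only the index check is left at run time.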

  • 2020-11-30 01:58

    For C++: if you need to access a member without knowing its name, you can use a pointer to a member variable.

    struct data {
      int a, b, c;
    };
    
    typedef int data::* data_int_ptr;
    
    data_int_ptr arr[] = {&data::a, &data::b, &data::c};
    
    data thing;
    thing.*arr[0] = 123;
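
    The table of member pointers can be wrapped in a bounds-checked accessor. This is only a sketch: member_at is a hypothetical name, and throwing std::out_of_range is my assumption rather than part of the answer:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct data {
    int a, b, c;
};

using data_int_ptr = int data::*;

// Position in this table is the "index" of the member.
constexpr data_int_ptr members[] = {&data::a, &data::b, &data::c};

// Hypothetical helper: returns a reference to the member selected by `index`.
int& member_at(data& d, std::size_t index) {
    constexpr std::size_t count = sizeof(members) / sizeof(members[0]);
    if (index >= count) {
        throw std::out_of_range("member_at: bad index");
    }
    return d.*members[index];
}

void example_usage() {
    data thing{1, 2, 3};
    member_at(thing, 1) = 42;
    assert(thing.b == 42);
}
```

    No pointer arithmetic across members is involved, so this does not depend on the struct's layout or padding.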
    
  • 2020-11-30 01:58

    It is illegal, but there is a workaround:

    struct data {
        union {
            struct {
                int a;
                int b;
                int c;
            };
            int v[3];
        };
    };
    

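    Now you can index v: assuming no padding between the int members, v[0], v[1], and v[2] occupy the same storage as a, b, and c. Note that anonymous structs are standard in C11 but only a compiler extension in C++, and reading a union member other than the one last written is sanctioned in C but not by the C++ standard.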

  • 2020-11-30 02:00

    In C++, this is mostly undefined behavior (it depends on which index).

    From [expr.unary.op]:

    For purposes of pointer arithmetic (5.7) and comparison (5.9, 5.10), an object that is not an array element whose address is taken in this way is considered to belong to an array with one element of type T.

    The expression &thing.a is thus considered to refer to an array of one int.

    From [expr.sub]:

    The expression E1[E2] is identical (by definition) to *((E1)+(E2))

    And from [expr.add]:

    When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 <= i + j <= n; otherwise, the behavior is undefined.

    (&thing.a)[0] is perfectly well-defined because &thing.a is considered to point into an array of size 1 and we're taking its first element. That is an allowed index to take.

    (&thing.a)[2] violates the precondition that 0 <= i + j <= n, since we have i == 0, j == 2, n == 1. Simply constructing the pointer &thing.a + 2 is undefined behavior.

    (&thing.a)[1] is the interesting case. It doesn't actually violate anything in [expr.add]. We're allowed to take a pointer one past the end of the array - which this would be. Here, we turn to a note in [basic.compound]:

    A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory (1.7) occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively. [ Note: A pointer past the end of an object (5.7) is not considered to point to an unrelated object of the object’s type that might be located at that address.

    Hence, taking the pointer &thing.a + 1 is defined behavior, but dereferencing it is undefined because it does not point to anything.
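
    The three cases can be annotated in code. This is only an illustrative sketch (count_reachable is a hypothetical name); the undefined operations are left as comments so that the function itself stays well-defined:

```cpp
#include <cassert>

struct data {
    int a, b, c;
};

// Zeroes every int reachable from &thing.a by defined pointer arithmetic
// and returns how many there were.
int count_reachable(data& thing) {
    int* first = &thing.a;    // behaves like a pointer into an array of one int
    int* past = first + 1;    // OK: a one-past-the-end pointer may be formed
    // int* bad = first + 2;  // undefined: beyond one past the end
    // int x = *past;         // undefined: dereferences the past-the-end pointer
    int n = 0;
    for (int* p = first; p != past; ++p) {  // only [first, past) may be accessed
        *p = 0;
        ++n;
    }
    return n;
}

void example_usage() {
    data thing{5, 6, 7};
    assert(count_reachable(thing) == 1);  // only thing.a is reachable
    assert(thing.a == 0 && thing.b == 6 && thing.c == 7);
}
```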

  • 2020-11-30 02:01

    No. In C, this is undefined behavior even if there is no padding.

    The thing that causes undefined behavior is out-of-bounds access (note 1). When you have a scalar (members a, b, c in the struct) and try to use it as an array (note 2) to access the next hypothetical element, you cause undefined behavior, even if there happens to be another object of the same type at that address.

    However, you may use the address of the struct object and calculate the offset of a specific member:

    struct data thing = { 0 };
    char* p = ( char* )&thing + offsetof( struct data , b );  /* offsetof takes the type, not the object */
    int* b = ( int* )p;
    *b = 123;
    assert( thing.b == 123 );
    

    This has to be done for each member individually, but can be put into a function that resembles an array access.
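
    One possible shape for such a function, written here in C++ as a sketch. member_ptr and member_offsets are hypothetical names, and the bounds check is my addition. The answer makes its claim for C; I am assuming mainstream C++ compilers treat the same offset arithmetic on a standard-layout struct the same way:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct data {
    int a, b, c;
};

// Byte offset of each member, in index order.
constexpr std::size_t member_offsets[] = {
    offsetof(data, a), offsetof(data, b), offsetof(data, c)};

// Hypothetical helper: returns a pointer to the member selected by `index`.
int* member_ptr(data* d, std::size_t index) {
    constexpr std::size_t count =
        sizeof(member_offsets) / sizeof(member_offsets[0]);
    if (index >= count) {
        throw std::out_of_range("member_ptr: bad index");
    }
    char* base = reinterpret_cast<char*>(d);  // start from the struct's address
    return reinterpret_cast<int*>(base + member_offsets[index]);
}

void example_usage() {
    data thing = {0, 0, 0};
    *member_ptr(&thing, 1) = 123;
    assert(thing.b == 123);
}
```

    Each offset is computed from the struct's own address, so the code never steps from one member into the next.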


    Note 1 (quoted from ISO/IEC 9899:201x 6.5.6 Additive operators, paragraph 8):
    If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

    Note 2 (quoted from ISO/IEC 9899:201x 6.5.6 Additive operators, paragraph 7):
    For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

  • 2020-11-30 02:05

    In C++, if you really need it, create an operator[]:

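    #include <cstddef>    // size_t
    #include <stdexcept>  // std::runtime_error

    struct data
    {
        int a, b, c;
        // Maps an index to the corresponding member; throws on a bad index.
        int &operator[]( size_t idx ) {
            switch( idx ) {
                case 0 : return a;
                case 1 : return b;
                case 2 : return c;
                default: throw std::runtime_error( "bad index" );
            }
        }
    };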
    
    
    data d;
    d[0] = 123; // assign 123 to d.a
    

    Not only is it guaranteed to work, but usage is also simpler: you do not need to write the unreadable expression (&thing.a)[0].

    Note: this answer assumes that you already have a structure with fields and need to add access by index. If speed is an issue and you can change the structure, this could be more efficient:

    struct data 
    {
         int array[3];
         int &a = array[0];
         int &b = array[1];
         int &c = array[2];
    };
    

    This solution increases the size of the structure, since each reference member typically needs storage, so you can use methods instead:

    struct data 
    {
         int array[3];
         int &a() { return array[0]; }
         int &b() { return array[1]; }
         int &c() { return array[2]; }
    };
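
    The two array-backed variants can be compared side by side. This is only a sketch: the names data_refs and data_methods are mine, and the sizeof check holds on typical implementations rather than being guaranteed by the standard. Besides the size cost, reference members also change copy behavior:

```cpp
#include <cassert>
#include <type_traits>

// Variant with reference members (renamed here for comparison).
struct data_refs {
    int array[3];
    int& a = array[0];
    int& b = array[1];
    int& c = array[2];
};

// Variant with accessor methods.
struct data_methods {
    int array[3];
    int& a() { return array[0]; }
    int& b() { return array[1]; }
    int& c() { return array[2]; }
};

// Accessors add no storage (true on typical implementations).
static_assert(sizeof(data_methods) == sizeof(int[3]),
              "accessor methods should not change the size");
// Reference members delete the implicit copy assignment operator.
static_assert(!std::is_copy_assignable<data_refs>::value,
              "reference members prevent copy assignment");

void example_usage() {
    data_refs x{};
    x.a = 7;
    assert(x.array[0] == 7);

    // The implicit copy constructor copies the reference binding itself,
    // so y.a still refers to x's array, not to y's.
    data_refs y = x;
    assert(&y.a == &x.array[0]);

    data_methods m{};
    m.b() = 5;
    assert(m.array[1] == 5);
}
```

    The accessor version stays trivially copyable and assignable, at the price of writing m.b() instead of m.b.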
    