Regardless of how \'bad\' the code is, and assuming that alignment etc are not an issue on the compiler/platform, is this undefined or broken behavior?
If I have a s
It is illegal 1. That's an Undefined behavior in C++.
You are taking the members in an array fashion, but here is what the C++ standard says (emphasis mine):
[dcl.array/1]: ...An object of array type contains a contiguously allocated non-empty set of N subobjects of type T...
But, for members, there's no such contiguous requirement:
[class.mem/17]: ...;Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other...
While the above two quotes should be enough to hint why indexing into a struct
as you did isn't a defined behavior by the C++ standard, let's pick one example: look at the expression (&thing.a)[2]
- Regarding the subscript operator:
[expr.post//expr.sub/1]: A postfix expression followed by an expression in square brackets is a postfix expression. One of the expressions shall be a glvalue of type “array of T” or a prvalue of type “pointer to T” and the other shall be a prvalue of unscoped enumeration or integral type. The result is of type “T”. The type “T” shall be a completely-defined object type.66 The expression
E1[E2]
is identical (by definition) to((E1)+(E2))
Digging into the bold text of the above quote: regarding adding an integral type to a pointer type (note the emphasis here)..
[expr.add/4]: When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression
P
points to elementx[i]
of an array objectx
with n elements, the expressionsP + J
andJ + P
(whereJ
has the valuej
) point to the (possibly-hypothetical) elementx[i + j]
if0 ≤ i + j ≤ n
; otherwise, the behavior is undefined. ...
Note the array requirement for the if clause; else the otherwise in the above quote. The expression (&thing.a)[2]
obviously doesn't qualify for the if clause; Hence, Undefined Behavior.
On a side note: Though I have extensively experimented the code and its variations on various compilers and they don't introduce any padding here, (it works); from a maintenance view, the code is extremely fragile. you should still assert that the implementation allocated the members contiguously before doing this. And stay in-bounds :-). But its still Undefined behavior....
Some viable workarounds (with defined behavior) have been provided by other answers.
As rightly pointed out in the comments, [basic.lval/8], which was in my previous edit doesn't apply. Thanks @2501 and @M.M.
1: See @Barry's answer to this question for the only one legal case where you can access thing.a
member of the struct via this parttern.
Heres a way to use a proxy class to access elements in a member array by name. It is very C++, and has no benefit vs. ref-returning accessor functions, except for syntactic preference. This overloads the ->
operator to access elements as members, so to be acceptable, one needs to both dislike the syntax of accessors (d.a() = 5;
), as well as tolerate using ->
with a non-pointer object. I expect this might also confuse readers not familiar with the code, so this might be more of a neat trick than something you want to put into production.
The Data
struct in this code also includes overloads for the subscript operator, to access indexed elements inside its ar
array member, as well as begin
and end
functions, for iteration. Also, all of these are overloaded with non-const and const versions, which I felt needed to be included for completeness.
When Data
's ->
is used to access an element by name (like this: my_data->b = 5;
), a Proxy
object is returned. Then, because this Proxy
rvalue is not a pointer, its own ->
operator is auto-chain-called, which returns a pointer to itself. This way, the Proxy
object is instantiated and remains valid during evaluation of the initial expression.
Contruction of a Proxy
object populates its 3 reference members a
, b
and c
according to a pointer passed in the constructor, which is assumed to point to a buffer containing at least 3 values whose type is given as the template parameter T
. So instead of using named references which are members of the Data
class, this saves memory by populating the references at the point of access (but unfortunately, using ->
and not the .
operator).
In order to test how well the compiler's optimizer eliminates all of the indirection introduced by the use of Proxy
, the code below includes 2 versions of main()
. The #if 1
version uses the ->
and []
operators, and the #if 0
version performs the equivalent set of procedures, but only by directly accessing Data::ar
.
The Nci()
function generates runtime integer values for initializing array elements, which prevents the optimizer from just plugging constant values directly into each std::cout
<<
call.
For gcc 6.2, using -O3, both versions of main()
generate the same assembly (toggle between #if 1
and #if 0
before the first main()
to compare): https://godbolt.org/g/QqRWZb
#include <iostream>
#include <ctime>
template <typename T>
class Proxy {
public:
T &a, &b, &c;
Proxy(T* par) : a(par[0]), b(par[1]), c(par[2]) {}
Proxy* operator -> () { return this; }
};
struct Data {
int ar[3];
template <typename I> int& operator [] (I idx) { return ar[idx]; }
template <typename I> const int& operator [] (I idx) const { return ar[idx]; }
Proxy<int> operator -> () { return Proxy<int>(ar); }
Proxy<const int> operator -> () const { return Proxy<const int>(ar); }
int* begin() { return ar; }
const int* begin() const { return ar; }
int* end() { return ar + sizeof(ar)/sizeof(int); }
const int* end() const { return ar + sizeof(ar)/sizeof(int); }
};
// Nci returns an unpredictible int
inline int Nci() {
static auto t = std::time(nullptr) / 100 * 100;
return static_cast<int>(t++ % 1000);
}
#if 1
int main() {
Data d = {Nci(), Nci(), Nci()};
for(auto v : d) { std::cout << v << ' '; }
std::cout << "\n";
std::cout << d->b << "\n";
d->b = -5;
std::cout << d[1] << "\n";
std::cout << "\n";
const Data cd = {Nci(), Nci(), Nci()};
for(auto v : cd) { std::cout << v << ' '; }
std::cout << "\n";
std::cout << cd->c << "\n";
//cd->c = -5; // error: assignment of read-only location
std::cout << cd[2] << "\n";
}
#else
int main() {
Data d = {Nci(), Nci(), Nci()};
for(auto v : d.ar) { std::cout << v << ' '; }
std::cout << "\n";
std::cout << d.ar[1] << "\n";
d->b = -5;
std::cout << d.ar[1] << "\n";
std::cout << "\n";
const Data cd = {Nci(), Nci(), Nci()};
for(auto v : cd.ar) { std::cout << v << ' '; }
std::cout << "\n";
std::cout << cd.ar[2] << "\n";
//cd.ar[2] = -5;
std::cout << cd.ar[2] << "\n";
}
#endif
In ISO C99/C11, union-based type-punning is legal, so you can use that instead of indexing pointers to non-arrays (see various other answers).
ISO C++ doesn't allow union-based type-punning. GNU C++ does, as an extension, and I think some other compilers that don't support GNU extensions in general do support union type-punning. But that doesn't help you write strictly portable code.
With current versions of gcc and clang, writing a C++ member function using a switch(idx)
to select a member will optimize away for compile-time constant indices, but will produce terrible branchy asm for runtime indices. There's nothing inherently wrong with switch()
for this; this is simply a missed-optimization bug in current compilers. They could compiler Slava' switch() function efficiently.
The solution/workaround to this is to do it the other way: give your class/struct an array member, and write accessor functions to attach names to specific elements.
struct array_data
{
int arr[3];
int &operator[]( unsigned idx ) {
// assert(idx <= 2);
//idx = (idx > 2) ? 2 : idx;
return arr[idx];
}
int &a(){ return arr[0]; } // TODO: const versions
int &b(){ return arr[1]; }
int &c(){ return arr[2]; }
};
We can have a look at the asm output for different use-cases, on the Godbolt compiler explorer. These are complete x86-64 System V functions, with the trailing RET instruction omitted to better show what you'd get when they inline. ARM/MIPS/whatever would be similar.
# asm from g++6.2 -O3
int getb(array_data &d) { return d.b(); }
mov eax, DWORD PTR [rdi+4]
void setc(array_data &d, int val) { d.c() = val; }
mov DWORD PTR [rdi+8], esi
int getidx(array_data &d, int idx) { return d[idx]; }
mov esi, esi # zero-extend to 64-bit
mov eax, DWORD PTR [rdi+rsi*4]
By comparison, @Slava's answer using a switch()
for C++ makes asm like this for a runtime-variable index. (Code in the previous Godbolt link).
int cpp(data *d, int idx) {
return (*d)[idx];
}
# gcc6.2 -O3, using `default: __builtin_unreachable()` to promise the compiler that idx=0..2,
# avoiding an extra cmov for idx=min(idx,2), or an extra branch to a throw, or whatever
cmp esi, 1
je .L6
cmp esi, 2
je .L7
mov eax, DWORD PTR [rdi]
ret
.L6:
mov eax, DWORD PTR [rdi+4]
ret
.L7:
mov eax, DWORD PTR [rdi+8]
ret
This is obviously terrible, compared to the C (or GNU C++) union-based type punning version:
c(type_t*, int):
movsx rsi, esi # sign-extend this time, since I didn't change idx to unsigned here
mov eax, DWORD PTR [rdi+rsi*4]
This is undefined behavior.
There are lots of rules in C++ that attempt to give the compiler some hope of understanding what you are doing, so it can reason about it and optimize it.
There are rules about aliasing (accessing data through two different pointer types), array bounds, etc.
When you have a variable x
, the fact that it isn't a member of an array means that the compiler can assume that no []
based array access can modify it. So it doesn't have to constantly reload the data from memory every time you use it; only if someone could have modified it from its name.
Thus (&thing.a)[1]
can be assumed by the compiler to not refer to thing.b
. It can use this fact to reorder reads and writes to thing.b
, invalidating what you want it to do without invalidating what you actually told it to do.
A classic example of this is casting away const.
const int x = 7;
std::cout << x << '\n';
auto ptr = (int*)&x;
*ptr = 2;
std::cout << *ptr << "!=" << x << '\n';
std::cout << ptr << "==" << &x << '\n';
here you typically get a compiler saying 7 then 2 != 7, and then two identical pointers; despite the fact that ptr
is pointing at x
. The compiler takes the fact that x
is a constant value to not bother reading it when you ask for the value of x
.
But when you take the address of x
, you force it to exist. You then cast away const, and modify it. So the actual location in memory where x
is has been modified, the compiler is free to not actually read it when reading x
!
The compiler may get smart enough to figure out how to even avoid following ptr
to read *ptr
, but often they are not. Feel free to go and use ptr = ptr+argc-1
or somesuch confusion if the optimizer is getting smarter than you.
You can provide a custom operator[]
that gets the right item.
int& operator[](std::size_t);
int const& operator[](std::size_t) const;
having both is useful.