Have a look at this simple example:
struct Base { /* some virtual functions here */ };
struct A: Base { /* members, overridden virtual functions */ };
struct B: Ba
If A and B are a verbatim copy of each other (except for their names) and are declared in the same context (same namespace, same #defines, no __LINE__ usage), then common C++ compilers (gcc, clang) will produce two binary representations which are fully interchangeable.
If A and B use the same method signatures but the bodies of corresponding methods differ, it is unsafe to cast A* to B*, because the optimization pass in the compiler could, for example, partially inline the body of void B::method() at the call site b->method(), while the programmer's assumption could be that b->method() will call A::method(). Therefore, as soon as the programmer uses an optimizing compiler, the behavior of accessing A through type B* becomes undefined.
Problem: All compilers are always, at least to some extent, "optimizing" the source code passed to them, even at -O0. In cases of behavior not mandated by the C++ standard (that is: undefined behavior), the compiler's implicit assumptions, even when all optimizations are turned off, might differ from the programmer's assumptions. Those implicit assumptions were made by the developers of the compiler.
Conclusion: If the programmer is able to avoid using an optimizing compiler, then it is safe to access A via B*. The only issue such a programmer needs to tackle is that non-optimizing compilers do not exist.
A managed C++ implementation might abort the program when A* is cast to B* via reinterpret_cast, when b->field is accessed, or when b->method() is called. Some other managed C++ implementation might try harder to avoid a program crash, and so it will resort to temporary duck typing when it sees the program accessing A via B*.
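For comparison, standard C++ already offers a checked conversion between sibling classes. The following is only a minimal sketch: the class bodies and the helper names checked_as_b and call_through_base are made up for illustration and are not taken from the question. It shows dynamic_cast rejecting an A that is viewed as a B, and the well-defined alternative of calling through the common base.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes discussed above.
struct Base {
    virtual ~Base() = default;
    virtual int method() const { return 0; }
};
struct A : Base { int method() const override { return 1; } };
struct B : Base { int method() const override { return 2; } };

// Returns a B* only if p really points to a B (or to a class derived
// from B); otherwise returns nullptr instead of a pointer that must
// not be used.
inline B* checked_as_b(Base* p) {
    return dynamic_cast<B*>(p);
}

// Virtual dispatch through the common base is always well defined.
inline int call_through_base(const Base& obj) {
    return obj.method();
}

// Small self-check of the two helpers (illustrative only).
inline void demo() {
    A a;
    B b;
    assert(checked_as_b(&a) == nullptr);  // an A is not a B
    assert(checked_as_b(&b) == &b);       // a B is a B
    assert(call_through_base(a) == 1);    // A::method
    assert(call_through_base(b) == 2);    // B::method
}
```

Calling demo() from main exercises both helpers. Unlike reinterpret_cast, dynamic_cast consults the dynamic type of the object, so the mismatch is reported as a null pointer rather than left as undefined behavior.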
Some questions are:
Yes, it does have undefined behavior. The layout of the Base subobject within A and B is not specified, so x may not be a real Base object.
static_cast (or an implicit derived-to-base pointer conversion, which does exactly the same thing) is substantially different from reinterpret_cast. There is no guarantee that the base subobject starts at the same address as the complete object.
Most implementations place the first base subobject at the same address as the complete object, but of course even such implementations cannot place two different non-empty base subobjects at the same address. (An object with virtual functions is not empty.) When the base subobject is not at the same address as the complete object, static_cast is not a no-op; it involves pointer adjustment.
There are implementations that never place even the first base subobject at the same address as the complete object. It is allowed to place the base subobject after all members of the derived class, for example. IIRC the Sun C++ compiler used to lay out classes this way (I don't know if it still does). On such an implementation, this code is almost guaranteed to fail.
Similar code with B having more than one base will fail on many implementations. Example.
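To make the pointer adjustment visible, here is a small sketch with two non-empty bases. The names Base1, Base2, Derived and the two helper functions are hypothetical and chosen only for this illustration. Since the two base subobjects are distinct, non-empty objects, they cannot share an address, so at least one of the derived-to-base conversions has to change the pointer value.

```cpp
#include <cassert>

// Hypothetical class hierarchy with two non-empty, polymorphic bases.
struct Base1 { virtual ~Base1() = default; int x = 1; };
struct Base2 { virtual ~Base2() = default; int y = 2; };
struct Derived : Base1, Base2 { int z = 3; };

// True when the two base subobjects of d live at different addresses.
inline bool bases_are_distinct(Derived& d) {
    Base1* b1 = &d;  // implicit derived-to-base conversion
    Base2* b2 = &d;  // same kind of conversion, may adjust the address
    return static_cast<void*>(b1) != static_cast<void*>(b2);
}

// static_cast back to Derived* undoes whatever adjustment was applied.
inline bool round_trip_ok(Derived& d) {
    Base2* b2 = &d;
    return static_cast<Derived*>(b2) == &d && b2->y == 2;
}

// Small self-check (illustrative only).
inline void demo() {
    Derived d;
    assert(bases_are_distinct(d));
    assert(round_trip_ok(d));
}
```

A reinterpret_cast from Derived* to Base2* would skip that adjustment and, on implementations that place Base1 first, yield a pointer that does not point at the Base2 subobject.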
The reinterpret_cast is valid (the result can be dereferenced) if the two classes are layout-compatible, which among other things requires both of them to be standard-layout types.
But the classes do not have standard layout, because one of the requirements of StandardLayoutType is that the class has no virtual functions or virtual base classes.
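The standard-layout requirement can be checked at compile time with the standard type traits. This is a minimal sketch: the class bodies below, and the extra Plain struct, are assumptions made for illustration rather than the code from the question.

```cpp
#include <type_traits>

// Hypothetical versions of the classes, each with a virtual function.
struct Base { virtual ~Base() = default; virtual void method() {} };
struct A : Base { int field = 0; void method() override {} };
struct B : Base { int field = 0; void method() override {} };

// A struct without virtual functions, for contrast.
struct Plain { int field; };

// Classes with virtual functions are polymorphic and therefore
// never standard-layout.
static_assert(std::is_polymorphic<A>::value, "A has virtual functions");
static_assert(!std::is_standard_layout<A>::value, "A is not standard-layout");
static_assert(!std::is_standard_layout<B>::value, "B is not standard-layout");
static_assert(std::is_standard_layout<Plain>::value, "Plain is standard-layout");
```

If any of these assertions did not hold, the translation unit would fail to compile, so the check costs nothing at run time.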
Regarding the validity of pointers derived from conversions, the standard has this to say in the section on "Safely-derived pointers":
6.7.4.3 Safely-derived pointers
4. An implementation may have relaxed pointer safety, in which case the validity of a pointer value does not depend on whether it is a safely-derived pointer value. Alternatively, an implementation may have strict pointer safety, in which case a pointer value referring to an object with dynamic storage duration that is not a safely-derived pointer value is an invalid pointer value unless the referenced complete object has previously been declared reachable. [ Note: The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined, see 6.7.4.2. This is true even if the unsafely-derived pointer value might compare equal to some safely-derived pointer value. —end note ] It is implementation-defined whether an implementation has relaxed or strict pointer safety.