Please consider the following code.
enum type {CONS, ATOM, FUNC, LAMBDA};
typedef struct{
enum type type;
} object;
typedef struct {
enum type type;
Using separate structs violates the strict aliasing rule and is undefined behaviour: http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
Using an embedded struct as in Dean's last example is fine.
To add to Dean's answer, here's something about pointer conversions in general. I forgot what the term is for this, but a pointer to pointer cast performs no conversion (in the same way int to float is). It is simply a reinterpretation of the bits that they point to (all for the compiler's benefit). "Non-destructive conversion" I think it was. The data doesn't change, only how the compiler interprets what is being pointed at.
e.g.,
If ptr
is a pointer to an object
, the compiler knows that there is a field with a particular offset named type
of type enum type
. On the other hand if ptr
is cast to a pointer to a different type, cons_object
, again it will know how to access fields of the cons_object
each with their own offsets in a similar fashion.
To illustrate imagine the memory layout for a cons_object
:
+---+---+---+---+
cons_object *ptr -> | t | y | p | e | enum type
+---+---+---+---+
| c | a | r | | object *
+---+---+---+---+
| c | d | r | | object *
+---+---+---+---+
The type
field has offset 0, car
is 4, cdr
is 8. To access the car field, all the compiler needs to do is add 4
to the pointer to the structure.
If the pointer was cast to a pointer to an object
:
+---+---+---+---+
((object *)ptr) -> | t | y | p | e | enum type
+---+---+---+---+
| c | a | r | |
+---+---+---+---+
| c | d | r | |
+---+---+---+---+
All the compiler needs to know is that there is a field called type
with offset 0. Whatever is in memory is in memory.
The pointers don't even have to be related whatsoever. You can have a pointer to an int
and cast it to a pointer to an cons_object
. If you were to access the car
field, it's just like any ordinary memory access. It has a certain offset from the beginning of the structure. In this case, what is in that memory location is unknown but that's unimportant. To access a field, only the offset is needed and that information is found in the type's definition.
A pointer to an int
points to a block of memory:
+---+---+---+---+
int *ptr -> | i | n | t | | int
+---+---+---+---+
Casted to a cons_object
pointer:
+---+---+---+---+
((cons_object *)ptr) -> | i | n | t | | enum type
+---+---+---+---+
| X | X | X | X | object *
+---+---+---+---+
| X | X | X | X | object *
+---+---+---+---+
This is fine and is a fairly common technique for implementing "object-orientation" in C. Because the memory layout of struct
s is well-defined in C, as long as the two object share the same layout then you can safely cast pointers between them. That is, the offset of the type
member is the same in the object
struct as it is in the cons_object
struct.
In this case, the type
member tells the API whether the object
is a cons_object
or foo_object
or some other kind of object, so you might be see something like this:
void traverse(object *obj)
{
if (obj->type == CONS) {
cons_object *cons = (cons_object *)obj;
traverse(cons->car);
traverse(cons->cdr);
} else if (obj->type == FOO) {
foo_object *foo = (foo_object *)obj;
traverse_foo(foo);
} else ... etc
}
More commonly, I've seem implementations where the "parent" class is defined as the first member of the "child" class, like so:
typedef struct {
enum type type;
} object;
typedef struct {
object parent;
object *car;
object *cdr;
} cons_object;
This works in largely the same way, except you've got a strong gaurantee that the memory layout of the child "classes" will be the same as the parents. That is, if you add a member to the 'base' object
, it'll automatically be picked up by the children and you won't have to manually make sure all of the structures are in sync.