Can unsafe type punning be fixed by marking a variable volatile?

Question

In zwol's answer to "Is it legal to implement inheritance in C by casting pointers between one struct that is a subset of another rather than first member?", he gives an example of why a simple typecast between similar structs isn't safe, and the comments describe a sample environment in which it behaves unexpectedly: compiling the following with gcc at -O2 causes it to print "x=1.000000 some=2.000000":

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

struct base
{
    double some;
    char space_for_subclasses[];
};
struct derived
{
    double some;
    int value;
};

double test(struct base *a, struct derived *b)
{
    a->some = 1.0;
    b->some = 2.0;
    return a->some;
}

int main(void)
{
    size_t bufsz = sizeof(struct base);
    if (bufsz < sizeof(struct derived)) bufsz = sizeof(struct derived);
    void *block = malloc(bufsz);

    double x = test(block, block);
    printf("x=%f some=%f\n", x, *(double *)block);
    return 0;
}

I was fooling around with the code to better understand exactly how it behaves because I need to do something similar, and noticed that marking a as volatile was enough to prevent it from printing different values. This lines up with my expectations as to what is going wrong - gcc is assuming that a->some is unaffected by the write to b->some. However, I would have thought gcc could only assume this if a or b were marked with restrict.

Am I misunderstanding what is happening here and/or the meaning of the restrict qualifier? If not, is gcc free to make this assumption because a and b are of different types? Finally, does marking both a and b as volatile make this code compliant with the standard, or at least prevent the undefined behaviour from allowing gcc to make the aforementioned assumption?
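
For comparison, here is a minimal sketch of the first-member layout that the linked question's title alludes to. The modified struct definitions and the name test_first_member are assumptions made for illustration and do not come from zwol's answer. Because a pointer to a structure, suitably converted, points to its first member (C11 6.7.2.1p15), both writes below go through struct base, and the compiler may not assume they touch different objects:

```c
/* Sketch only: a variant of the structs above in which the derived
 * type embeds the base type as its FIRST member. */
struct base
{
    double some;
};

struct derived
{
    struct base base;   /* first member: &derived and &derived.base coincide */
    int value;
};

/* Hypothetical counterpart of test(): both stores access a struct base,
 * so the second store must be visible to the final load. */
double test_first_member(struct base *a, struct derived *b)
{
    a->some = 1.0;
    b->base.some = 2.0;
    return a->some;
}
```

Calling test_first_member(&d.base, &d) on a struct derived d should return 2.0 at any optimization level, since no volatile or restrict reasoning is involved.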

Answer 1:

If a region of storage is accessed exclusively using volatile-qualified lvalues, a compiler would have to go extremely far out of its way not to process every write as translating the values written to a pattern of bits and storing it, and every read as reading a bit pattern from memory and translating it into a value. The Standard does not actually mandate such behavior, and in theory a compiler given:

long long volatile foo;
...
int test(void)
{
  return *((short volatile*)(&foo));
}

could assume that any code branch that could call test will never be executed, but I don't yet know of any compilers that behave in such an extreme fashion.

On the other hand, given a function like the following:

void zero_aligned_pair_of_shorts(uint16_t *p)
{
  *((uint32_t volatile *)p) = 0;
}

compilers like gcc and clang will not reliably recognize that it might have some effect upon the stored value of an object which is accessed using an unqualified lvalue of type uint16_t. Some compilers like icc regard volatile accesses as an indicator to synchronize any register-cached objects whose address has been taken, because doing so is a cheap and easy way for compilers to uphold the Spirit of C principle described in the Standard's charter and rationale documents as "Don't prevent the programmer from doing what needs to be done" without requiring special syntax. Other compilers like gcc and clang, however, require that programmers either use gcc/clang-specific intrinsics or else use command-line options to globally block most forms of register caching.
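
As a rough illustration of what such compiler-specific syntax can look like, the sketch below uses the may_alias type attribute, a GCC extension that clang also accepts, next to a portable memcpy version. The attribute is a swapped-in technique chosen for this example, not necessarily what the answer had in mind, and the names u32_may_alias, zero_pair_may_alias and zero_pair_memcpy are hypothetical. The global command-line alternative in gcc and clang is -fno-strict-aliasing.

```c
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension: accesses through this type are treated as if they
 * could alias an object of any type, much like character types. */
typedef uint32_t __attribute__((__may_alias__)) u32_may_alias;

/* Zero two adjacent uint16_t values with one 32-bit store.
 * Assumption: p is suitably aligned for uint32_t. */
void zero_pair_may_alias(uint16_t *p)
{
    *(u32_may_alias *)p = 0;
}

/* Portable alternative: memcpy imposes no alignment or effective-type
 * requirements, and compilers commonly lower it to a single store. */
void zero_pair_memcpy(uint16_t *p)
{
    const uint32_t zero = 0;
    memcpy(p, &zero, sizeof zero);
}
```

Neither version needs volatile; both tell the compiler, in a way it is documented to respect, that the uint16_t objects are being modified.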

Answer 2:

The problem with this particular question and zwol's answer is that they conflate type punning and strict aliasing. Zwol's answer is correct for that particular use case, because of the type used to initialize the structure; but not in the general case, nor with respect to the POSIX struct sockaddr types, as one might read the answer to imply.

For type punning between structure types with common initial members, all you need to do is to declare (not use!) a union of those structures, and you can safely access the common members through a pointer of any of the structure types. This has been explicitly allowed behaviour since ANSI C 3.3.2.3, including C11 6.5.2.3p6 (see the n1570 draft).

If an implementation contains a union of all struct sockaddr_ structures visible to userspace applications, then zwol's answer that the OP links to is misleading, in my opinion, if one reads it to imply that struct sockaddr support requires something nonstandard from compilers. (If you define _GNU_SOURCE, glibc defines such a union as struct __SOCKADDR_ARG containing an anonymous union of all such types. However, glibc is designed to be compiled using GCC, so it could have other issues.)

Strict aliasing is a requirement that the parameters to a function do not refer to the same storage (memory). As an example, if you have

int   i = 0;
char *iptr = (char *)(&i);

int modify(int *iptr, char *cptr)
{
    *cptr = 1;
    return *iptr;
}

then calling modify(&i, iptr) is a strict aliasing violation. The type punning in the definition of iptr is incidental, and is actually allowed (because you are allowed to use the char type to examine the storage representation of any type; C11 6.2.6.1p4).

Here is a proper example of type punning, avoiding strict aliasing issues:

struct item {
    struct item *next;
    int          type;
};

struct item_int {
    struct item *next;
    int          type; /* == ITEMTYPE_INT */
    int          value;
};

struct item_double {
    struct item *next;
    int          type; /* == ITEMTYPE_DOUBLE */
    double       value;
};

struct item_string {
    struct item *next;
    int          type;    /* == ITEMTYPE_STRING */
    size_t       length;  /* Excluding the '\0' */
    char         value[]; /* Always has a terminating '\0' */
};

enum {
    ITEMTYPE_UNKNOWN = 0,
    ITEMTYPE_INT,
    ITEMTYPE_DOUBLE,
    ITEMTYPE_STRING,
};

Now, if in the same scope the following union is visible, we can type-pun between pointers to the above structure types, and access the next and type members, completely safely:

union item_types {
    struct item         any;
    struct item_int     i;
    struct item_double  d;
    struct item_string  s;
};

For the other (non-common) members, we must use the same structure type that was used to initialize the structure. That is why the type field exists.
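
To make the role of the type field concrete, here is a minimal sketch of constructors that initialize each item through its own structure type and set the matching tag. The helper names item_new_int and item_new_string are hypothetical, the double variant is omitted for brevity, and the sketch relies on this answer's premise that the visible union makes access to the common members valid:

```c
#include <stdlib.h>
#include <string.h>

struct item        { struct item *next; int type; };
struct item_int    { struct item *next; int type; int value; };
struct item_string { struct item *next; int type; size_t length; char value[]; };

enum { ITEMTYPE_UNKNOWN = 0, ITEMTYPE_INT, ITEMTYPE_DOUBLE, ITEMTYPE_STRING };

/* Only needs to be visible; it is never instantiated. */
union item_types {
    struct item        any;
    struct item_int    i;
    struct item_string s;
};

/* Hypothetical helper: allocate an int item and prepend it to `next`.
 * Returns NULL if allocation fails. */
struct item *item_new_int(struct item *next, int value)
{
    struct item_int *it = malloc(sizeof *it);
    if (!it)
        return NULL;
    it->next  = next;
    it->type  = ITEMTYPE_INT;   /* tag matches the initializing type */
    it->value = value;
    return (struct item *)it;
}

/* Hypothetical helper: allocate a string item with room for the text
 * plus its terminating '\0' in the flexible array member. */
struct item *item_new_string(struct item *next, const char *text)
{
    size_t len = strlen(text);
    struct item_string *it = malloc(sizeof *it + len + 1);
    if (!it)
        return NULL;
    it->next   = next;
    it->type   = ITEMTYPE_STRING;
    it->length = len;
    memcpy(it->value, text, len + 1);
    return (struct item *)it;
}
```

A reader of the list then inspects type through a struct item pointer and casts back to the structure type named by the tag before touching value.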

As an example of such a completely safe usage, consider the following function that prints the values in a list of items:

void print_items(const struct item *list, FILE *out)
{
    const char *separator = NULL;

    fputs("{", out);        

    while (list) {

        if (separator)
            fputs(separator, out);
        else
            separator = ",";

        if (list->type == ITEMTYPE_INT)
            fprintf(out, " %d", ((const struct item_int *)list)->value);
        else
        if (list->type == ITEMTYPE_DOUBLE)
            fprintf(out, " %f", ((const struct item_double *)list)->value);
        else
        if (list->type == ITEMTYPE_STRING)
            fprintf(out, " \"%s\"", ((const struct item_string *)list)->value);
        else
            fprintf(out, " (invalid)");

        list = list->next;
    }

    fputs(" }\n", out);
}

Note that I used the same name value for the value field, just because I didn't think of any better one; they do not need to be the same.

The type punning occurs in the fprintf() statements, and is valid if and only if 1) the structures were initialized using structures matching the type field, and 2) the union item_types is visible in the current scope.

None of the current C compilers I've tried have any issues with the above code, even at extreme optimization levels that break some facets of standard behaviour. (I haven't checked MSVC, but that one is really a C++ compiler that can also compile most C code. I would be surprised, however, if it had any issues with the above code.)

Source: https://stackoverflow.com/questions/53699251/can-unsafe-type-punning-be-fixed-by-marking-a-variable-volatile

Tags

undefined-behavior

volatile

type-punning

restrict-qualifier