Why is i = v[i++] undefined?

生来不讨喜 2021-02-01 13:41

From the C++ (C++11) standard, §1.9.15 which discusses ordering of evaluation, is the following code example:

void g(int i, int* v) {
    i = v[i++]; // the beha         


        
8 Answers
  • 2021-02-01 13:49

    There are two rules.

    The first rule is about multiple writes which give rise to a "write-write hazard": the same object cannot be modified more than once between two sequence points.

    The second rule is about "read-write hazards". It is this: if an object is modified in an expression, and also accessed, then all accesses to its value must be for the purpose of computing the new value.

    Expressions like i++ + i++ and your expression i = v[i++] violate the first rule. They modify an object twice.

    An expression like i + i++ violates the second rule. The subexpression i on the left observes the value of a modified object, without being involved in the calculation of its new value.

    So, i = v[i++] violates a different rule (bad write-write) from i + i++ (bad read-write).
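
    As a rough illustration of the two rules, the sketch below shows forms that satisfy both. The function names are made up for this example, and the hazardous forms appear only in comments so that nothing undefined is actually executed.

```cpp
// Hypothetical helpers, not taken from the standard.

// One write to i; the only read of i feeds the value being stored.
int follow_index(int i, const int* v) {
    i = v[i];          // fine: i is read solely to compute its new value
    return i;
}

// One write, and the read of i is part of computing the new value.
int bump(int i) {
    i = i + 1;         // fine
    return i;
}

// For contrast (kept as comments on purpose; undefined under the C++11 rules):
//   i = v[i++];   // two unsequenced writes to i (first rule)
//   i + i++;      // read of i not used to compute its new value (second rule)
```

    With v = {2, 0, 1}, follow_index(0, v) yields 2 and bump(4) yields 5, regardless of compiler.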


    The rules are too simplistic, which gives rise to classes of puzzling expressions. Consider this:

    p = p->next = q
    

    This appears to have a sane data flow dependency that is free of hazards: the assignment p = cannot take place until the new value is known. The new value is the result of p->next = q. The value q should not "race ahead" and get inside p, such that p->next is affected.

    Yet, this expression breaks the second rule: p is modified, and also used for a purpose not related to computing its new value, namely determining the storage location where the value of q is placed!

    So, perversely, compilers are allowed to partially evaluate p->next = q to determine that the result is q, and store that into p, and then go back and complete the p->next = assignment. Or so it would seem.

    A key issue here is, what is the value of an assignment expression? The C standard says that the value of an assignment expression is that of the lvalue, after the assignment. But that is ambiguous: it could be interpreted as meaning "the value which the lvalue will have, once the assignment takes place" or as "the value which can be observed in the lvalue after the assignment has taken place". In C++ this is made clear by the wording "[i]n all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.", so p = p->next = q appears to be valid C++, but dubious C.
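
    If the goal is simply to sidestep the question, one option is to split the chained assignment into two full expressions. The sketch below assumes a minimal singly linked Node type and a helper name (link_and_advance) invented for this example.

```cpp
// Hypothetical node type for illustration only.
struct Node {
    int value;
    Node* next;
};

// Link q after the node p points to, then advance p to q.
// Two separate statements, so the order of the stores is never in question.
void link_and_advance(Node*& p, Node* q) {
    p->next = q;   // store through the old p first
    p = q;         // then move p forward
}
```

    This reads the same in C and C++ and does not depend on how the value of an assignment expression is defined.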

  • 2021-02-01 13:50

    In this example I would think that the subexpression i++ would be completely evaluated before the subexpression v[...] is evaluated, and that the result of evaluation of the subexpression is i (before the increment), but that the value of i is the incremented value after that subexpression has been completely evaluated.

    The increment in i++ must be evaluated before indexing v and thus before assigning to i, but storing the value of that increment back to memory need not happen before. In the statement i = v[i++] there are two suboperations that modify i (i.e. will end up causing a store from a register into the variable i). The expression i++ is equivalent to x=i+1, i=x, and there is no requirement that both operations need to take place sequentially:

    x = i+1;
    y = v[i];
    i = y;
    i = x;
    

    With that expansion, the final value of i is unrelated to the value in v[i]. In a different expansion, the i = x assignment could take place before the i = y assignment, and the result would be i = v[i].
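
    The two expansions can be written out as ordinary, well-defined functions to make the difference concrete. This is only a model of what a compiler might do; the names ordering_a and ordering_b are made up here.

```cpp
// Each statement below is its own full expression, so these functions
// are well defined; they only model two possible store orders.

// Ordering A: the store belonging to i++ lands last.
int ordering_a(int i, const int* v) {
    int x = i + 1;
    int y = v[i];
    i = y;
    i = x;
    return i;
}

// Ordering B: the store belonging to the assignment lands last.
int ordering_b(int i, const int* v) {
    int x = i + 1;
    int y = v[i];
    i = x;
    i = y;
    return i;
}
```

    For v = {5, 6, 7} and i = 0, ordering A leaves 1 in i while ordering B leaves 5.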

  • 2021-02-01 13:57

    The reason is not just historical. Example:

    int f(int& i0, int& i1) {
        return i0 + i1++;
    }
    

    Now, what happens with this call:

    int i = 3;
    int j = f(i, i);
    

    It's certainly possible to put requirements on the code in f so that the result of this call is well defined (Java does this), but C and C++ don't impose constraints; this gives more freedom to optimizers.
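
    For comparison, here is a sketch of what f could look like if the programmer wants a defined result even when both references alias. The name f_ordered is hypothetical; it imposes a left-to-right order by hand, similar in spirit to what Java guarantees.

```cpp
// Hypothetical variant of f: the read of i0 and the increment of i1 sit in
// separate statements, so the result is defined even if both refer to the
// same int.
int f_ordered(int& i0, int& i1) {
    int left = i0;      // read the left operand first
    int right = i1++;   // then read and increment the right operand
    return left + right;
}
```

    With int i = 3, the call f_ordered(i, i) returns 6 and leaves i at 4. The cost is that the compiler can no longer reorder the read and the increment freely.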

  • 2021-02-01 13:59

    I would share your arguments if the example were v[++i], but since i++ modifies i as a side-effect, it is undefined as to when the value is modified. The standard could probably mandate a result one way or the other, but there's no true way of knowing what the value of i should be: (i + 1) or (v[i + 1]).

  • 2021-02-01 14:05

    I would think that the subexpression i++ would be completely evaluated before the subexpression v[...] is evaluated

    But why would you think that?

    One historical reason for this code being UB is to allow compiler optimizations to move side-effects around anywhere between sequence points. The fewer sequence points, the more potential opportunities to optimize but the more confused programmers. If the code says:

    a = v[i++];
    

    The intention of the standard is that the code emitted can be:

    a = v[i];
    ++i;
    

    which might be two instructions where:

    tmp = i;
    ++i;
    a = v[tmp];
    

    would be more than two.

    The "optimized code" breaks when a is i, but the standard permits the optimization anyway, by saying that behavior of the original code is undefined when a is i.
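
    To see where the two sequences part ways, they can be modeled as functions taking references, so that the caller decides whether a and i are the same object. The names emit_short and emit_with_tmp are invented for this sketch.

```cpp
// Hypothetical stand-ins for the two emitted code sequences.

// The shorter sequence: load, store, then increment.
void emit_short(int& a, int& i, const int* v) {
    a = v[i];
    ++i;
}

// The longer sequence: save the index, increment, then store.
void emit_with_tmp(int& a, int& i, const int* v) {
    int tmp = i;
    ++i;
    a = v[tmp];
}
```

    With v = {10, 20, 30} and distinct a and i (i starting at 1), both leave a == 20 and i == 2. When both parameters refer to the same variable starting at 1, emit_short ends with 21 and emit_with_tmp ends with 20.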

    The standard easily could say that i++ must be evaluated before the assignment as you suggest. Then the behavior would be fully defined and the optimization would be forbidden. But that's not how C and C++ do business.

    Also beware that many examples raised in these discussions make it easier to tell that there's UB around than it is in general. This leads to people saying that it's "obvious" the behavior should be defined and the optimization forbidden. But consider:

    void g(int *i, int* v, int *dst) {
        *dst = v[(*i)++];
    }
    

    The behavior of this function is defined when i != dst, and in that case you'd want all the optimization you can get (which is why C99 introduces restrict, to allow more optimizations than C89 or C++ do). In order to give you the optimization, behavior is undefined when i == dst. The C and C++ standards tread a fine line when it comes to aliasing, between undefined behavior that's not expected by the programmer, and forbidding desirable optimizations that fail in certain cases. The number of questions about it on SO suggests that the questioners would prefer a bit less optimization and a bit more defined behavior, but it's still not simple to draw the line.

    Separate from whether the behavior is fully defined is the question of whether it should be UB, or merely an unspecified order of execution of certain well-defined operations corresponding to the sub-expressions. The reason C goes for UB has everything to do with the idea of sequence points, and with the fact that the compiler need not actually have a notion of the value of a modified object until the next sequence point. So rather than constrain the optimizer by saying that "the" value changes at some unspecified point, the standard just says (to paraphrase): (1) any code that relies on the value of a modified object prior to the next sequence point has UB; (2) any code that modifies a modified object has UB. Here a "modified object" is any object that would have been modified since the last sequence point in one or more of the legal orders of evaluation of the subexpressions.

    Other languages (e.g. Java) go the whole way and completely define the order of expression side-effects, so there's definitely a case against C's approach. C++ just doesn't accept that case.

  • 2021-02-01 14:06

    I'm going to design a pathological computer [1]. It is a multi-core, high-latency, single-thread system with in-thread joins that operates with byte-level instructions. So you make a request for something to happen, then the computer runs (in its own "thread" or "task") a byte-level set of instructions, and a certain number of cycles later the operation is complete.

    Meanwhile, the main thread of execution continues:

    void foo(int v[], int i){
      i = v[i++];
    }
    

    becomes in pseudo-code:

    input variable i // = 0x00000000
    input variable v // = &[0xBAADF00D, 0xABABABAB, 0x10101010]
    task get_i_value: GET_VAR_VALUE<int>(i)
    reg indx = WAIT(get_i_value)
    task write_i++_back: WRITE(i, INC(indx))
    task get_v_value: GET_VAR_VALUE<int*>(v)
    reg arr = WAIT(get_v_value)
    task get_v[i]_value = CALC(arr + sizeof(int)*indx)
    reg pval = WAIT(get_v[i]_value)
    task read_v[i]_value = LOAD_VALUE<int>(pval)
    reg got_value = WAIT(read_v[i]_value)
    task write_i_value_again = WRITE(i, got_value)
    (discard, discard) = WAIT(write_i++_back, write_i_value_again)
    

    So you'll notice that I didn't wait on write_i++_back until the very end, the same time as I was waiting on write_i_value_again (which value I loaded from v[]). And, in fact, those writes are the only writes back to memory.

    Imagine that writes to memory are the really slow part of this computer design, and that they get batched up into a queue of things processed by a parallel memory-modifying unit that works on a per-byte basis.

    So the write(i, 0x00000001) and write(i, 0xBAADF00D) execute unordered and in parallel. Each gets turned into byte-level writes, and they are randomly ordered.

    We end up writing 0x00 then 0xBA to the high byte, then 0xAD and 0x00 to the next byte, then 0xF0 and 0x00 to the next byte, and finally 0x0D and 0x01 to the low byte. The resulting value in i is 0xBA000001, which few would expect, yet would be a valid result of your undefined operation.
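
    The byte-level interleaving can be modeled in a few lines. This is only a toy simulation of the imaginary machine, not a description of any real memory controller; torn_write and its mask parameter are made up for the sketch.

```cpp
#include <cstdint>

// Toy model: two 32-bit writes race, and for each byte lane one of them
// lands last. Bit k of b_wins_mask set means write b lands last in lane k
// (lane 0 is the low byte, lane 3 the high byte).
std::uint32_t torn_write(std::uint32_t a, std::uint32_t b, unsigned b_wins_mask) {
    std::uint32_t result = 0;
    for (int lane = 0; lane < 4; ++lane) {
        std::uint32_t lane_mask = std::uint32_t{0xFF} << (8 * lane);
        bool b_last = ((b_wins_mask >> lane) & 1u) != 0;
        result |= (b_last ? b : a) & lane_mask;
    }
    return result;
}
```

    Calling torn_write(0x00000001, 0xBAADF00D, 0x8) lets the second write win only in the high byte and gives 0xBA000001, the value described above. A mask of 0x0 or 0xF reproduces one write or the other intact.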

    Now, all I did there was result in an unspecified value. We haven't crashed the system. But the compiler would be free to make it completely undefined -- maybe sending two such requests to the memory controller for the same address in the same batch of instructions actually crashes the system. That would still be a "valid" way to compile C++, and a "valid" execution environment.

    Remember, this is a language where restricting the size of pointers to 8 bits is still a valid execution environment. C++ allows for compiling to rather wonky targets.

    [1] As noted in @SteveJessop's comment below, the joke is that this pathological computer behaves a lot like a modern desktop computer, until you get down to the byte-level operations. Non-atomic int writing by a CPU isn't all that rare on some hardware (such as when the int isn't aligned the way the CPU wants it to be aligned).
