Sequence Points between printf function args; does the sequence point between conversions matter?

轻奢々 2020-12-10 14:24

I read here that there is a sequence point:

After the action associated with input/output conversion format specifier. For example, in the expression

3 Answers
  • 2020-12-10 14:29

    I think you misunderstood the text about the printf sequence points (SP). They are something of an anomaly and matter only for %n, because that format specifier has side effects, and those side effects need to be sequenced.

    In any case, there is a SP after all the arguments have been evaluated and before printf() starts executing. The format-specifier SPs all come after that one, so they don't affect your problem.

    In your example, the uses of i are all in function arguments, and none of them are separated with sequence points. Since you modify the value (twice) and use the value without intervening sequence points, your code is UB.

    What the rule about the SP in printf means is that this code is well defined:

    int x;
    printf("%d %n %d %n\n", 1, &x, 2, &x);
    

    even though the value of x is modified twice.

    But this code is UB:

    int x = 1;
    printf("%d %d\n", x, ++x);
    

    NOTE: Remember that %n means that the number of characters written so far is stored in the integer pointed to by the associated argument.
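
    The %n rule can be sketched in a small testable form. This is only an illustration: last_n_value is a hypothetical helper name, and it uses snprintf into a caller-supplied buffer instead of printf so the result can be inspected. Both %n conversions store through the same pointer, which is fine because of the sequence point after each conversion.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the original post): formats "<a> <b> "
 * into buf and returns the value left in x by the last %n.
 * Both %n conversions store through the same pointer.
 */
int last_n_value(char *buf, size_t size, int a, int b)
{
    int x = -1;

    assert(buf != NULL && size > 0);
    /* First %n stores the count after "<a> ", second after "<a> <b> ". */
    snprintf(buf, size, "%d %n%d %n", a, &x, b, &x);
    return x;
}
```

    Calling last_n_value(buf, sizeof buf, 1, 2) with a large enough buffer fills buf with "1 2 " and returns 4, the count stored by the second %n.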

  • 2020-12-10 14:30

    Because this question was asked because of a comment-based discussion here, I'll provide some context:

    first comment: The order of operations is not guaranteed to be the order in which you pass arguments to the function. Some people (wrongly) assume that the arguments will be evaluated right to left, but according to the standard, the behaviour is undefined.

    The OP accepts and understands this. No point in repeating the fact that your_function(++i, ++i) is UB.

    In response to that comment: Thanks to your comment I see that printf may be evaluated in any order, but I understood that to be because printf arguments are part of a va_list. Are you saying that the arguments to any function are executed in an arbitrary order?

    OP asking for clarification, so I elaborated a bit:

    Second comment: Yes, that's exactly what I'm saying. Even calling int your_function(int a, int b) { return a - b; } does not guarantee that the expressions you pass will be evaluated left to right. There's no sequence point (a point at which all side effects of previous evaluations are performed) between the arguments. Take this example. The nested call is a sequence point, so the outer call passes i+1 (13), and the return value of the inner call (undefined, in this case -1 because i++, i evaluates to 12, 13 apparently), but there's no guarantee that this will always be the case.

    That made it pretty clear that these kinds of constructs trigger UB for all functions.


    Wikipedia confusion

    OP Quotes this:

    After the action associated with input/output conversion format specifier. For example, in the expression printf("foo %n %d", &a, 42), there is a sequence point after the %n is evaluated before printing 42.

    He then applies it to his snippet (printf("%d - %d - %d\n", i, your_function(++i, ++i), i);), expecting the format specifiers to serve as sequence points.
    The "input/output conversion format specifier" that matters here is %n. The corresponding argument must be a pointer to a signed integer (an int * when no length modifier is used), and printf stores through it the number of characters written so far. Naturally, %n must be processed before the rest of the output is printed. However, using the variable passed for %n in other arguments of the same call is still dangerous. It is not UB in itself, but it easily becomes UB:

    printf("Foo %n %*s\n", &a, 100-a, "Bar");//DANGER!!
    

    There is a sequence point before the function is called, so the expression 100-a is evaluated before %n has stored the correct value in a. If a is uninitialized, then 100-a is UB. If a is initialized to 0, for example, the expression evaluates to 100. On the whole, though, this kind of code is pretty much asking for trouble. Treat it as very bad practice, or worse...
    Just look at the output generated by these statements:

    int a = 90; // %n expects an int *, so a must be a signed int
    printf("%d %n %*s\n", a, &a, 10, "Bar");//90         Bar
    printf("%d\n", a);//3
    printf("Foo %d %n %*s\n", a, &a, 10 - a, "Bar");//Foo 3      Bar < padding used: 10 - 3, not 10 - 6
    printf("%d\n", a);//6
    

    As you can see, a gets reassigned inside printf, so you can't use its new value in the argument list: the arguments are all evaluated before the sequence point that precedes the call. If you expect a to be reassigned "in place", you're essentially expecting C to jump out of the function call, evaluate other arguments, and jump back into the call. That's just not possible. If you were to change int a = 90; to int a;, the behaviour would be undefined.
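
    One way to avoid the problem is to split the work into two calls, so the character count is known before the padding width is computed. A minimal sketch, assuming output goes to a caller-supplied buffer via snprintf; pad_to_width and its parameters are hypothetical names, and it relies on snprintf's return value rather than on %n.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "<prefix> " and then right-aligns word so
 * that the whole string is total_width characters wide (when it fits).
 * Returns the number of characters written, or -1 on error/truncation.
 */
int pad_to_width(char *buf, size_t size, const char *prefix,
                 const char *word, int total_width)
{
    assert(buf != NULL && prefix != NULL && word != NULL);

    /* First call: its return value is the count %n would have stored. */
    int used = snprintf(buf, size, "%s ", prefix);
    if (used < 0 || (size_t)used >= size)
        return -1;

    /* The width is computed only after the first call has completed. */
    int pad = total_width - used;
    if (pad < 0)
        pad = 0;

    int more = snprintf(buf + used, size - (size_t)used, "%*s", pad, word);
    if (more < 0 || (size_t)more >= size - (size_t)used)
        return -1;

    return used + more;
}
```

    With a 32-byte buffer, pad_to_width(buf, sizeof buf, "Foo", "Bar", 10) produces "Foo    Bar" and returns 10, because the padding is computed from the count of characters already written.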


    Concerning the 12's

    Now because the OP read up on sequence points, he correctly notices that this statement:

    printf("%d - %d - %d\n", i, your_function(++i, ++i), i);
    

    is slightly different: the call your_function(++i, ++i) involves a sequence point, which guarantees that i has been incremented twice by the time the function body runs. There is a sequence point here because:

    Before a function is entered in a function call. The order in which the arguments are evaluated is not specified, but this sequence point means that all of their side effects are complete before the function is entered

    That means that, before printf is called, your_function has to be called (because its return value is one of the arguments for the printf call), and i will be incremented twice.
    This could explain the output being "12 - 0 - 12", but is it guaranteed to be the output?

    No

    Technically, although most compilers will evaluate the your_function(++i, ++i) call first, the standard would allow a compiler to evaluate the arguments passed to printf left to right (the order isn't specified, after all). So any of these would be an equally valid result:

    10 - 0 - 12
    //or even
    12 - 0 - 10
    //and
    10 - 0 - 10
    //technically, even this would be valid
    12 - 0 - 11
    

    The last output is extremely unlikely, though (it would be very inefficient).
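
    If a specific output is wanted, the increments can be moved into separate statements so that every read and write of i is sequenced. A minimal sketch, assuming the your_function body quoted in the comments above; format_sequenced is a hypothetical name, and snprintf is used instead of printf so the result can be checked.

```c
#include <assert.h>
#include <stdio.h>

/* Same body as the your_function quoted in the comment thread. */
static int your_function(int a, int b)
{
    return a - b;
}

/*
 * Hypothetical helper: every modification of i happens in its own
 * statement, so there is a sequence point between each of them.
 */
int format_sequenced(char *buf, size_t size, int i)
{
    assert(buf != NULL && size > 0);

    int before = i;   /* value before any increment */
    ++i;
    int first = i;    /* value after the first increment */
    ++i;
    int second = i;   /* value after the second increment */
    int result = your_function(first, second);
    int after = i;    /* value after both increments */

    return snprintf(buf, size, "%d - %d - %d", before, result, after);
}
```

    Starting from i = 10, this always yields "10 - -1 - 12", since the order of every step is fixed by the statement boundaries rather than left to the compiler.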

  • 2020-12-10 14:33

    Arriving at a clear answer to this question is strongly affected (even prevented) by the C rules on order of evaluation and UB.

    The specified rules on order of evaluation are stated here:

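    C11 section 6.7.9, p23: The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

    That paragraph is about initializer lists. The rule for function calls is in section 6.5.2.2, which places a sequence point after the arguments are evaluated and before the call, without ordering the arguments relative to each other.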

    And, this function call will exhibit undefined behavior:

    your_function(++i, ++i)
    

    Because of UB, coupled with the rules on order of evaluation, accurate predictions on the expected outcomes for the following:

    printf("%d - %d - %d\n", i, your_function(++i, ++i), i);
    

    are impossible.

    Edit
    ...I'm not asking about why my middle term is 0. I'm asking why the other two terms are both 12.

    There is no guarantee which of the three arguments of the above call is evaluated first (because of C's rules on order of evaluation). And if the middle argument gets evaluated first, then at that point you have invoked undefined behavior. Who can really say why the other two terms are 12? What happens to i when the second argument is evaluated is anyone's guess.
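
    The unspecified order can be observed without invoking UB by giving each argument its own function call that records when it ran. This is only a sketch: trace, subtract and evaluation_order are hypothetical names, and the result depends on the compiler, so either order is a valid outcome.

```c
#include <assert.h>
#include <stddef.h>

static char order_log[8];
static size_t order_len = 0;

/* Records tag in the log and returns value unchanged. */
static int trace(char tag, int value)
{
    assert(order_len < sizeof order_log - 1);
    order_log[order_len++] = tag;
    order_log[order_len] = '\0';
    return value;
}

static int subtract(int a, int b)
{
    return a - b;
}

/*
 * Returns "LR" if the left argument was evaluated first, "RL" otherwise.
 * The two trace() calls are indeterminately sequenced: they do not
 * overlap, but the compiler may run them in either order.
 */
const char *evaluation_order(void)
{
    order_len = 0;
    order_log[0] = '\0';
    (void)subtract(trace('L', 1), trace('R', 2));
    return order_log;
}
```

    Unlike your_function(++i, ++i), this is not UB, because each side effect happens inside a function call. The order is still not something portable code can rely on.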
