Is this code well-defined?

Submitted by 对着背影说爱祢 on 2019-11-27 13:30:49
Johannes Schaub - litb

This depends on how Sun is defined. The following is well-defined:

struct A {
  A &Fun(int);
  A &Gun(int);
  A &Sun(int&);
  A &Tun();
};

void g() {
  A someInstance;
  int k = 0;
  someInstance.Fun(++k).Gun(10).Sun(k).Tun();
}
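A minimal sketch of why the reference version is safe, assuming trivial function bodies that the original does not show: Sun(k) only binds a reference when its argument is evaluated, so k is not read until the body of Sun runs, and that body cannot run before Fun has returned. The member `seen` and the function `sun_by_reference` are hypothetical names added for illustration.

```cpp
struct A {
    int seen = -1;                               // hypothetical: value of k observed inside Sun
    A &Fun(int) { return *this; }
    A &Gun(int) { return *this; }
    A &Sun(int &r) { seen = r; return *this; }   // k is read here, inside the body
    A &Tun() { return *this; }
};

// Runs the chained call and reports what Sun saw through its reference.
int sun_by_reference() {
    A someInstance;
    int k = 0;
    // Evaluating the argument of Sun only binds a reference to k; no read happens yet.
    someInstance.Fun(++k).Gun(10).Sun(k).Tun();
    return someInstance.seen;
}
```

Calling sun_by_reference() from main should report the incremented value, because the side effect of ++k is complete before the body of Fun starts, and Sun is called on an object that only exists once Fun and Gun have returned.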

If you change the parameter type of Sun to int, it becomes undefined. Let's draw a tree of the version taking an int.

                     <eval body of Fun>
                             |
                             % // pre-call sequence point
                             | 
 { S(increment, k) }  <-  E(++k) 
                             |     
                      E(Fun(++k).Gun(10))
                             |
                      .------+-----.       .-- V(k)--%--<eval body of Sun>
                     /              \     /
                   E(Fun(++k).Gun(10).Sun(k))
                              |
                    .---------+---------. 
                   /                     \ 
                 E(Fun(++k).Gun(10).Sun(k).Tun())
                              |
                              % // full-expression sequence point

As can be seen, we have a read of k (designated by V(k)) and a side effect on k (at the very top) that are not separated by a sequence point: within this expression, no sequence point orders these two sub-expressions relative to each other. The very bottom % signifies the full-expression sequence point.

I think if you read exactly what that standard quote says, the first case won't be well-defined:

When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body

What this tells us is not that "the only thing that can happen after the arguments for a function have been evaluated is the actual function call", but simply that there is a sequence point at some point after the evaluation of arguments finishes, and before the function call.

But if you imagine a case like this:

foo(X).bar(Y)

the only guarantee this gives us is that:

  • X is evaluated before the call to foo, and
  • Y is evaluated before the call to bar.

But an order such as this would still be possible:

  1. evaluate X
  2. evaluate Y
  3. (sequence point separating X from foo call)
  4. call foo
  5. (sequence point separating Y from bar call)
  6. call bar

and of course, we could also swap around the first two items, evaluating Y before X. Why not? The standard only requires that the arguments for a function are fully evaluated before the first statement of the function body, and the above sequences satisfy that requirement.

That's my interpretation, at least. It doesn't seem to say that nothing else may occur between argument evaluation and function body -- just that those two are separated by a sequence point.
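The guarantees discussed above can be sketched as a small trace, under the assumption that X and Y are function calls with observable side effects. All names here (`trace`, `pos`, `run`, `Obj`) are hypothetical and added only for illustration. The sketch checks only the orderings the answer says are guaranteed and deliberately asserts nothing about whether X or Y runs first.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

std::vector<std::string> trace;   // hypothetical log of evaluation events

int X() { trace.push_back("X"); return 1; }
int Y() { trace.push_back("Y"); return 2; }

struct Obj {
    Obj &bar(int) { trace.push_back("bar"); return *this; }
};

Obj obj;
Obj &foo(int) { trace.push_back("foo"); return obj; }

// Index of the first occurrence of an event in the trace.
std::size_t pos(const std::string &name) {
    return static_cast<std::size_t>(
        std::find(trace.begin(), trace.end(), name) - trace.begin());
}

// Evaluates the expression from the answer and records the order of events.
void run() {
    trace.clear();
    foo(X()).bar(Y());
}
```

After run(), pos("X") < pos("foo") and pos("Y") < pos("bar") always hold, and foo precedes bar because bar is called on the result of foo. Whether "X" or "Y" comes first in the trace is exactly the part this answer argues is left open.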

David Gelhar

This is undefined behavior, because the value of k is being both modified and read in the same expression, without an intervening sequence point. See the excellent long answer to this question.

The quote from 1.9.17 tells you that all of a function's arguments are evaluated before the body of that function executes, but it says nothing about the relative order of evaluation of arguments to different function calls within the same expression: there is no guarantee that the ++k in Fun() is evaluated before the k in Sun().

eat(++k);drink(10);sleep(k);

is different because the ; is a sequence point, so the order of evaluation is well-defined.
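A minimal sketch of the separate-statement version, assuming trivial bodies for eat, drink and sleep (the original does not define them). The namespace `demo`, the variable `seen_by_sleep` and the function `run_statements` are hypothetical names added for illustration.

```cpp
namespace demo {

int seen_by_sleep = -1;            // hypothetical: value passed to sleep

void eat(int) {}
void drink(int) {}
void sleep(int v) { seen_by_sleep = v; }

// Each call is its own statement, so the increment is complete
// before the later read of k is evaluated.
int run_statements() {
    int k = 0;
    eat(++k); drink(10); sleep(k);
    return seen_by_sleep;
}

}  // namespace demo
```

Here sleep always receives the incremented value, because the end of the statement eat(++k); is a sequence point that separates the modification of k from the later read.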

As a little test, consider:

#include <iostream>

struct X
{
    const X& f(int n) const
    {
        std::cout << n << '\n';
        return *this;
    }
};

int main()
{
    int n = 1;

    X x;

    x.f(++n).f(++n).f(++n).f(++n);
}

I ran this with gcc 3.4.6 and no optimisation and got:

5
4
3
2

...with -O3...

2
3
4
5

So, either that version of 3.4.6 had a major bug (which is a bit hard to believe), or the sequence is undefined as Philip Potter suggested. (GCC 4.1.1 with/without -O3 produced 5, 5, 5, 5.)

EDIT - my summary of the discussion in comments below:

  • 3.4.6 really might have had a bug (well, yes)
  • many newer compilers happen to produce 5/5/5/5... is that a defined behaviour?
    • probably not, as it corresponds to all increment side effects being "actioned" before any of the function calls are made, which is not a behaviour that anyone here has suggested could be guaranteed by the Standard
  • this isn't a very good approach to investigating the Standard's requirements (particularly with an older compiler like 3.4.6): agreed, but it's a useful sanity check
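If deterministic output is the goal, one option is to give each increment its own statement. The following is a sketch under that assumption, not something proposed in the discussion itself: the `printed` vector is a hypothetical stand-in for std::cout so the result can be inspected, and `run_sequenced` is a hypothetical name.

```cpp
#include <vector>

std::vector<int> printed;   // hypothetical log standing in for std::cout

struct X
{
    const X& f(int n) const
    {
        printed.push_back(n);
        return *this;
    }
};

// Same calls as the test above, but with one increment per statement.
std::vector<int> run_sequenced()
{
    printed.clear();
    int n = 1;
    X x;
    x.f(++n);
    x.f(++n);
    x.f(++n);
    x.f(++n);
    return printed;
}
```

Because every ++n now sits in its own full expression, each call sees the value produced by its own increment, and the logged sequence no longer depends on the compiler or the optimisation level.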

I know that the behavior of compilers cannot really prove anything, but I thought it would be interesting to check out what the internal representation of a compiler would give (still a bit higher level than assembly inspection).

I've used the Clang/LLVM online demo with this code:

#include <stdio.h>
#include <stdlib.h>

struct X
{
  X const& f(int i) const
  {
    printf("%d\n", i);
    return *this;
  }
};

int main(int argc, char **argv) {
  int i = 0;
  X x;
  x.f(++i).f(++i).f(++i);         // line 16
}

And compiled with the standard optimizations (in C++ mode), it gave:

/tmp/webcompile/_13371_0.cc: In function 'int main(int, char**)':
/tmp/webcompile/_13371_0.cc:16: warning: operation on 'i' may be undefined

which I found interesting (did any other compiler warn about this? Comeau online did not).


As an aside it also produced the following Intermediate Representation (scroll to the right):

@.str = private constant [4 x i8] c"%d\0A\00", align 1 ; <[4 x i8]*> [#uses=1]

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind {
entry:
  %0 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 3) nounwind ; <i32> [#uses=0]
                                                                                                             ^^^^^
  %1 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 3) nounwind ; <i32> [#uses=0]
                                                                                                             ^^^^^
  %2 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 3) nounwind ; <i32> [#uses=0]
                                                                                                             ^^^^^
  ret i32 0
}

Apparently, Clang behaves like gcc 4.x.x does and first evaluates all arguments before performing any function call.

The second case is certainly well-defined. Each of its statements ends with a semicolon, and the end of each full expression is a sequence point, so all side effects of one statement are complete before the next statement begins.
