Question
Today I came across some code that exhibits different behavior on clang++ (3.7-git), g++ (4.9.2) and Visual Studio 2013. After some reduction I came up with this snippet which highlights the issue:
#include <iostream>
using namespace std;

int len_ = -1;

char *buffer(int size_)
{
    cout << "len_: " << len_ << endl;
    return new char[size_];
}

int main(int argc, char *argv[])
{
    int len = 10;
    buffer(len+1)[len_ = len] = '\0';
    cout << "len_: " << len_ << endl;
}
g++ (4.9.2) gives this output:
len_: -1
len_: 10
So g++ evaluates the argument to buffer, then buffer(..) itself and after that it evaluates the index argument to the array operator. Intuitively this makes sense to me.
clang (3.7-git) and Visual Studio 2013 both give:
len_: 10
len_: 10
I suppose clang and VS2013 evaluate everything possible before they descend into buffer(..). This makes less intuitive sense to me.
I guess the gist of my question is whether or not this is a clear case of undefined behavior.
Edit: Thanks for clearing this up, and unspecified behavior is the term I should have used.
Answer 1:
This is unspecified behavior. len_ = len is indeterminately sequenced with respect to the execution of the body of buffer(), which means that one will be executed before the other, but it is not specified which. Because there is an ordering, the evaluations cannot overlap, so there is no undefined behavior. This means gcc, clang and Visual Studio are all correct. On the other hand, unsequenced evaluations are allowed to overlap, which can lead to undefined behavior, as noted below.
From the draft C++11 standard, section 1.9 [intro.execution]:
[...]Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.[...]
and indeterminately sequenced is covered a little before this and says:
[...]Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which. [ Note: Indeterminately sequenced evaluations cannot overlap, but either could be executed first. —end note ]
which is different from unsequenced evaluations:
[...]If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced. [ Note: The execution of unsequenced evaluations can overlap. —end note ][...]
which can lead to undefined behavior (the relevant part is the final sentence):
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined[...]
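To make the distinction concrete, here is a minimal sketch. The names first, second, call_log and sum_of_calls are made up for illustration and are not part of the question. The two function calls are indeterminately sequenced operands of +, so either may run first, but their bodies never interleave:

```cpp
#include <string>

// Hypothetical shared log; each helper appends one character when it runs.
std::string call_log;

int first()  { call_log += 'f'; return 1; }
int second() { call_log += 's'; return 2; }

// The operands of + are function calls, so their bodies are indeterminately
// sequenced: call_log ends up as "fs" or "sf", and the compiler may pick
// either. The sum is 3 in both cases, and the behavior is never undefined.
int sum_of_calls()
{
    call_log.clear();
    return first() + second();
}

// By contrast, something like
//     int i = 0;
//     int j = i++ + i++;
// has two unsequenced side effects on the same scalar object, which is
// undefined behavior, so it is deliberately left out of the compiled code.
```

Calling sum_of_calls() always returns 3; only the content of call_log may differ between compilers, just like the first line of output in the question.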
Pre C++11
Pre C++11, the order of evaluation of sub-expressions is also unspecified, but the standard is worded in terms of sequence points rather than sequenced-before relations. In this case, there is a sequence point at function entry and function exit, which ensures there is no undefined behavior. From section 1.9:
[...]The sequence points at function-entry and function-exit (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be.
Nailing down order of evaluation
The different choices made by each compiler may seem unintuitive depending on your perspective and expectations. Nailing down the order of evaluation is the subject of EWG issue 158: N4228 Refining Expression Evaluation Order for Idiomatic C++, which is being considered for C++17 but seems controversial based on the reactions to a poll on the subject. The paper covers a much more complicated case from "The C++ Programming Language" 4th edition, which shows that even those with deep experience in C++ can get tripped up.
Answer 2:
Well, no, it's not a case of undefined behaviour. It is a case of unspecified behaviour.
It is unspecified whether the expression len_ = len will be evaluated before or after buffer(len+1). From the output you have described, g++ evaluates buffer(len+1) first, and clang evaluates len_ = len first.
Both possibilities are correct, since the order of evaluation of those two sub-expressions is unspecified. Both expressions will be evaluated (so the behaviour does not qualify as being undefined) but the standard does not specify the order.
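If the code needs one particular order, a simple way to get it is to split the expression into separate statements, since each full-expression is completed before the next one starts. The following is only a sketch of that idea: len_seen and terminate_in_order are hypothetical names, and buffer() records the value it observed instead of printing it.

```cpp
int len_ = -1;     // global from the question
int len_seen = 0;  // hypothetical: remembers what buffer() observed

char *buffer(int size_)
{
    len_seen = len_;          // stands in for the cout line in the question
    return new char[size_];
}

// Performs the same work as `buffer(len+1)[len_ = len] = '\0';` but with the
// order pinned down. Returns the value of len_ that buffer() saw.
int terminate_in_order(int len)
{
    len_ = -1;
    char *buf = buffer(len + 1);  // buffer() has finished after this statement
    len_ = len;                   // the assignment happens strictly afterwards
    buf[len_] = '\0';
    delete[] buf;                 // release the allocation once done
    return len_seen;
}
```

With this version buffer() sees -1 on every conforming compiler, which matches the g++ output from the question. Swapping the first two statements in the function body would reproduce the clang and Visual Studio result instead.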
Source: https://stackoverflow.com/questions/29513572/sequence-point-ambiguity-undefined-behavior