What are the implications of the voted in C++17 evaluation order guarantees (P0145) on typical C++ code?
What does it change about things like the following?
Some common cases where the evaluation order was previously unspecified are now specified and valid in C++17. Some undefined behaviour is now unspecified instead.
i = 1; f(i++, i)
was undefined, but it is now unspecified. Specifically, what is not specified is the order in which each argument to f is evaluated relative to the others. i++ might be evaluated before i, or vice versa. Indeed, a second call might be evaluated in a different order, even under the same compiler.
However, the evaluation of each argument is required to execute completely, with all side effects, before the evaluation of any other argument begins. So you might get f(1, 1) (second argument evaluated first) or f(1, 2) (first argument evaluated first). But you will never get f(2, 2) or anything else of that nature.
std::cout << f() << f() << f();
was unspecified, but the operands of << are now evaluated left to right, so the first call to f is evaluated first and its result comes first in the stream (examples below).
f(g(), h(), j());
still has unspecified evaluation order of g, h, and j. Note that for getf()(g(),h(),j()), the rules state that getf() will be evaluated before g, h, j.
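The two argument rules above can be sketched as follows, assuming a C++17 compiler. The names capture, order_log and sum3, and the stand-in definitions of getf, g, h and j, are hypothetical and exist only for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns its two arguments so the caller can inspect them.
std::pair<int, int> capture(int first, int second) { return {first, second}; }

// In C++17 this yields {1, 1} or {1, 2}, never {2, 2}.
// Before C++17 the call has undefined behaviour.
std::pair<int, int> call_with_increment() {
    int i = 1;
    return capture(i++, i);
}

// Records the order in which the stand-in functions below run.
std::vector<std::string> order_log;

int g() { order_log.push_back("g"); return 1; }
int h() { order_log.push_back("h"); return 2; }
int j() { order_log.push_back("j"); return 3; }
int sum3(int a, int b, int c) { return a + b + c; }

using Fn = int (*)(int, int, int);
Fn getf() { order_log.push_back("getf"); return sum3; }

// In C++17 "getf" is always logged first; g, h and j follow in an unspecified order.
std::vector<std::string> call_order() {
    order_log.clear();
    getf()(g(), h(), j());
    return order_log;
}
```

A compiler may still warn about capture(i++, i). Code that needs one particular result should compute the arguments in separate statements first.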
Also note the following example from the proposal text:
std::string s = "but I have heard it works even if you don't believe in it";
s.replace(0, 4, "").replace(s.find("even"), 4, "only")
    .replace(s.find(" don't"), 6, "");
The example comes from The C++ Programming Language, 4th edition, Stroustrup, and used to be unspecified behaviour, but with C++17 it will work as expected. There were similar issues with resumable functions (.then( . . . )).
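As a rough check of what "work as expected" means here, the sketch below compares the chained form with the same edits written as separate statements. The function names chained and stepwise are hypothetical. The stepwise form gives the same result under every standard, while the chained form is only guaranteed to match it from C++17 on.

```cpp
#include <string>

// The chained form from the proposal. Only C++17 guarantees that each
// s.find(...) runs after the replace calls to its left have finished.
std::string chained() {
    std::string s = "but I have heard it works even if you don't believe in it";
    s.replace(0, 4, "").replace(s.find("even"), 4, "only")
        .replace(s.find(" don't"), 6, "");
    return s;
}

// The same edits as separate statements: deterministic under every standard.
std::string stepwise() {
    std::string s = "but I have heard it works even if you don't believe in it";
    s.replace(0, 4, "");
    s.replace(s.find("even"), 4, "only");
    s.replace(s.find(" don't"), 6, "");
    return s;
}
```

stepwise() returns "I have heard it works only if you believe in it". Before C++17, chained() may return something else, because the find calls may be evaluated against the unmodified string.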
As another example, consider the following:
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Speaker {
    std::size_t i = 0;  // index of the next word to speak
    Speaker(std::vector<std::string> words) : words(words) {}
    std::vector<std::string> words;

    // Returns the next word followed by "," or, for the last word, "\n".
    std::string operator()() {
        assert(words.size() > 0);
        if (i == words.size()) i = 0;
        // Pre-C++17 version:
        auto word = words[i] + (i + 1 == words.size() ? "\n" : ",");
        ++i;
        return word;
        // Still not possible with C++17:
        // return words[i++] + (i == words.size() ? "\n" : ",");
    }
};

int main() {
    auto spk = Speaker{{"All", "Work", "and", "no", "play"}};
    std::cout << spk() << spk() << spk() << spk() << spk();
}
With C++14 and before we may (and will) get results such as
play
no,and,Work,All,
instead of
All,Work,and,no,play
Note that the above is in effect the same as
(((((std::cout << spk()) << spk()) << spk()) << spk()) << spk()) ;
But still, before C++17 there was no guarantee that the first calls would come first into the stream.
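A smaller sketch of the same effect, assuming a C++17 compiler. The name chained_output is hypothetical, and a counter stands in for the Speaker object.

```cpp
#include <sstream>
#include <string>

// Streams three calls to a counter in a single chained expression.
// In C++17 the left operand of each << is evaluated before the right one,
// so the calls run left to right. Before C++17 the order was unspecified.
std::string chained_output() {
    int n = 0;
    auto next = [&n] { return ++n; };
    std::ostringstream os;
    os << next() << next() << next();
    return os.str();
}
```

Under C++17, chained_output() returns "123". Under earlier standards, any arrangement of the three digits was allowed.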
References: From the accepted proposal:
Postfix expressions are evaluated from left to right. This includes functions calls and member selection expressions.
Assignment expressions are evaluated from right to left. This includes compound assignments.
Operands to shift operators are evaluated from left to right.
In summary, the following expressions are evaluated in the order a, then b, then c, then d:
- a.b
- a->b
- a->*b
- a(b1, b2, b3)
- b @= a
- a[b]
- a << b
- a >> b
Furthermore, we suggest the following additional rule: the order of evaluation of an expression involving an overloaded operator is determined by the order associated with the corresponding built-in operator, not the rules for function calls.
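The right-to-left rule for assignment (b @= a, with a evaluated first) can be seen in a well-known map idiom. This is only a sketch, the function name is hypothetical, and it assumes a C++17 compiler.

```cpp
#include <cstddef>
#include <map>

// Assigns the map's size to a new element.
// In C++17 the right operand m.size() is evaluated before m[0] inserts
// the element, so the stored value is 0. Before C++17 it could be 0 or 1.
std::size_t assign_size_to_new_element() {
    std::map<int, std::size_t> m;
    m[0] = m.size();
    return m[0];
}
```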
Edit note: My original answer misinterpreted a(b1, b2, b3). The order of b1, b2, b3 is still unspecified. (Thank you @KABoissonneault and all commenters.)
However (as @Yakk points out), and this is important: even when b1, b2, b3 are non-trivial expressions, each of them is completely evaluated and tied to the respective function parameter before evaluation of the others starts. The standard states it like this:
§5.2.2 - Function call 5.2.2.4:
. . . The postfix-expression is sequenced before each expression in the expression-list and any default argument. Every value computation and side effect associated with the initialization of a parameter, and the initialization itself, is sequenced before every value computation and side effect associated with the initialization of any subsequent parameter.
However, one of these new sentences is missing from the GitHub draft:
Every value computation and side effect associated with the initialization of a parameter, and the initialization itself, is sequenced before every value computation and side effect associated with the initialization of any subsequent parameter.
The example is there. It solves a decades-old problem (as explained by Herb Sutter) with exception safety where things like
f(std::unique_ptr<A> a, std::unique_ptr<B> b);
f(get_raw_a(), get_raw_a());
would leak if one of the calls to get_raw_a() threw before the other raw pointer was tied to its smart pointer parameter.
As pointed out by T.C., the example is flawed since unique_ptr construction from a raw pointer is explicit, preventing this from compiling.
Also note this classical question (tagged C, not C++):
int x=0; x++ + ++x;
is still undefined.
In C++14, the following was unsafe:
void foo(std::unique_ptr<A>, std::unique_ptr<B>);
foo(std::unique_ptr<A>(new A), std::unique_ptr<B>(new B));
There are four operations that happen here during the function call:
1. new A
2. unique_ptr<A> constructor
3. new B
4. unique_ptr<B> constructor
The ordering of these was completely unspecified, and so a perfectly valid ordering is (1), (3), (2), (4). If this ordering was selected and (3) throws, then the memory from (1) leaks - we haven't run (2) yet, which would've prevented the leak.
In C++17, the new rules prohibit interleaving. From [intro.execution]:
For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.
There is a footnote to that sentence which reads:
In other words, function executions do not interleave with each other.
This leaves us with two valid orderings: (1), (2), (3), (4) or (3), (4), (1), (2). It is unspecified which ordering is taken, but both of these are safe. All the orderings where (1) and (3) both happen before (2) and (4) are now prohibited.
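The no-leak guarantee can be checked with a small sketch, assuming a C++17 compiler. The definitions of A and B below are hypothetical stand-ins: A counts its live instances and B always throws from its constructor. The function names are also made up for illustration.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for A: counts how many instances are alive.
struct A {
    static int live;
    A() { ++live; }
    ~A() { --live; }
};
int A::live = 0;

// Hypothetical stand-in for B: construction always fails.
struct B {
    B() { throw std::runtime_error("B failed"); }
};

void foo(std::unique_ptr<A>, std::unique_ptr<B>) {}

// Returns the number of A objects still alive after the failed call.
// In C++17 this is 0 for both permitted orderings.
int leaked_with_new() {
    try {
        foo(std::unique_ptr<A>(new A), std::unique_ptr<B>(new B));
    } catch (const std::runtime_error&) {
    }
    return A::live;
}

// std::make_unique (C++14) wraps allocation and ownership in one function
// call, so this form does not leak under C++14 either.
int leaked_with_make_unique() {
    try {
        foo(std::make_unique<A>(), std::make_unique<B>());
    } catch (const std::runtime_error&) {
    }
    return A::live;
}
```

Both functions return 0 under C++17. Under C++14 only the make_unique form is guaranteed to, which is why it was the usual recommendation before the rules changed.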
I've found some notes about expression evaluation order:
Some order-of-evaluation guarantees surrounding overloaded operators and complete-argument rules were added in C++17. But which argument goes first remains unspecified. In C++17, it is now specified that the expression giving what to call (the code on the left of the ( of the function call) is evaluated before the arguments, that whichever argument is evaluated first is evaluated fully before the next one is started, and that, for an object method, the value of the object is evaluated before the arguments to the method.
21) Every expression in a comma-separated list of expressions in a parenthesized initializer is evaluated as if for a function call (indeterminately-sequenced)
The C++ language does not guarantee the order in which arguments to a function call are evaluated.
In P0145R3, "Refining Expression Evaluation Order for Idiomatic C++", I've found:
The value computation and associated side-effect of the postfix-expression are sequenced before those of the expressions in the expression-list. The initializations of the declared parameters are indeterminately sequenced with no interleaving.
But I didn't find it in the standard. Instead, I found the following in the standard:
6.8.1.8 Sequential execution [intro.execution] An expression X is said to be sequenced before an expression Y if every value computation and every side effect associated with the expression X is sequenced before every value computation and every side effect associated with the expression Y.
6.8.1.9 Sequential execution [intro.execution] Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
7.6.19.1 Comma operator [expr.comma] A pair of expressions separated by a comma is evaluated left-to-right;...
So I compared the corresponding behavior of three compilers under the C++14 and C++17 standards. The code explored is:
#include <iostream>

struct A
{
    A& addInt(int i)
    {
        std::cout << "add int: " << i << "\n";
        return *this;
    }
    A& addFloat(float i)
    {
        std::cout << "add float: " << i << "\n";
        return *this;
    }
};

int computeInt()
{
    std::cout << "compute int\n";
    return 0;
}

float computeFloat()
{
    std::cout << "compute float\n";
    return 1.0f;
}

void compute(float, int)
{
    std::cout << "compute\n";
}

int main()
{
    A a;
    a.addFloat(computeFloat()).addInt(computeInt());
    std::cout << "Function call:\n";
    compute(computeFloat(), computeInt());
}
Results (clang is the most consistent):
<style type="text/css">
.tg {
border-collapse: collapse;
border-spacing: 0;
border-color: #aaa;
}
.tg td {
font-family: Arial, sans-serif;
font-size: 14px;
padding: 10px 5px;
border-style: solid;
border-width: 1px;
overflow: hidden;
word-break: normal;
border-color: #aaa;
color: #333;
background-color: #fff;
}
.tg th {
font-family: Arial, sans-serif;
font-size: 14px;
font-weight: normal;
padding: 10px 5px;
border-style: solid;
border-width: 1px;
overflow: hidden;
word-break: normal;
border-color: #aaa;
color: #fff;
background-color: #f38630;
}
.tg .tg-0pky {
border-color: inherit;
text-align: left;
vertical-align: top
}
.tg .tg-fymr {
font-weight: bold;
border-color: inherit;
text-align: left;
vertical-align: top
}
</style>
<table class="tg">
<tr>
<th class="tg-0pky"></th>
<th class="tg-fymr">C++14</th>
<th class="tg-fymr">C++17</th>
</tr>
<tr>
<td class="tg-fymr"><br>gcc 9.0.1<br></td>
<td class="tg-0pky">compute float<br>add float: 1<br>compute int<br>add int: 0<br>Function call:<br>compute int<br>compute float<br>compute</td>
<td class="tg-0pky">compute float<br>add float: 1<br>compute int<br>add int: 0<br>Function call:<br>compute int<br>compute float<br>compute</td>
</tr>
<tr>
<td class="tg-fymr">clang 9</td>
<td class="tg-0pky">compute float<br>add float: 1<br>compute int<br>add int: 0<br>Function call:<br>compute float<br>compute int<br>compute</td>
<td class="tg-0pky">compute float<br>add float: 1<br>compute int<br>add int: 0<br>Function call:<br>compute float<br>compute int<br>compute</td>
</tr>
<tr>
<td class="tg-fymr">msvs 2017</td>
<td class="tg-0pky">compute int<br>compute float<br>add float: 1<br>add int: 0<br>Function call:<br>compute int<br>compute float<br>compute</td>
<td class="tg-0pky">compute float<br>add float: 1<br>compute int<br>add int: 0<br>Function call:<br>compute int<br>compute float<br>compute</td>
</tr>
</table>