There\'s been some debate going on in this question about whether the following code is legal C++:
std::list- ::iterator i = items.begin();
while
To build on Bill the Lizard's answer:
int i = 1;
foo(i++, i++);
might also result in a function call of
foo(1, 1);
(meaning that the actuals are evaluated in parallel, and then the postops are applied).
-- MarkusQ
To build on MarkusQ's answer: ;)
Or rather, Bill's comment to it:
(Edit: Aw, the comment is gone again... Oh well)
They're allowed to be evaluated in parallel. Whether or not it happens in practice is technically speaking irrelevant.
You don't need thread parallelism for this to occur though, just evaluate the first step of both (take the value of i) before the second (increment i). Perfectly legal, and some compilers may consider it more efficient than fully evaluating one i++ before starting on the second.
In fact, I'd expect it to be a common optimization. Look at it from an instruction scheduling point of view. You have the following you need to evaluate:
But there's really no dependency between the left and the right argument. Argument evaluation happens in an unspecified order, and need not be done sequentially either (which is why new() in function arguments is usually a memory leak, even when wrapped in a smart pointer) It's also undefined what happens when you modify the same variable twice in the same expression. We do have a dependency between 1 and 2, however, and between 3 and 4. So why would the compiler wait for 2 to complete before computing 3? That introduces added latency, and it'll take even longer than necessary before 4 becomes available. Assuming there's a 1 cycle latency between each, it'll take 3 cycles from 1 is complete until the result of 4 is ready and we can call the function.
But if we reorder them and evaluate in the order 1, 3, 2, 4, we can do it in 2 cycles. 1 and 3 can be started in the same cycle (or even merged into one instruction, since it's the same expression), and in the following, 2 and 4 can be evaluated. All modern CPU's can execute 3-4 instructions per cycle, and a good compiler should try to exploit that.
To build on Kristo's answer,
foo(i++, i++);
yields undefined behavior because the order that function arguments are evaluated is undefined (and in the more general case because if you read a variable twice in an expression where you also write it, the result is undefined). You don't know which argument will be incremented first.
int i = 1;
foo(i++, i++);
might result in a function call of
foo(2, 1);
or
foo(1, 2);
or even
foo(1, 1);
Run the following to see what happens on your platform:
#include <iostream>
using namespace std;
void foo(int a, int b)
{
cout << "a: " << a << endl;
cout << "b: " << b << endl;
}
int main()
{
int i = 1;
foo(i++, i++);
}
On my machine I get
$ ./a.out
a: 2
b: 1
every time, but this code is not portable, so I would expect to see different results with different compilers.
Quoth the C++ standard 1.9.16:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. (Note: Value computations and side effects associated with the different argument expressions are unsequenced.)
So it would seem to me that this code:
foo(i++);
is perfectly legal. It will increment i
and then call foo
with the previous value of i
. However, this code:
foo(i++, i++);
yields undefined behavior because paragraph 1.9.16 also says:
If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
++Kristo!
The C++ standard 1.9.16 makes a lot of sense with respect to how one implements operator++(postfix) for a class. When that operator++(int) method is called, it increments itself and returns a copy of the original value. Exactly as the C++ spec says.
It's nice to see standards improving!
However, I distinctly remember using older (pre-ANSI) C compilers wherein:
foo -> bar(i++) -> charlie(i++);
Did not do what you think! Instead it compiled equivalent to:
foo -> bar(i) -> charlie(i); ++i; ++i;
And this behavior was compiler-implementation dependent. (Making porting fun.)
It's easy enough to test and verify that modern compilers now behave correctly:
#define SHOW(S,X) cout << S << ": " # X " = " << (X) << endl
struct Foo
{
Foo & bar(const char * theString, int theI)
{ SHOW(theString, theI); return *this; }
};
int
main()
{
Foo f;
int i = 0;
f . bar("A",i) . bar("B",i++) . bar("C",i) . bar("D",i);
SHOW("END ",i);
}
...And building on pretty much EVERYONE's answers... (Thanks guys!)
I think we need spell this out a bit better:
Given:
baz(g(),h());
Then we don't know whether g() will be invoked before or after h(). It is "unspecified".
But we do know that both g() and h() will be invoked before baz().
Given:
bar(i++,i++);
Again, we don't know which i++ will be evaluated first, and perhaps not even whether i will be incremented once or twice before bar() is called. The results are undefined! (Given i=0, this could be bar(0,0) or bar(1,0) or bar(0,1) or something really weird!)
Given:
foo(i++);
We now know that i will be incremented before foo() is invoked. As Kristo pointed out from the C++ standard section 1.9.16:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. -- end note ]
Though I think section 5.2.6 says it better:
The value of a postfix ++ expression is the value of its operand. [ Note: the value obtained is a copy of the original value -- end note ] The operand shall be a modifiable lvalue. The type of the operand shall be an arithmetic type or a pointer to a complete effective object type. The value of the operand object is modified by adding 1 to it, unless the object is of type bool, in which case it is set to true. [ Note: this use is deprecated, see Annex D. -- end note ] The value computation of the ++ expression is sequenced before the modification of the operand object. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single postfix ++ operator. -- end note ] The result is an rvalue. The type of the result is the cv-unqualified version of the type of the operand. See also 5.7 and 5.17.
The standard, in section 1.9.16, also lists (as part of its examples):
i = 7, i++, i++; // i becomes 9 (valid)
f(i = -1, i = -1); // the behavior is undefined
And we can trivially demonstrate this with:
#define SHOW(X) cout << # X " = " << (X) << endl
int i = 0; /* Yes, it's global! */
void foo(int theI) { SHOW(theI); SHOW(i); }
int main() { foo(i++); }
So, yes, i is incremented before foo() is invoked.
All this makes a lot of sense from the perspective of:
class Foo
{
public:
Foo operator++(int) {...} /* Postfix variant */
}
int main() { Foo f; delta( f++ ); }
Here Foo::operator++(int) must be invoked prior to delta(). And the increment operation must be completed during that invocation.
In my (perhaps overly complex) example:
f . bar("A",i) . bar("B",i++) . bar("C",i) . bar("D",i);
f.bar("A",i) must be executed to obtain the object used for object.bar("B",i++), and so on for "C" and "D".
So we know that i++ increments i prior to calling bar("B",i++) (even though bar("B",...) is invoked with the old value of i), and therefore i is incremented prior to bar("C",i) and bar("D",i).
Getting back to j_random_hacker's comment:
j_random_hacker writes:
+1, but I had to read the standard carefully to convince myself that this was OK. Am I right in thinking that, if bar() was instead a global function returning say int, f was an int, and those invocations were connected by say "^" instead of ".", then any of A, C and D could report "0"?
This question is a lot more complicated than you might think...
Rewriting your question as code...
int bar(const char * theString, int theI) { SHOW(...); return i; }
bar("A",i) ^ bar("B",i++) ^ bar("C",i) ^ bar("D",i);
Now we have only ONE expression. According to the standard (section 1.9, page 8, pdf page 20):
Note: operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.(7) For example, in the following fragment: a=a+32760+b+5; the expression statement behaves exactly the same as: a=(((a+32760)+b)+5); due to the associativity and precedence of these operators. Thus, the result of the sum (a+32760) is next added to b, and that result is then added to 5 which results in the value assigned to a. On a machine in which overflows produce an exception and in which the range of values representable by an int is [-32768,+32767], the implementation cannot rewrite this expression as a=((a+b)+32765); since if the values for a and b were, respectively, -32754 and -15, the sum a+b would produce an exception while the original expression would not; nor can the expression be rewritten either as a=((a+32765)+b); or a=(a+(b+32765)); since the values for a and b might have been, respectively, 4 and -8 or -17 and 12. However on a machine in which overflows do not produce an exception and in which the results of overflows are reversible, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur. -- end note ]
So we might think that, due to precedence, that our expression would be the same as:
(
(
( bar("A",i) ^ bar("B",i++)
)
^ bar("C",i)
)
^ bar("D",i)
);
But, because (a^b)^c==a^(b^c) without any possible overflow situations, it could be rewritten in any order...
But, because bar() is being invoked, and could hypothetically involve side effects, this expression cannot be rewritten in just any order. Rules of precedence still apply.
Which nicely determines the order of evaluation of the bar()'s.
Now, when does that i+=1 occur? Well it still has to occur before bar("B",...) is invoked. (Even though bar("B",....) is invoked with the old value.)
So it's deterministically occurring before bar(C) and bar(D), and after bar(A).
Answer: NO. We will always have "A=0, B=0, C=1, D=1", if the compiler is standards-compliant.
But consider another problem:
i = 0;
int & j = i;
R = i ^ i++ ^ j;
What is the value of R?
If the i+=1 occurred before j, we'd have 0^0^1=1. But if the i+=1 occurred after the whole expression, we'd have 0^0^0=0.
In fact, R is zero. The i+=1 does not occur until after the expression has been evaluated.
Which I reckon is why:
i = 7, i++, i++; // i becomes 9 (valid)
Is legal... It has three expressions:
And in each case, the value of i is changed at the conclusion of each expression. (Before any subsequent expressions are evaluated.)
PS: Consider:
int foo(int theI) { SHOW(theI); SHOW(i); return theI; }
i = 0;
int & j = i;
R = i ^ i++ ^ foo(j);
In this case, i+=1 has to be evaluated before foo(j). theI is 1. And R is 0^0^1=1.
Sutter's Guru of the Week #55 (and the corresponding piece in "More Exceptional C++") discusses this exact case as an example.
According to him, it is perfectly valid code, and in fact a case where trying to transform the statement into two lines:
items.erase(i); i++;
does not produce code that is semantically equivalent to the original statement.