Is it legal for a C++ optimizer to reorder calls to clock()?


The C++ Programming Language 4th edition, page 225 reads: A compiler may reorder code to improve performance as long as the result is identical to that of the s

7 answers

滥情空心

2020-12-02 11:28

If veryLongComputation() internally performs any opaque function call, then no, because the compiler cannot guarantee that its side effects would be interchangeable with those of clock().

Otherwise, yes, it is interchangeable.
This is the price you pay for using a language in which time isn't a first-class entity.

Note that memory allocation (such as new) can fall into this category, because the allocation function can be defined in a different translation unit and may not be compiled until after the current translation unit. So, if you merely allocate memory, the compiler is forced to treat the allocation and deallocation as worst-case barriers for everything (clock(), memory barriers, and everything else) unless it already has the code for the memory allocator and can prove that this is not necessary. I don't think any compiler actually looks at the allocator code to try to prove this, so in practice these kinds of function calls serve as barriers.
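For reference, the pattern under discussion presumably looks like the sketch below. The question's actual code is not shown in this thread, so the body of veryLongComputation() and the wrapper timeComputation() are hypothetical stand-ins. Here the computation is pure arithmetic and fully visible to the optimizer, with no opaque calls and no allocation, which is the case where this answer says the compiler may move it relative to the clock() calls.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical stand-in for the question's veryLongComputation();
// the real body is not shown in the thread.
std::uint64_t veryLongComputation() {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < 20000000ULL; ++i) {
        sum += i * i;  // pure arithmetic: no I/O, no opaque calls, no allocation
    }
    return sum;  // unsigned arithmetic wraps, so there is no undefined behavior
}

// Hypothetical wrapper that times the computation the way the question does.
// Returns the elapsed processor time in clock ticks.
std::clock_t timeComputation(std::uint64_t& result) {
    std::clock_t t0 = std::clock();
    result = veryLongComputation();  // body is visible to the optimizer
    std::clock_t t1 = std::clock();
    return t1 - t0;
}
```

Dividing the returned tick count by CLOCKS_PER_SEC gives seconds. If an optimizer does hoist or sink the computation, the tick count can come out close to zero even though the result is still correct.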


执念已碎

2020-12-02 11:30

Let's suppose that the sequence is in a loop, and veryLongComputation() randomly throws an exception. Then how many t0s and t1s will be calculated? Does the compiler pre-calculate the random variables and reorder based on the precalculation, sometimes reordering and sometimes not?
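A minimal sketch of that loop scenario, under stated assumptions: the Counts struct, runLoop(), and the int parameter of veryLongComputation() are hypothetical, and the function throws on every third call rather than randomly so that the outcome is reproducible. On the abstract machine, every iteration that throws reads the clock for t0 but never reaches the read for t1.

```cpp
#include <ctime>
#include <stdexcept>

// Hypothetical bookkeeping: how many times each clock reading was taken.
struct Counts {
    int t0Taken;
    int t1Taken;
};

// Hypothetical stand-in: throws on every third call (instead of randomly)
// so that the result is deterministic.
int veryLongComputation(int call) {
    if (call % 3 == 2) {
        throw std::runtime_error("computation failed");
    }
    return call * 2;
}

Counts runLoop(int iterations) {
    Counts c{0, 0};
    for (int i = 0; i < iterations; ++i) {
        try {
            std::clock_t t0 = std::clock();
            ++c.t0Taken;
            int r = veryLongComputation(i);
            std::clock_t t1 = std::clock();  // never reached when the call throws
            ++c.t1Taken;
            (void)r;
            (void)t0;
            (void)t1;
        } catch (const std::runtime_error&) {
            // t0 was read for this iteration, t1 was not
        }
    }
    return c;
}
```

With nine iterations, the calls for i = 2, 5, and 8 throw, so runLoop(9) counts nine t0 readings and six t1 readings. Any reordering would have to preserve whatever part of this the standard treats as observable.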

Is the compiler smart enough to know that what looks like just a memory read is actually a read from shared memory? Suppose the read is a measure of how far the control rods have moved in a nuclear reactor, and the clock calls are used to control the speed at which they are moved.

Or maybe the timing is controlling the grinding of a Hubble telescope mirror. LOL

Moving clock calls around seems too dangerous to leave to the decisions of compiler writers. So if it is legal, perhaps the standard is flawed.

IMO.


没有蜡笔的小新

2020-12-02 11:32

The compiler cannot exchange the two clock calls. t1 must be set after t0. Both calls are observable side effects. The compiler may reorder anything between those observable effects, and even over an observable side effect, as long as the observations are consistent with possible observations of an abstract machine.

Since the C++ abstract machine is not formally restricted to finite speeds, it could execute veryLongComputation() in zero time. Execution time itself is not defined as an observable effect. Real implementations may match that.

Mind you, a lot of this answer depends on the C++ standard not imposing restrictions on compilers.


陌清茗

2020-12-02 11:33

Yes, it is legal - if the compiler can see the entirety of the code that occurs between the clock() calls.


天命终不由人

2020-12-02 11:45

At least by my reading, no, this is not allowed. The requirement from the standard is (§1.9/14):

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

The degree to which the compiler is free to reorder beyond that is defined by the "as-if" rule (§1.9/1):

This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.

That leaves the question of whether the behavior in question (the output written by cout) is officially observable behavior. The short answer is that yes, it is (§1.9/8):

The least requirements on a conforming implementation are:
[...]
— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

At least as I read it, that means the calls to clock could be rearranged compared to the execution of your long computation if and only if it still produced identical output to executing the calls in order.

If, however, you wanted to take extra steps to ensure correct behavior, you could take advantage of one other provision (also §1.9/8):

— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

To take advantage of this, you'd modify your code slightly to become something like:

    auto volatile t0 = clock();
    auto volatile r = veryLongComputation();
    auto volatile t1 = clock();

Now, instead of having to base the conclusion on three separate sections of the standard, and still having only a fairly certain answer, we can look at exactly one sentence and have an absolutely certain answer: with this code, reordering the uses of clock relative to the long computation is clearly prohibited.
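As a compilable sketch of the volatile variant above: the body of veryLongComputation() and the wrapper timeWithVolatile() are hypothetical stand-ins, since the original function is not shown. The sketch only illustrates the shape of the code. How strongly the volatile stores constrain the calls themselves is this answer's reading of the standard, not something the example demonstrates.

```cpp
#include <ctime>

// Hypothetical stand-in for the question's veryLongComputation().
unsigned long veryLongComputation() {
    unsigned long sum = 0;
    for (unsigned long i = 0; i < 20000000UL; ++i) {
        sum += i * i;  // unsigned arithmetic wraps, so no undefined behavior
    }
    return sum;
}

// Hypothetical wrapper: every intermediate value is stored in a volatile
// object, so the three stores must occur in program order.
std::clock_t timeWithVolatile(unsigned long& result) {
    auto volatile t0 = std::clock();
    auto volatile r = veryLongComputation();
    auto volatile t1 = std::clock();
    result = r;      // volatile read of r
    return t1 - t0;  // volatile reads of t1 and t0
}
```

The returned value is in clock ticks; divide by CLOCKS_PER_SEC to get seconds.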


一整个雨季

2020-12-02 11:49

It is certainly not allowed, since it changes, as you have noted, the observable behavior (different output) of the program. (I won't go into the hypothetical case that veryLongComputation() might not consume any measurable time; given the function's name, that is presumably not the case. But even if it were, it wouldn't really matter.) You wouldn't expect it to be allowable to reorder fopen and fwrite, would you?

Both t0 and t1 are used in outputting t1-t0. Therefore, the initializer expressions for both t0 and t1 must be executed, and doing so must follow all standard rules. The result of the function is used, so it is not possible to optimize out the function call, though it doesn't directly depend on t1 or vice versa, so one might naively be inclined to think that it's legal to move it around, why not. Maybe after the initialization of t1, which doesn't depend on the calculation?
Indirectly, however, the result of t1 does of course depend on side effects of veryLongComputation() (notably the computation taking time, if nothing else), which is exactly one of the reasons that there exists such a thing as a "sequence point".

There are three "end of expression" sequence points (plus three "end of function" and "end of initializer" SPs), and at every sequence point it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.
There is no way you can keep this promise if you move the three statements around, since the possible side effects of all functions called are not known. The compiler is only allowed to optimize if it can guarantee that it will keep the promise. It can't, since the library functions are opaque: their code isn't available (nor is the code within veryLongComputation necessarily known in that translation unit).

Compilers do however sometimes have "special knowledge" about library functions, such as some functions will not return or may return twice (think exit or setjmp).
However, since every non-empty, non-trivial function (and veryLongComputation is quite non-trivial from its name) will consume time, a compiler having "special knowledge" about the otherwise opaque clock library function would in fact have to be explicitly disallowed from reordering calls around this one, knowing that doing so not only may, but will affect the results.

Now the interesting question is: why does the compiler do this anyway? I can think of two possibilities. Maybe your code triggers a "looks like benchmark" heuristic and the compiler is trying to cheat, who knows. It wouldn't be the first time (think SPEC2000/179.art or SunSpider, for two historic examples). The other possibility would be that somewhere inside veryLongComputation(), you inadvertently invoke undefined behavior. In that case, the compiler's behavior would even be legal.

