When I compile and run this code with Clang (-O3
) or MSVC (/O2
)...
#include
#include
static int con
++a[j]; // This is undefined behavior too, but Clang doesn't see it
Are you saying this is undefined behavior because the array elements are uninitialized?
If so, although this is a common interpretation of clause 4.1/1 in the standard I believe it is incorrect. The elements are 'uninitialized' in the sense that programmers usually use this term, but I do not believe this corresponds exactly to the C++ specification's use of the term.
In particular C++11 8.5/11 states that these objects are in fact default initialized, and this seems to me to be mutually exclusive with being uninitialized. The standard also states that for some objects being default initialized means that 'no initialized is performed'. Some might assume this means that they are uninitialized but this is not specified and I simply take it to mean that no such performance is required.
The spec does make clear that the array elements will have indeterminant values. C++ specifies, by reference to the C standard, that indeterminant values can be either valid representations, legal to access normally, or trap representations. If the particular indeterminant values of the array elements happen to all be valid representations, (and none are INT_MAX, avoiding overflow) then the above line does not trigger any undefined behavior in C++11.
Since these array elements could be trap representations it would be perfectly conformant for clang to act as though they are guaranteed to be trap representations, effectively choosing to make the code UB in order to create an optimization opportunity.
Even if clang doesn't do that it could still choose to optimize based on the dataflow. Clang does know how to do that, as demonstrated by the fact that if the inner loop is changed slightly then the loops do get removed.
So then why does the (optional) presence of UB seem to stymie optimization, when UB is usually taken as an opportunity for more optimization?
What may be going on is that clang has decided that users want int
trapping based on the hardware's behavior. And so rather than taking traps as an optimization opportunity, clang has to generate code which faithfully reproduces the program behavior in hardware. This means that the loops cannot be eliminated based on dataflow, because doing so might eliminate hardware traps.
C++14 updates the behavior such that accessing indeterminant values itself produces undefined behavior, independent of whether one considers the variable uninitialized or not: https://stackoverflow.com/a/23415662/365496