On Undefined Behavior

盖世英雄少女心 2021-01-02 04:14

Generally, UB is regarded as something to be avoided, and the current C standard itself lists quite a few examples in Annex J.

However, there are c

7 Answers
  • 2021-01-02 04:25

    No! Just because it compiles, runs and gives the output you hoped for does not make it correct.

  • 2021-01-02 04:37

    No, unless you're also keeping your compiler the same and your compiler documentation defines the otherwise undefined behavior.

    Undefined behavior means that your compiler may assume the offending situation never happens and transform your code accordingly, producing results that you would not expect.
    Sometimes this is for optimization, and sometimes it's because of architecture restrictions like this.


    I suggest you read this, which addresses your exact example. An excerpt:

    Signed integer overflow:

    If arithmetic on an int type (for example) overflows, the result is undefined. One example is that INT_MAX + 1 is not guaranteed to be INT_MIN. This behavior enables certain classes of optimizations that are important for some code.

    For example, knowing that INT_MAX + 1 is undefined allows optimizing X + 1 > X to true. Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing X * 2 / 2 to X. While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for <= loops like this:

    for (i = 0; i <= N; ++i) { ... }
    

    In this loop, the compiler can assume that the loop will iterate exactly N + 1 times if i is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses int as induction variables.
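
    As a rough sketch of the X + 1 > X case from the excerpt (the function names are made up for illustration): the first function depends on overflow and may legitimately be folded to a constant true by an optimizer, while the second asks the same question without ever performing an addition that can overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Depends on overflow: for x == INT_MAX the addition is undefined, so an
   optimizer is allowed to assume it never happens and always return true. */
bool next_is_larger_ub(int x)
{
    return x + 1 > x;
}

/* Same question, asked without performing the addition; defined for all x. */
bool next_is_larger(int x)
{
    return x < INT_MAX;
}
```

    Only next_is_larger gives a dependable answer for INT_MAX; what next_is_larger_ub returns there depends on the compiler and optimization level.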

  • 2021-01-02 04:43

    In general, it's better to avoid it completely. On the other hand, if your compiler's documentation explicitly states that a specific construct that is UB under the standard is defined for that compiler, you may exploit it, ideally adding some #ifdef/#error machinery that blocks compilation if another compiler is used.
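
    A minimal sketch of such a guard, under some assumptions: the construct relied on here is reading a union member other than the one last written, which GCC documents as supported, while the standards treat it differently depending on language and revision (it is undefined in C++, for instance). The names float_bits and float_must_be_32_bits are hypothetical.

```c
#include <stdint.h>

/* Refuse to build on compilers for which the reliance below was not checked.
   Clang also defines __GNUC__, so it passes this guard as well. */
#if !defined(__GNUC__)
#error "float_bits() relies on union type punning as documented by GCC"
#endif

/* Compile-time check that float is 32 bits wide: the array size becomes
   negative, which is a compile error, if the sizes differ. */
typedef char float_must_be_32_bits[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

/* Returns the object representation of f by reading it back through the
   other member of a union. */
uint32_t float_bits(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

    The guard does not make the code portable; it only turns a silent misbehavior on an unchecked compiler into a build failure.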

  • 2021-01-02 04:45

    If a C (or other language) standard declares that some particular code will have Undefined Behavior in some situation, that means that a C compiler can generate code to do whatever it wants in that situation, while remaining compliant with that standard. Many particular language implementations have documented behaviors which go beyond what is required by the generic language standard. For example, Whizbang Compilers Inc. might explicitly specify that its particular implementation of memcpy will always copy individual bytes in address order. On such a compiler, code like:

      unsigned char z[256];
      z[0] = 0x53;
      z[1] = 0x4F;
      memcpy(z+2, z, 254);
    

    would have behavior which was defined by the Whizbang documentation, even though the behavior of such code is not specified by any non-vendor-specific C language specification. Such code would be compatible with compilers that comply with Whizbang's spec, but could be incompatible with other compilers which comply with various C standards but do not comply with Whizbang's specifications.
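
    The effect the Whizbang-specific memcpy call is after, repeating the first two bytes across the whole buffer, can also be sketched without any vendor guarantee by writing the ascending byte-by-byte copy explicitly. Calling memcpy on overlapping regions is what the standard leaves undefined; an explicit loop has no such restriction. The name fill_pattern is made up for this example.

```c
#include <stddef.h>

/* Fills z[0..n-1] with the two-byte pattern 0x53, 0x4F repeated, copying one
   byte at a time in ascending address order. Requires n >= 2. */
void fill_pattern(unsigned char *z, size_t n)
{
    z[0] = 0x53;
    z[1] = 0x4F;
    for (size_t i = 2; i < n; ++i) {
        z[i] = z[i - 2];   /* each byte copies the one two positions back */
    }
}
```

    Note that memmove would not be a drop-in replacement here: it handles overlap by behaving as if the source were copied to a temporary first, so it would not propagate the pattern.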

    There are many situations, especially with embedded systems, where programs will need to do some things which the C standards do not require compilers to allow. It is not possible to write such programs to be compatible with all standards-compliant compilers, since some standards-compliant compilers may not provide any way to do what needs to be done, and even those that do might require different syntax. Nonetheless, there is often considerable value in writing code that will be run correctly by any standards-compliant compiler.

  • 2021-01-02 04:47

    If you know for a fact that your code will only be targeting a specific architecture, compiler, and OS, and you know how the undefined behavior works (and that that won't change), then it's not inherently wrong to use it occasionally. In your example, I think I can tell what's going to happen as well.

    However, UB is rarely a preferred solution. If there's a cleaner way, use it. Relying on undefined behavior should never be strictly necessary, though it may be convenient in a few cases. Avoid depending on it, and if you ever do, comment the code to say so.

    And please, don't ever publish code that relies on undefined behavior, because it'll just end up blowing up in someone's face when they compile it on a system with a different implementation than the one that you relied on.

  • 2021-01-02 04:52

    No.

    Compilers take advantage of undefined behavior when optimizing code. A well-known example is the strict overflow semantics in GCC (search for strict-overflow here). For example, this loop

    for (int i = 1; i != 0; ++i)
      ...
    

    supposedly relies on your "machine-dependent" overflow behavior of a signed integer type. However, under the rules of strict overflow semantics, GCC can (and will) assume that incrementing an int variable can only make it larger, never smaller. This assumption lets GCC optimize out the arithmetic and generate an endless loop instead

    for (;;)
      ...
    

    since this is a perfectly valid manifestation of undefined behavior.
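
    If wraparound is what the loop genuinely needs, a sketch with an unsigned counter gets it without undefined behavior, because conversion to an unsigned type is defined to wrap. A small counter type is used here only so that the loop finishes quickly; count_until_wrap is an illustrative name.

```c
#include <limits.h>

/* Counts the iterations of a loop that ends when its counter wraps to 0.
   The counter is an unsigned char: incrementing UCHAR_MAX yields a value
   that is converted back to unsigned char, which is defined to give 0. */
unsigned count_until_wrap(void)
{
    unsigned iterations = 0;
    for (unsigned char i = 1; i != 0; ++i) {
        ++iterations;
    }
    return iterations;
}
```

    Because the wrap is defined, the compiler cannot turn this loop into an endless one; it must terminate after the counter has visited every nonzero value once.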

    Basically, there's no such thing as "machine-dependent behavior" in the C language. All behavior is determined by the implementation, and the implementation is the lowest level you can ever get to. The implementation isolates you from the raw machine, and isolates you completely. There's no way to break through that isolation and reach the actual raw machine unless the implementation explicitly permits it. Signed integer overflow is normally not one of the contexts where you are allowed to access the raw machine.
