Is left and right shifting negative integers defined behavior?

后端 未结 3 2005
故里飘歌
故里飘歌 2020-11-27 18:24

I know, right shifting a negative signed type depends on the implementation, but what if I perform a left shift? For example:

int i = -1;
i << 1;


        
相关标签:
3条回答
  • 2020-11-27 19:05

    When the C standards were codified, different platforms would do different things when left-shifting negative integers. On some of them, the behavior might trigger implementation-specific traps whose behavior could be outside a program's control, and which could include random code execution. Nonetheless, it was possible that programs written for such platforms might make use of such behavior (a program could e.g. specify that a user would have to do something to configure a system's trap handlers before running it, but the program could then exploit the behavior of the suitably-configured trap handlers).

    The authors of the C standard did not want to say that compilers for machines where left-shifting of negative numbers would trap must be modified to prevent such trapping (since programs might potentially be relying upon it), but if left-shifting a negative number is allowed to trigger a trap which could cause any arbitrary behavior (including random code execution) that means that left-shifting a negative number is allowed to do anything whatsoever. Hence Undefined Behavior.

    In practice, until about 5 years ago, 99+% of compilers written for a machine that used two's-complement math (meaning 99+% of machines made since 1990) would consistently yield the following behaviors for x<<y and x>>y, to the extent that code reliance upon such behavior was considered no more non-portable than code which assumed char was 8 bits. The C standard didn't mandate such behavior, but any compiler author wanting to be compatible with a wide base of existing code would follow it.

    • if y is a signed type, x << y and x >> y are evaluated as though y was cast to unsigned.
    • if x is type int, x<<y is equivalent to (int)((unsigned)x << y).
    • if x is type int and positive, x>>y equivalent to (unsigned)x >> y. If x is of type int and negative, x>>y is equivalent to `~(~((unsigned)x) >> y).
    • If x is of type long, similar rules apply, but with unsigned long rather than unsigned.
    • if x is an N-bit type and y is greater than N-1, then x >> y and x << y may arbitrarily yield zero, or may act as though the right-hand operand was y % N; they may require extra time proportional to y [note that on a 32-bit machine, if y is negative, that could potentially be a long time, though I only know of one machine which would in practice run more than 256 extra steps]. Compilers were not necessarily consistent in their choice, but would always return one of the indicated values with no other side-effects.

    Unfortunately for some reason I can't quite fathom, compiler writers have decided that rather than allowing programmers to indicate what assumptions compilers should use for dead-code removal, compilers should assume that it is impossible to execute any shift whose behavior isn't mandated by the C standard. Thus, given code like the following:

    uint32_t shiftleft(uint32_t v, uint8_t n)
    {
      if (n >= 32)
        v=0;
      return v<<n;
    }
    

    a compiler may determine that because the code would engage in Undefined Behavior when n is 32 or larger, the compiler may assume that the if will never return true, and may thus omit the code. Consequently, unless or until someone comes up with a standard for C which restores the classic behaviors and allows programmers to designate what assumptions merit dead code removal, such constructs cannot be recommended for any code that might be fed to a hyper-modern compiler.

    0 讨论(0)
  • 2020-11-27 19:14

    otherwise, the behavior is undefined.

    This includes the

    if E1 has a signed type and non-negative value

    0 讨论(0)
  • 2020-11-27 19:20

    You're not reading that sentence correctly. The standard defines it if: the left operand has a signed type and a non-negative value and the result is representable (and previously in the same paragraph defines it for unsigned types). In all other cases (notice the use of the semicolon in that sentence), i.e, if any of these conditions isn't verified, the behaviour is undefined.

    0 讨论(0)
提交回复
热议问题