Sign extension with bitwise shift operation

前端 未结 5 1274
小蘑菇
小蘑菇 2021-01-06 02:23

following this Q&A I tried to exam the answer so I wrote:

#include 

int main ()
{

        int t;int i;
        for (i=120;i<140;i++){         


        
相关标签:
5条回答
  • 2021-01-06 02:54

    The >> operator, right shift, is arithmetic right shift in most compilers, meaning divide-by-2.

    So, if, e.g. int i ==-4 (0xfffffffc), then i>>1 == -2 (0xfffffffe).

    Having said this, I would recommend you to check the assembly of your code.
    e.g. x86 has 2 separate instructions - shr & sar, denoting logical shift & arithmetic shift respectively.
    Generally, compilers use shr (logical shift) for unsigned variables & sar (arithmetic shift) for signed variables.


    Below is the C code & corresponding assembly, generated with gcc -S:

    a.c:

    int x=10;
    unsigned int y=10;
    
    int main(){
        unsigned int z=(x>>1)+(y>>1);
        return 0;
    }
    

    a.s:

        .file   "a.c"
    .globl x
        .data
        .align 4
        .type   x, @object
        .size   x, 4
    x:
        .long   10
    .globl y
        .align 4
        .type   y, @object
        .size   y, 4
    y:
        .long   10
        .text
    .globl main
        .type   main, @function
    main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $16, %esp
        movl    x, %eax
        sarl    %eax ; <~~~~~~~~~~~~~~~~ Arithmetic shift, for signed int
        movl    y, %edx
        shrl    %edx ; <~~~~~~~~~~~~~~~~ Logical shift, for unsigned int
        addl    %edx, %eax
        movl    %eax, -4(%ebp)
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
        .section    .note.GNU-stack,"",@progbits
    
    0 讨论(0)
  • 2021-01-06 02:56

    From: Right shifting negative numbers in C

    Edit: According to the Section 6.5.7 of the latest draft standard, this behavior on negative numbers is implementation dependent:

    The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

    And, your implementation is probably doing an Arithmetic Shift with two's complement numbers


    Operator >> as Signed right shift or arithmetic right shift, shift all the bits to right a specified number of times. Important is >> fills leftmost sign bit (Most Significant Bit MSB) to leftmost bit the after shift. This is called sign extension and serves to preserve the sign of negative numbers when you shift them right.

    Below is my diagrammatic representation with an example to show how this works (for one byte):
    Example:

    i = -5 >> 3;  shift bits right three time 
    

    Five in two's complement form is 1111 1011 Memory Representation:

     MSB
    +----+----+----+---+---+---+---+---+
    |  1 |  1 | 1  | 1 | 1 | 0 | 1 | 1 |   
    +----+----+----+---+---+---+---+---+
       7    6   5    4   3   2   1   0  
      ^  This seventh, the left most bit is SIGN bit  
    

    And below is, how >> works? When you do -5 >> 3

                            this 3 bits are shifted 
                             out and loss
     MSB                   (___________)      
    +----+----+----+---+---+---+---+---+
    |  1 |  1 | 1  | 1 | 1 | 0 | 1 | 1 |   
    +----+----+----+---+---+---+---+---+
      | \                 \  
      |  ------------|     ----------|
      |              |               |
      ▼              ▼               ▼
    +----+----+----+---+---+---+---+---+
    |  1 |  1 | 1  | 1 | 1 | 1 | 1 | 1 |
    +----+----+----+---+---+---+---+---+
    (______________)
     The sign is        
     propagated
    

    Notice: the left most three bits are one because on each shift sign bit is preserved and each bit is right too. I have written The sign is propagated because all this three bits are because of sign(but not data).

    [ANSWER]
    In your output in

    first Eight lines

          ~t is 0
    ==>    t is FFFFFFFF 
    ==>    t is -1
    

    (Note: 2's complement of -1 is FFFFFFFF, because 1 = 00000001, 1's complement of 1 is FFFFFFFE, and 2's complement = 1's complement + 1 that is: FFFFFFFE + 00000001 = FFFFFFFF)

    So t is always evaluated -1 in first eight times in loop. Yes, How ?

    In for loop

    for (i=120;i<140;i++){
         t = (i - 128) >> 31;
    

    values of i for first eight time is i = 120, 121, 122, 123, 124, 125, 126 ,127 all eight values are less then 128. So return of (i - 128) = -8, -7, -6, -5, -4, -3, -2, -1. hence in first eight times expression t = (i - 128) >> 31 shift-rights a negative number.

    t =   (i - 128)  >> 31
    t =  -ve number  >> 31
    

    Because in your system int is of 4 bytes = 32 bits, all right most 31 bits are shift-out and loss and due to propagation of sign bit that is 1 for negative number all bits value becomes 1. (as I have shown in above figure for one byte)

    So for fist eight times:

        t =  -ve number  >> 31 ==  -1 
        t = -1
      and this gives 
        ~t = 0
    

    Hence output of fist eight times for ~t is 0.

    For remaining last lines

          ~t is FFFFFFFF
    ==>   ~t is -1   
    ==>    t is 0 
    

    For remaining last lines, in for loop

    for (i=120;i<140;i++){
         t = (i - 128) >> 31;
    

    i values are 128, 129, 130, 132, 133, 134, 135, 136, 137, 138, 139, all are greater then or equals to 128. and sign bit is 0.

    So (i - 128) for remaining last lines is >=0 and for all this MSB sign bit = 0. And because again you shifts right 31 times all bits excepts then sigh bit shift-out and sign bit 0 propagates and fills all bits with 0 and magnitude becomes 0.

    I think it would be good if I write an example for a positive number too. So we take an example 5 >> 3 and five is one byte is 0000 0101

                            this 3 bits are shifted 
                             out and loss
     MSB                   (___________)      
    +----+----+----+---+---+---+---+---+
    |  0 |  0 | 0  | 0 | 0 | 1 | 0 | 1 |   
    +----+----+----+---+---+---+---+---+
      | \                 \  
      |  ------------|     ----------|
      |              |               |
      ▼              ▼               ▼
    +----+----+----+---+---+---+---+---+
    |  0 |  0 | 0  | 0 | 0 | 0 | 0 | 0 |
    +----+----+----+---+---+---+---+---+
    (______________)
     The sign is        
     propagated
    

    See again I writes The sign is propagated, So leftmost three zeros are due to sign bit.

    So this is what operator >> Signed right shift do, and preserves the sign of left operand.

    0 讨论(0)
  • 2021-01-06 03:01

    Why does t = (i - 128) >> 31 gives zero or -1 for each number?

    When a non-negative 32-bit integer is shifted right 31 positions, all non-zero bits get shifted out and the most significant bits get filled with 0, so you end up with 0.

    Usually, when a negative 32-bit integer is shifted right 31 positions, the most significant bits don't get filled with 0, instead they get set to the sign of the number and so the sign propagates into all bits and in 2's complement representation all bits set to 1 amount to -1. The net effect is as if you repeatedly divided the number by 2, but with a little twist... The result is rounded towards -infinity instead of towards 0. For example -2>>1==-1 but -3>>1==-2 and -5>>1==-3. This is called arithmetic right shift.

    When I say "usually", I mean the fact that the C standard allows several different behaviors for right shifts of negative values. On top of that, it allows non-2's complement representations of signed integers. Usually, however, you have the 2's complement representation and the behavior that I've showed/explained above.

    0 讨论(0)
  • 2021-01-06 03:08

    Becaue t is either 0 or -1, ~t is also always -1 or 0.

    This is due to the (implementation defined) behaviour or (i - 128) >> 31, which essentially copies the top bit of (i-128) [assuming 32-bit integers]. If i is > 128, it will result in a zero in the top bit. If i is less than 128, the result is negative, so the top bit is set.

    Since ~t is "all bits opposite of t", you can expect that t is always 0xffffffff if t is zero.

    0 讨论(0)
  • 2021-01-06 03:17

    The rule in C and C++ is that the result of a right-shift of a negative value is implementation defined. So read your compiler's documentation. The various explanations you've gotten are valid approaches, but none of these is mandated by the language definition.

    0 讨论(0)
提交回复
热议问题