Conflicting signs in x86 assembly: movsx then unsigned compare/branch?

情歌与酒 2021-01-19 13:53

I am confused by the following snippet:

movsx   ecx, [ebp+var_8] ; signed move
cmp     ecx, [ebp+arg_0]
jnb     short loc_401027 ; unsigned jump
3 Answers
  •  爱一瞬间的悲伤
    2021-01-19 14:14

    Addendum to anatolyg's answer:

    In principle, there is no clash at the assembly level.

    Information in a computer is encoded in bits (one bit = zero or one), and ecx is 32 bits of information, nothing else.

    Whether the top bit is interpreted as a sign is up to the code that follows. At the assembly level it is perfectly legal to use movsx to extend a value (in a signed way) even if you later interpret it as a bit mask or an unsigned int.

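    Whether there is a clash at the logical level depends on what the author intended. jnb is taken when ecx is not below arg_0 as an unsigned value. If var_8 holds a "negative" value, the sign-extended ecx is at least 2^31 when read as unsigned, so for any arg_0 < 2^31 the branch is always taken. If that is what the author wanted, then the code is correct.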
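To make the effect of the three instructions concrete, here is a minimal C sketch of them. The function name `jnb_taken` is hypothetical, and the 16-bit source width is an assumption, since the disassembly does not show the operand size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models:  movsx ecx, word [ebp+var_8]
 *          cmp   ecx, [ebp+arg_0]
 *          jnb   target
 * Returns true when the jnb would be taken.
 * Assumption: var_8 is read as a 16-bit value. */
static bool jnb_taken(int16_t var_8, uint32_t arg_0)
{
    /* movsx: sign-extend to 32 bits. The register only holds a bit
     * pattern, so keep it in an unsigned type from here on. */
    uint32_t ecx = (uint32_t)(int32_t)var_8;

    /* cmp + jnb: "not below" is the unsigned test ecx >= arg_0. */
    return ecx >= arg_0;
}
```

With this model, `jnb_taken(-1, 100)` is true because -1 sign-extends to 0xFFFFFFFF, while `jnb_taken(5, 100)` is false.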

    BTW, the disassembly is missing the operand size of the memory argument in the movsx, so the output of the tool that produced it is ambiguous (is the tool otherwise reliable? Be cautious).

    So, is var_8 signed or unsigned? And what about arg_0?

    var_8 is first and foremost a memory address, and from there either 8 or 16 bits of information are read (your disassembly does not make clear which) and treated in a "signed" way. It is difficult to say more about var_8 without exploring the full code. It may even be that var_8 is a 32-bit unsigned int "variable" and the author decided, for some reason, to use only the sign-extended low 16 bits of its content in that movsx. arg_0 is then used as an unsigned 32-bit integer by the cmp instruction.

    In assembly the question is not so much whether var_8 is signed or unsigned; it is how many bits of information you have, where they are, and how the following code interprets them.

    There is a lot more freedom here than in C or other high-level languages. For example, if you have four byte counters in memory, each of which you know is less than 200, and you want to increment the first and the last of them, you can do this:
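The missing operand size matters because the same bytes in memory give different register contents depending on the width that movsx reads. A small C sketch of both variants follows; the helper names `movsx_byte` and `movsx_word` are hypothetical, and the word is assembled in little-endian order as on x86.

```c
#include <stdint.h>

/* Models: movsx ecx, byte [p] */
static uint32_t movsx_byte(const unsigned char *p)
{
    uint32_t x = p[0];
    if (x & 0x80u)            /* bit 7 is the sign bit of the byte */
        x |= 0xFFFFFF00u;     /* copy it into bits 8..31 */
    return x;
}

/* Models: movsx ecx, word [p] (little-endian, as on x86) */
static uint32_t movsx_word(const unsigned char *p)
{
    uint32_t x = (uint32_t)p[0] | ((uint32_t)p[1] << 8);
    if (x & 0x8000u)          /* bit 15 is the sign bit of the word */
        x |= 0xFFFF0000u;     /* copy it into bits 16..31 */
    return x;
}
```

For the bytes 80 00 at var_8, the byte form yields 0xFFFFFF80 while the word form yields 0x00000080, so the subsequent unsigned compare can go either way depending on which one the code really uses.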

    .data
    counter1: db 13
    counter2: db 6
    counter3: db 34
    counter4: db 17
    
    .text
        ...
        ; increment first and last counter in one instruction
        ; overflow not expected/handled, counters should be < 200
        add  dword [counter1],0x01000001
    

    Now imagine how you would interpret this when disassembling such code without the original comments from the source above. It gets tricky if you do not understand from the rest of the code that counter1-4 are used as separate byte counters, and that this is a speed optimization to increment two of them in a single instruction.
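As a rough illustration of what that single add does to the four bytes, here is a C model of it. The function name `bump_first_and_last` is hypothetical, and the dword is assembled in little-endian order to mirror how x86 reads the four bytes starting at counter1.

```c
#include <stdint.h>

/* Models: add dword [counter1], 0x01000001
 * counters[0..3] stand for counter1..counter4 laid out contiguously. */
static void bump_first_and_last(uint8_t counters[4])
{
    /* Read the four bytes as one little-endian 32-bit value. */
    uint32_t dword = (uint32_t)counters[0]
                   | ((uint32_t)counters[1] << 8)
                   | ((uint32_t)counters[2] << 16)
                   | ((uint32_t)counters[3] << 24);

    /* Low byte +1 (counter1) and high byte +1 (counter4). */
    dword += 0x01000001u;

    /* Write the bytes back in the same order. */
    counters[0] = (uint8_t)(dword & 0xFFu);
    counters[1] = (uint8_t)((dword >> 8) & 0xFFu);
    counters[2] = (uint8_t)((dword >> 16) & 0xFFu);
    counters[3] = (uint8_t)((dword >> 24) & 0xFFu);
}
```

Starting from 13, 6, 34, 17 this gives 14, 6, 34, 18. If counter1 were 255, the carry would spill into counter2, which is why the trick depends on the counters staying well below the byte limit.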
