Replacing arrays access variables with the right integer type

清酒与你 2021-01-06 09:27

I've had a habit of using int to access arrays (especially in for loops); however I recently discovered that I may have been "doing-it-all-wrong" and my x86 system kept h

2 Answers
  •  北荒 (original poster)
    2021-01-06 09:49

    The movslq instruction sign-extends a long (in AT&T syntax, a 4-byte quantity) to a quad (an 8-byte quantity). This is needed because int is signed, so an offset of, e.g., -1 is 0xffffffff as a long. If you were to just zero-extend that (i.e. leave out the movslq), you would get 0x00000000ffffffff, aka 4294967295, which is probably not what you want. So the compiler instead sign-extends the index to yield 0xffffffffffffffff, aka -1.
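    To make the difference concrete, here is a minimal sketch in C of the two ways to widen the same 32 bits. The function names are made up for illustration; the casts only mirror what movslq and a zero-extension do to the value.

```c
#include <stdint.h>

/* Widen a signed 32-bit index the way movslq does:
   the sign bit is copied into the upper 32 bits. */
int64_t sign_extend32(int32_t i) {
    return (int64_t)i;
}

/* Widen the same 32 bits with zeros instead, ignoring the sign. */
uint64_t zero_extend32(int32_t i) {
    return (uint64_t)(uint32_t)i;
}
```

    With an index of -1, sign_extend32 gives -1 (all 64 bits set), while zero_extend32 gives 4294967295. For non-negative indices the two agree.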

    The reason the other types don't require the additional operation is that, despite some of them being signed, they are all already 8 bytes wide. And, thanks to two's complement, 0xffff... can be interpreted as either -1 or 18446744073709551615, and the 64-bit sum will be the same either way.
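    A small sketch of that two's complement point, using plain 64-bit integers rather than real addresses (the helper names are hypothetical): adding -1 and adding 18446744073709551615 produce the same 64-bit result, because the addition wraps modulo 2^64.

```c
#include <stdint.h>

/* Add a signed 64-bit offset to a 64-bit base value.
   Converting the offset to unsigned keeps its bit pattern. */
uint64_t add_signed_offset(uint64_t base, int64_t off) {
    return base + (uint64_t)off;
}

/* Add an unsigned 64-bit offset; the sum wraps modulo 2^64. */
uint64_t add_unsigned_offset(uint64_t base, uint64_t off) {
    return base + off;
}
```

    For example, add_signed_offset(1000, -1) and add_unsigned_offset(1000, 18446744073709551615ULL) both come out as 999.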

    Now, if you were to use unsigned int instead, the compiler would normally have to insert a zero-extension, just to make sure the upper half of the register doesn't contain garbage. However, on x86-64 this is done implicitly: an instruction such as mov %eax,%esi moves whatever 4-byte quantity is in eax into the lower 4 bytes of rsi and clears the upper 4, effectively zero-extending it. Given your postings, the compiler inserts a mov %esi,%esi instruction anyway; that instruction is the zero-extension, since the compiler cannot assume the caller left the upper half of rsi clear.
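    For reference, accessors along the following lines are the kind of code under discussion. This is only a sketch: the function names are mine, the comments restate the explanation above rather than guaranteed compiler output, and long is assumed to be 8 bytes (true on LP64 systems such as x86-64 Linux, not on 64-bit Windows).

```c
#include <stddef.h>

/* int index: 4 bytes, signed, so it needs a sign-extension
   (movslq) before it can be used in a 64-bit address. */
int get_int(const int *a, int i) { return a[i]; }

/* unsigned int index: 4 bytes, unsigned, so it needs a
   zero-extension (the mov %esi,%esi described above). */
int get_uint(const int *a, unsigned int i) { return a[i]; }

/* long index: assumed 8 bytes here, so no extension is needed. */
int get_long(const int *a, long i) { return a[i]; }

/* size_t index: 8 bytes on x86-64, so no extension is needed. */
int get_size(const int *a, size_t i) { return a[i]; }
```

    Compiling these with optimizations on and inspecting the assembly (for example with gcc -O2 -S) is an easy way to check what your own compiler does.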

    Note, however, that this automatic zero-extension does not apply to 1- and 2-byte quantities: those must be zero-extended explicitly (e.g. with movzbl or movzwl).
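    The register behaviour can be modelled in C as follows. This is a toy model, not real register access: reg stands for the old 64-bit register contents, and the hypothetical write_low* helpers return what the full register would hold after a write of the given width.

```c
#include <stdint.h>

/* 4-byte write: the upper 32 bits are cleared, whatever they were. */
uint64_t write_low32(uint64_t reg, uint32_t v) {
    (void)reg;               /* old contents do not survive */
    return (uint64_t)v;
}

/* 2-byte write: the upper 48 bits keep their old contents. */
uint64_t write_low16(uint64_t reg, uint16_t v) {
    return (reg & ~0xffffULL) | v;
}

/* 1-byte write: the upper 56 bits keep their old contents. */
uint64_t write_low8(uint64_t reg, uint8_t v) {
    return (reg & ~0xffULL) | v;
}
```

    Starting from 0xdeadbeefcafef00d, writing the 1-byte value 0x01 leaves 0xdeadbeefcafef001 in this model, which is why a narrow index has to be zero-extended explicitly before it is used in an address.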
