Confusion about bsr and lzcnt

前端 未结 2 1116
执笔经年
执笔经年 2021-02-20 04:40

I\'m a bit confused about both instructions. First let\'s discard the special case when the scanned value is 0 and the undefined/bsr or bitsize/lzcnt result - this difference is

2条回答
  •  独厮守ぢ
    2021-02-20 05:28

    To be clear, there is no working fallback from lzcnt to bsr. What happened is that Intel used the previously redundant sequence rep bsr to encode the new lzcnt instruction. Using a redudant rep prefix for bsr was generally defined to be ignored, but with the caveat that it may decode differently on future CPUs1.

    So if you happen to execute lzcnt on a CPU that doesn't support it, it will execute as bsr. Of course, this fallback is not exactly intentional, and it gives the wrong result (as Paul R points out they look at the same bit but report it differently): it is just a consequence of the way the new instruction was encoded and how pointless rep prefixes were treated by prior CPUs. So the world fallback is pretty much entirely inappropriate for lzcnt and bsr.

    The situation is more subtle for the case of tzcnt and bsf. It uses the same encoding trick: tzcnt has the same encoding as rep bsf, but here the "fallback" mostly works since tzcnt returns the same value as bsf for all inputs except zero. For zero inputs tzcnt returns 32, but bsf leaves the destination undefined.

    You can't really use even this fallback though: if you never have zero inputs you might as well just use bsf, saving a byte and being compatible with a couple decades of CPUs, and if you do have zero inputs the behavior differs.

    So the behavior is perhaps better classified as trivia than a fallback...


    1 Normally this would more or less be esoterica, but you could for example use rep prefixes are where they have no functional effect to lengthen instructions to help align subsequent code without inserting explicit nop instructions. Given the "may decode differently in the future" this would be dangerous when compiling code which may run on any future CPU.

提交回复
热议问题