Question
I want to write a simple piece of code (or algorithm) to set/clear the overflow flag. For setting OF, I know that I can use signed values. But how can I clear it?
Answer 1:
There are many possible solutions. For instance, test al, al will clear the OF flag without affecting register contents.
Or, if you don't want to affect the other flags, you can just directly modify the *FLAGS register. For example, in 32-bit, this would look like:
pushfd ; Push EFLAGS onto the stack
and dword [esp], ~0x800 ; Clear bit 11 (OF)
popfd ; Pop the modified result back into EFLAGS
Edit: Changed or al, al to test al, al per Peter Cordes' recommendation. (The effects are the same, but the latter is better for performance reasons.)
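The mask arithmetic behind the pushfd / and / popfd sequence can be checked with a small model. This is only a sketch in Python, not assembly: `clear_of`, `set_of`, and the sample EFLAGS image 0xA46 are hypothetical names and values chosen for illustration, and the `or` variant for setting OF is an assumption by symmetry rather than something taken from the answer.

```python
OF_MASK = 1 << 11  # OF is bit 11 of EFLAGS, i.e. 0x800


def clear_of(eflags: int) -> int:
    """Model of `and dword [esp], ~0x800` on a 32-bit EFLAGS image."""
    return eflags & ~OF_MASK & 0xFFFFFFFF


def set_of(eflags: int) -> int:
    """Model of `or dword [esp], 0x800` (assumed counterpart for setting OF)."""
    return (eflags | OF_MASK) & 0xFFFFFFFF


# Hypothetical EFLAGS image with OF set alongside a few other bits.
example = 0x00000A46
print(hex(clear_of(example)))  # 0x246: only bit 11 changed
print(hex(set_of(0x246)))      # 0xa46
```

Every bit other than bit 11 passes through unchanged, which is why this route leaves the other condition codes alone.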
Answer 2:
popf is quite slow (a throughput of about one per 20 cycles on Skylake); if you need to clear or set OF, then ideally do it as a side effect of an ALU instruction, especially one you were going to use anyway for a useful computation that you know won't or will overflow. (One that will overflow is usually harder to find, unlike for CF, where you can always just sub instead of add with a constant that wraps almost all the way around for all inputs except a very small range.)
If you need to set/clear just OF without affecting other condition codes for some reason, then yes, pushf / popf is the way to go. lahf / sahf doesn't get OF, because OF is bit 11 in EFLAGS, outside the low 8.
test al,al (or any same,same register) clears OF and CF, just like comparing / subtracting zero. Other flags are usefully set according to the value.
xor eax,eax clears EAX, and clears OF/SF/CF, sets ZF/PF. You often need a zeroed register anyway, so if you need OF clear (e.g. for the start of an adox extended-precision chain), then kill 2 birds with one stone and arrange your code so the last flag-setting instruction is the xor-zeroing.
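To see why test and xor always leave OF clear, here is a rough Python model of how x86 logic instructions (and/or/xor/test) derive flags from their result. `logic_flags` is a hypothetical helper written for this illustration; AF is left out because it is undefined after these instructions.

```python
def logic_flags(result: int, bits: int) -> dict:
    """Flags after an x86 logic op (and/or/xor/test), given its result.

    CF and OF are always cleared; ZF, SF and PF reflect the result.
    PF is set when the low byte has an even number of 1 bits.
    """
    result &= (1 << bits) - 1
    low_byte = result & 0xFF
    return {
        "CF": 0,
        "OF": 0,
        "ZF": int(result == 0),
        "SF": result >> (bits - 1),
        "PF": int(bin(low_byte).count("1") % 2 == 0),
    }


# xor eax,eax: result is 0, so ZF=1 and PF=1, while SF=CF=OF=0.
print(logic_flags(0 ^ 0, 32))

# test al,al with AL = 0x80: OF and CF are still 0, SF=1, ZF=0, PF=0.
print(logic_flags(0x80 & 0x80, 8))
```

Whatever the operand value, the OF entry is 0, so any of these instructions works as the "last flag-setting instruction" when all you need is OF clear.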
In x86-64, you can also trust that using add on a pointer + length doesn't cross over the middle of unsigned virtual address space, and thus clears OF. But that assumption could break on future CPUs with fully 64-bit virtual addresses, because then there'd be no hole in virtual address space around the signed-wraparound boundary, so a single contiguous array could span it. And that can already happen in 32-bit code, running under a 64-bit kernel or a 32-bit kernel that doesn't use a 2G:2G kernel:user split of virtual address space.
xor eax, eax / cmp al, -128 sets OF, and only takes 4 bytes of code. It's probably the cheapest way, and unlike sub or whatever, it doesn't write any partial registers (or any full registers). It still leaves EAX zeroed.
0 - -128 wraps to -128, i.e. signed OF. An 8-bit 2's complement integer can only represent values from -128..+127. The most-negative number is a special case, and has no proper inverse. It's its own absolute value / negative, or more properly those functions overflow. (Or you could treat the absolute value operation as having signed input and unsigned output, so the result is +128, i.e. 0x80. x86 doesn't have an integer abs instruction (prepare a -x, then test/cmov), but with SSSE3 it does have vector integer pabsb.)
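The 0 - -128 case can be worked through with a small model of 8-bit signed subtraction. This is a sketch in Python rather than assembly; `to_s8` and `cmp8_of` are hypothetical helpers, and OF is modeled as "the exact result does not fit in -128..+127".

```python
def to_s8(x: int) -> int:
    """Interpret the low 8 bits of x as a signed two's-complement value."""
    x &= 0xFF
    return x - 256 if x >= 128 else x


def cmp8_of(a: int, b: int) -> bool:
    """OF after an 8-bit `cmp a, b`: the exact a - b is outside -128..+127."""
    exact = to_s8(a) - to_s8(b)
    return not (-128 <= exact <= 127)


print(cmp8_of(0, -128))    # True: the exact result +128 is not representable
print(to_s8(0 - (-128)))   # -128: what the wrapped 8-bit result looks like
print(to_s8(-(-128)))      # -128: the most-negative value is its own negative
```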
For any known value in AL other than -1, there's a cmp al, imm8 that will set OF. For any value from 0..127, cmp al, -128 wraps. For any value from -2..-128, cmp al, +127 wraps and thus sets OF. For -1, subtracting 127 will only take you to -128. Subtracting -128 takes you up to +127. Unfortunately I don't think there's a single-instruction way to set OF without a known value in a register.
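The claim that -1 is the only AL value with no OF-setting immediate can be checked by brute force over all 256 x 256 combinations. The following is a sketch using the same "exact result out of range" model of OF; `cmp8_of` is a hypothetical helper, not an instruction.

```python
def cmp8_of(a: int, b: int) -> bool:
    """OF after an 8-bit `cmp a, b`, with a and b as signed values in -128..127."""
    return not (-128 <= a - b <= 127)


S8 = range(-128, 128)  # every signed 8-bit value

# AL values for which no imm8 at all makes `cmp al, imm8` set OF.
no_imm = [al for al in S8 if not any(cmp8_of(al, imm) for imm in S8)]
print(no_imm)  # [-1]

# The two immediates named above cover every other value.
print(all(cmp8_of(al, -128) for al in range(0, 128)))    # True
print(all(cmp8_of(al, 127) for al in range(-128, -1)))   # True
```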
It doesn't have to be al, but there's a 2-byte special encoding of cmp al,imm8. Other 8 or 32-bit registers can use the normal 3-byte encoding.
Without clobbering any registers, and no known constants, this is 6 bytes:
push rax
xor eax,eax
cmp al, -128
pop rax
This does clobber the other condition codes, but it's faster than pushf / popf. Normally you can clobber some register, though; in code where you can't, you often can't clobber the stack either.
Toggle OF
setno al         # OF=0 -> AL=1;  OF=1 -> AL=0
cmp al, -127     # 1 - -127 = 128 = -128;  0 - -127 = +127
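The toggle sequence can be traced with the same kind of arithmetic model. This is a sketch in Python, not assembly; `cmp8_of` and `toggle_of` are hypothetical helpers that mirror the two instructions step by step.

```python
def cmp8_of(a: int, b: int) -> bool:
    """OF after an 8-bit `cmp a, b`, with a and b as signed values in -128..127."""
    return not (-128 <= a - b <= 127)


def toggle_of(of: bool) -> bool:
    """Model of `setno al` followed by `cmp al, -127`; returns the new OF."""
    al = 0 if of else 1        # setno al: AL = 1 when OF is clear, else 0
    return cmp8_of(al, -127)   # 1 - -127 = 128 overflows; 0 - -127 = 127 fits


print(toggle_of(False))  # True
print(toggle_of(True))   # False
```

Note that in the real sequence AL is overwritten by setno, so it only suits code that can spare that register.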
Source: https://stackoverflow.com/questions/36798572/how-can-i-set-or-clear-overflow-flag-in-x86-assembly