Question
Why is the following assembly code an anti-debugging tool?
l1:
call l3
l2:
;some code
l3:
mov al, 0c3h
mov edi, offset l3
or ecx, -1
rep stosb
I know that C3h is RETN, and I know that stosb writes the value in al as an opcode at the offset in edi, and that this is done ecx times because of rep.
I am also aware that, on Intel architecture, stosb and stosw will keep running in their original form if they were already pre-fetched.
If we run the program under a debugger, the pre-fetch is irrelevant and the code at the l2 label will run (because of single-stepping). Otherwise, if there is no debugger, execution will ping-pong between l1 and l3. Am I right?
Answer 1:
When the program is debugged (i.e. single-stepped), the prefetch queue is flushed at each step (when the interrupt occurs). However, when the code is executed normally, that will not happen to rep stosb. Older processors didn't flush it even when there was a memory write to the cached area; that was changed in order to support self-modifying code, except for rep movs and rep stosb. (IIRC it was eventually fixed in i7 processors.)
That's why, if there is a debugger (single step), the code will execute correctly: once rep stosb is replaced by ret, l2 will be executed. When there is no debugger, rep stosb will continue; since ecx holds the largest possible value, it will eventually write somewhere it is not supposed to write and an exception will occur.
This anti-debugging technique is described in this paper.
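The two outcomes described above can be sketched with a toy model in Python. This is only an illustration under stated assumptions, not a description of how any real CPU behaves: memory is a small byte array, `offset l3` is assumed to be 0, and the hypothetical flag `refetch_each_iteration` stands in for the prefetch queue being flushed on every single-step interrupt. The function name `simulate` and the 64-byte memory size are likewise made up for the example.

```python
RET = 0xC3  # opcode of RETN

# Bytes of the code at l3. `offset l3` is assumed to be 0 in this toy address space.
L3_CODE = bytes([
    0xB0, 0xC3,                    # mov al, 0c3h
    0xBF, 0x00, 0x00, 0x00, 0x00,  # mov edi, offset l3
    0x83, 0xC9, 0xFF,              # or ecx, -1
    0xF3, 0xAA,                    # rep stosb
])
REP_OFFSET = 10  # position of `rep stosb` inside L3_CODE


def simulate(refetch_each_iteration, mem_size=64):
    """Toy model of the rep stosb loop.

    refetch_each_iteration=True  models single-stepping: the instruction
    bytes are re-read from memory before every iteration.
    refetch_each_iteration=False models normal execution: the already
    fetched rep stosb keeps running regardless of what memory now holds.

    Returns (outcome, bytes_written).
    """
    memory = bytearray(mem_size)
    memory[:len(L3_CODE)] = L3_CODE

    al, edi, ecx = RET, 0, 0xFFFFFFFF  # state after the three setup instructions
    written = 0

    while ecx != 0:
        # Under single-stepping the CPU sees the modified byte and executes RET.
        if refetch_each_iteration and memory[REP_OFFSET] == RET:
            return "ret", written
        # Writing past the end of the toy memory stands in for the access violation.
        if edi >= mem_size:
            return "fault", written
        memory[edi] = al
        edi += 1
        ecx -= 1
        written += 1

    return "done", written


print(simulate(refetch_each_iteration=True))   # debugger-like behaviour
print(simulate(refetch_each_iteration=False))  # no-debugger behaviour
```

In this model the single-stepped run stops as soon as the first byte of `rep stosb` itself has been overwritten with C3h, while the normal run keeps storing until it leaves the mapped region.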
Answer 2:
The only thing that a debugger does here is add a time delay. That may be the key to how this works. The Intel (and I assume the AMD) manuals explicitly say that self-modifying code is not guaranteed "to work" unless the program signals the CPU that the cache line containing the modified instruction has changed. This is to make the prefetch logic cheap; the chip designers don't want to have hardware that continually tests that every byte of an instruction cache line is still valid.
So I assume what happens with a debugger is that l1 calls l3, which stores a return after the rep stosb, and the return gets executed because of the long delays induced by the debugger in single-stepping, which force the cache line containing l3 to be refetched after it has changed.
Without the debugger, I'd guess the instruction (not shown) after the stosb gets executed. If it were a jump to "no debugger" then the success of the jump would demonstrate that no single-stepping debugger was being used.
If I found this code in an application, I would refuse to run it.
Source: https://stackoverflow.com/questions/10089273/why-does-this-code-enable-me-to-detect-a-debugger