Question
Why does this code work?
http://www.int80h.org/strlen/ says that the string address has to be in the EDI register for scasb to work, but this assembly function doesn't seem to do this.
Assembly code for mystrlen:
global mystrlen
mystrlen:
sub ecx, ecx
not ecx
sub al, al
cld
repne scasb
neg ecx
dec ecx
dec ecx
mov eax, ecx
ret
C main:
int mystrlen(const char *);
int main()
{
return (mystrlen("1234"));
}
Compilation:
nasm -f elf64 test.asm
gcc -c main.c
gcc main.o test.o
Output:
./a.out
echo $?
4
Answer 1:
The code in the question is a 32-bit version of strlen, which works in a 64-bit environment only partially, sort of "by accident" (as most software works in reality, anyway ;) ).
One of the accidental effects of the 64-bit environment is that the first argument of a function call is passed in the rdi register, and scasb uses es:rdi in 64-bit mode, so the two naturally fit together (as Jester's answer says). This holds for the System V ABI, which 64-bit Linux uses; other 64-bit platforms may follow a different calling convention, which would invalidate this!
The rest of the 64-bit environment effects are less benign: that code will return a wrong value for a string of 4 GiB or longer (I know, highly unlikely to happen in practical usage, but it can be tried with a synthetic test that provides such a long string).
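To make the long-string failure concrete without allocating 4 GiB, here is a rough C model of the register arithmetic in the question's routine. The function name original_result and the idea of passing in the true length are my own illustration, not part of the original code. It relies on two facts about 64-bit mode: a write to ecx zeroes the upper half of rcx, and repne scasb counts with the full rcx.

```c
#include <stdint.h>

/* Hypothetical model (not the original code): given the true length of
 * the string, compute what the question's routine leaves in eax when it
 * runs in 64-bit mode. */
static uint32_t original_result(uint64_t true_len)
{
    /* sub ecx,ecx / not ecx: the 32-bit write zero-extends,
     * so rcx = 0x00000000FFFFFFFF, not all ones. */
    uint64_t rcx = 0xFFFFFFFFu;

    /* repne scasb needs true_len + 1 iterations to reach the '\0',
     * but stops early once rcx reaches 0. */
    uint64_t needed = true_len + 1;
    uint64_t scanned = needed < rcx ? needed : rcx;
    rcx -= scanned;

    uint32_t ecx = (uint32_t)rcx;
    ecx = 0u - ecx;   /* neg ecx */
    ecx -= 2;         /* dec ecx, twice */
    return ecx;       /* mov eax, ecx */
}
```

Under this model, original_result(4) gives 4, while original_result(1ULL << 32) gives 0xFFFFFFFE because the scan runs out of count before it finds the terminator. Through the int return type, main would see that as -2.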
Fixed 64-bit version (the end of the routine also exploits rax=0 to do both neg ecx and mov eax,ecx in a single instruction):
global mystrlen
mystrlen:
xor ecx,ecx ; rcx = 0
dec rcx ; rcx = -1 (0xFFFFFFFFFFFFFFFF)
; rcx = maximum length to scan
xor eax,eax ; rax = 0 (al = 0 value to scan for)
repne scasb ; scan the memory for AL
sub rax,rcx ; rax = 0 - rcx_leftover = scanned bytes + 1
sub rax,2 ; fix that into "string length" (-1 for '\0')
ret
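The counter arithmetic in the fixed routine above can be checked with a small C model. This is only a sketch: model_strlen and the variable names are mine, chosen to mirror the registers, and the loop stands in for repne scasb (decrement the count, compare a byte with al, advance the pointer, stop on a match or when the count hits zero).

```c
#include <stdint.h>

/* Hypothetical C model of the fixed routine; variables mirror registers. */
static uint64_t model_strlen(const char *s)
{
    const unsigned char *rdi = (const unsigned char *)s;
    uint64_t rcx = UINT64_MAX;   /* xor ecx,ecx / dec rcx: rcx = -1 */
    uint64_t rax = 0;            /* xor eax,eax: al = 0 is the byte to find */

    /* repne scasb: one iteration per byte, including the terminator */
    while (rcx != 0) {
        rcx--;
        if (*rdi++ == (unsigned char)rax)
            break;
    }

    rax -= rcx;   /* sub rax,rcx: 0 - (-1 - scanned) = scanned + 1 */
    rax -= 2;     /* sub rax,2:   scanned - 1 = length without '\0' */
    return rax;
}
```

For "1234" the loop scans 5 bytes, leaving rcx = -6, so the result is 6 - 2 = 4, matching the exit status shown in the question.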
Answer 2:
The 64-bit System V calling convention places the first argument into rdi. So the caller, main, already did that load for you. You can examine its assembly code and see for yourself.
(Answer provided by Jester)
Source: https://stackoverflow.com/questions/42655541/why-does-this-repne-scasb-implementation-of-strlen-work