How can I simulate a CALL instruction by using JMP?

Question

I want the equivalent of the following, but without the CALL instruction. I suppose I should use JMP, probably together with other instructions.

PUSH 5
PUSH 4
CALL Function

Answer 1:

This is fairly easy to do. Push the return address onto the stack and then jump to the subroutine. The final code looks like this:

   PUSH 5
   PUSH 4
   PUSH offset label1
   jmp Function
 label1: ; returns here
   lea esp, 8[esp]   ; discard the two pushed arguments

 Function: 
   ...
   ret

While this works, you really don't want to do it. Most modern processors keep an on-chip return-address cache: a CALL pushes its return address onto it, and a RET pops one off. Because this cache lives on the processor, it has extremely short update and access times, so a RET can use the popped value to predict where the PC should go next instead of waiting for the memory read from the location that ESP actually points to. If you use the "PUSH offset label1" trick, this cache is not updated, so the prediction for the RET is wrong and the processor pipeline is flushed, with a severe negative impact on performance. (I think IBM has a patent on special instructions that are essentially "PUSHRETURNADDRESS k" and "POPRETURNADDRESS", allowing this trick to be used on some of their CPUs. Alas, not on the x86.)

Answer 2:

It depends on the situation. If the last thing your function does before returning is call another function, you can simply jump to that function. This is called tail call elimination, and is an optimization performed by many compilers. Example:

foo:
    call B
    call A
    ret

Tail call elimination replaces the last two lines with a single jump instruction:

foo:
    call B
    jmp  A

This works because the stack contains the return address of foo's caller. So when function A returns, it returns back to the function that called foo.

If you want execution to resume after the jump to A, push that address onto the stack before jumping:

foo:
    call B
    push offset bar
    jmp  A
bar:

However, I can think of no reason why anybody would want to do this.

Answer 3:

Before x86-64, call was the only instruction that could read EIP. (I guess int as well, but it doesn't put the result anywhere you can read from user-space).

So in 32-bit mode it's impossible to simulate call in position-independent code. In fact, 32-bit PIC code uses call to find out its own address.
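As a rough sketch of that last point, 32-bit PIC code typically reads EIP with a tiny helper that copies its own return address into a register (compilers emit something of this shape, e.g. GCC's __x86.get_pc_thunk.bx). The names get_pc_ebx, here and some_data are made up for illustration, and the snippet assumes NASM syntax with some_data in the same section as the code:

```asm
    call  get_pc_ebx                    ; call pushes the address of `here`
here:
    ; ebx now holds the run-time address of `here`, wherever the code was loaded
    lea   eax, [ebx + some_data - here] ; hypothetical: address of some_data, computed relative to `here`
    ; ...

get_pc_ebx:
    mov   ebx, [esp]                    ; copy the return address without popping it
    ret                                 ; pairs with the call, so return prediction stays balanced
```

Because the helper returns with a real ret that matches a real call, this idiom does not disturb the return-address predictor.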

But in x86-64, we have RIP-relative lea:

    ... put function args in registers

    lea    rax, [rel ret_addr]    ; AT&T lea ret_addr(%rip), %rax
    push   rax
    jmp    call_target
ret_addr:

call itself internally decodes as push RIP / jmp target, where RIP during execution of an instruction = address of the end of that instruction = start of the next.

Of course this is normally terrible for performance, because it unbalances the return-address predictor stack (see http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/). Use a normal call unless you want a ret to mispredict, e.g. for a retpoline or specpoline.

(A tailcall with just jmp is fine, collapsing a call/ret pair into a jmp, but pushing a new return address manually is always a problem.)
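For reference, the retpoline case mentioned above looks roughly like this. It is a sketch of the usual thunk for an indirect jump through rax, in NASM syntax; the label names are hypothetical. Here the mismatch between the predicted and the actual return address is the whole point:

```asm
retpoline_rax:
    call  .set_target      ; pushes the address of .capture and primes the return predictor with it
.capture:
    pause                  ; speculative execution of the ret lands here and spins harmlessly
    lfence
    jmp   .capture
.set_target:
    mov   [rsp], rax       ; overwrite the pushed return address with the real target
    ret                    ; architecturally jumps to rax, but is predicted to return to .capture
```

The ret mispredicts every time by design, so the cost described above is paid deliberately in exchange for keeping the indirect-branch predictor out of the picture.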

Source: https://stackoverflow.com/questions/21248227/how-can-i-simulate-a-call-instruction-by-using-jmp

Tags

function

assembly

x86