The call stack does not say “where you came from”, but “where you are going next”?

南旧 2020-11-30 03:51

In a previous question (Get object call hierarchy), I got this interesting answer:

The call stack is not there to tell you where you came from.

5 answers
  • 2020-11-30 03:54

    Consider the following code:

    class Program
    {
        static void Main()
        {
            // do something
            A();
            // do something else
        }

        static void A()
        {
            // do some processing
            B(); // the last thing A does is call B
        }

        static void B()
        {
        }
    }
    

    Here, the last thing A does is call B; A returns immediately afterwards. A clever optimizer might optimize out the call to B and replace it with just a jump to B's start address. (Not sure whether current C# compilers do such optimizations, but most optimizing C++ compilers do.) Why does this work? Because the return address into A's caller is already on the stack, so when B finishes it returns not to A but directly to A's caller.

    So, you can see that the stack does not necessarily contain information about where execution came from, but rather where it should go next.

    Without optimization, inside B the call stack is (I omit the local variables and other stuff for clarity):

    ----------------------------------------
    |address of the code calling A         |
    ----------------------------------------
    |address of the return instruction in A|
    ----------------------------------------
    

    So the return from B goes back to A, which then immediately returns.

    With the optimization, the call stack is just

    ----------------------------------------
    |address of the code calling A         |
    ----------------------------------------
    

    So B returns directly to Main.
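
    The two diagrams can be reproduced with a toy model. This is only a sketch, not how the CLR or any real compiler represents its stack: the return-address stack is modelled as a `Stack<string>`, and the labels `Main+1` and `A+1` are made-up stand-ins for real addresses.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Toy model: the "call stack" holds only return addresses, written as labels.
    // The labels ("Main+1", "A+1") are hypothetical and exist only for illustration.
    public static class ReturnStackDemo
    {
        // Returns the return-address stack as seen from inside B, top of stack first.
        public static string[] StackInsideB(bool tailCallOptimized)
        {
            var returnAddresses = new Stack<string>();

            // Main calls A: push the place in Main where execution should resume.
            returnAddresses.Push("Main+1");

            if (!tailCallOptimized)
            {
                // A calls B: push the address of A's final return instruction.
                returnAddresses.Push("A+1");
            }
            // With the optimization, A jumps to B and pushes nothing.

            return returnAddresses.ToArray(); // Stack<T>.ToArray copies in pop order
        }

        public static void Main()
        {
            Console.WriteLine(string.Join(", ", StackInsideB(false))); // A+1, Main+1
            Console.WriteLine(string.Join(", ", StackInsideB(true)));  // Main+1
        }
    }
    ```

    In the optimized case nothing on the stack mentions A at all, even though B was reached through A.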

    In his answer, Eric mentions other (more complicated) cases where the stack information doesn't contain the real caller.

  • 2020-11-30 03:55

    I think he's trying to say that it tells the called method where to go next.

    • Method A calls Method B.
    • Method B completes, where does it go next?

    It pops the return address off the top of the stack and goes there.

    So Method B knows where to go after it completes. Method B doesn't really care where it came from.

  • 2020-11-30 03:57

    What Eric is saying in his post is that the execution pointer does not need to know where it has come from, only where it has to go when the current method ends. Superficially these seem to be the same thing, but in the case of (for instance) tail recursion, where we came from and where we are going next can diverge.
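
    As a rough illustration of the tail recursion case, here is a minimal sketch with hypothetical names (`SumTo`, `SumToLoop`). The C# compiler is not guaranteed to eliminate tail calls, so `SumTo` may still use one frame per call; `SumToLoop` shows what the method amounts to if the recursive call is turned into a jump.

    ```csharp
    using System;

    public static class TailRecursionDemo
    {
        // Tail-recursive: the recursive call is the very last thing the method does,
        // so nothing in the current frame is needed once that call is made.
        public static long SumTo(long n, long accumulator = 0)
        {
            if (n == 0) return accumulator;
            return SumTo(n - 1, accumulator + n);
        }

        // The same computation with the tail call replaced by a jump back to the top.
        // No return address is pushed per step, so the "callers" leave no trace.
        public static long SumToLoop(long n)
        {
            long accumulator = 0;
            while (n != 0)
            {
                accumulator += n;
                n -= 1;
            }
            return accumulator;
        }

        public static void Main()
        {
            Console.WriteLine(SumTo(100));     // 5050
            Console.WriteLine(SumToLoop(100)); // 5050
        }
    }
    ```

    In the loop form, the hundred "calls" all return to the same place, the original caller, which is exactly the divergence between where we came from and where we are going next.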

  • 2020-11-30 04:16

    You've explained it yourself. The "return address" by definition tells you where you are going next.

    There is no requirement whatsoever that the return address that is put on the stack is an address inside the method that called the method you're in now. It typically is, which sure makes it easier to debug. But there is not a requirement that the return address be an address inside the caller. The optimizer is permitted to -- and sometimes does -- muck with the return address if doing so makes the program faster (or smaller, or whatever it is optimizing for) without changing its meaning.

    The purpose of the stack is to make sure that when this subroutine finishes, its continuation -- what happens next -- is correct. The purpose of the stack is not to tell you where you came from. That it usually does so is a happy accident.

    Moreover: the stack is just an implementation detail of the concepts of continuation and activation. There is no requirement that both concepts be implemented by the same stack; there could be two stacks, one for activations (local variables) and one for continuation (return addresses). Such architectures are obviously much more resistant to stack smashing attacks by malware because the return address is nowhere near the data.

    More interestingly, there is no requirement that there be any stack at all! We use call stacks to implement continuation because they are convenient for the kind of programming we typically do: subroutine-based synchronous calls. We could choose to implement C# as a "Continuation Passing Style" language, where the continuation is actually reified as an object on the heap, not as a bunch of bytes pushed on a million byte system stack. That object is then passed around from method to method, none of which use any stack. (Activations are then reified by breaking each method up into possibly many delegates, each of which is associated with an activation object.)

    In continuation passing style there simply is no stack, and no way at all to tell where you came from; the continuation object does not have that information. It only knows where you are going next.
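
    To make the shape of continuation passing style concrete, here is a small sketch; `Factorial` and `FactorialCps` are hypothetical names. Note the caveat: C# as it exists today still runs these calls on the machine stack, so the example only shows how the continuation becomes an explicit heap object, not a stack-free implementation.

    ```csharp
    using System;

    public static class CpsDemo
    {
        // Direct style: the result travels back through the return address on the stack.
        public static long Factorial(int n) => n <= 1 ? 1 : n * Factorial(n - 1);

        // Continuation passing style: the method never returns a value.
        // It hands its result to `next`, which represents "what happens next".
        public static void FactorialCps(int n, Action<long> next)
        {
            if (n <= 1)
            {
                next(1);
                return;
            }

            // The lambda is a heap object that captures n and next.
            // It knows where to go with the result, not who asked for it.
            FactorialCps(n - 1, partial => next(n * partial));
        }

        public static void Main()
        {
            Console.WriteLine(Factorial(5));                      // 120
            FactorialCps(5, result => Console.WriteLine(result)); // 120
        }
    }
    ```

    The delegate passed as `next` carries only the remaining work; nothing in it records which method originally made the request.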

    This might seem like highfalutin theoretical mumbo jumbo, but we essentially are making C# and VB into continuation passing style languages in the next version; the coming "async" feature is just continuation passing style in a thin disguise. In the next version, if you use the async feature you will essentially be giving up stack-based programming; there will be no way to look at the call stack and know how you got here, because the stack will frequently be empty.
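
    A rough way to see the "thin disguise", assuming a compiler that supports `async`/`await` (C# 5 or later): the sketch below writes the same logic once with `await` and once with the continuation spelled out via `Task.ContinueWith`. The method names are hypothetical, and the hand-written version is only an analogy; the compiler actually generates a state machine and handles exceptions and synchronization contexts differently.

    ```csharp
    using System;
    using System.Threading.Tasks;

    public static class AsyncDemo
    {
        // With await: everything after the await is the continuation of `source`.
        public static async Task<int> AddOneAsync(Task<int> source)
        {
            int value = await source;
            return value + 1;
        }

        // By hand: the continuation is an explicit delegate attached to the task.
        // It runs when `source` completes, possibly with AddOneWithContinuation
        // long gone from any call stack.
        public static Task<int> AddOneWithContinuation(Task<int> source)
        {
            return source.ContinueWith(t => t.Result + 1);
        }

        public static void Main()
        {
            Console.WriteLine(AddOneAsync(Task.FromResult(41)).Result);            // 42
            Console.WriteLine(AddOneWithContinuation(Task.FromResult(41)).Result); // 42
        }
    }
    ```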

    Continuations reified as something other than a call stack is a hard idea for a lot of people to get their minds around; it certainly was for me. But once you get it, it just clicks and makes perfect sense. For a gentle introduction, here are a number of articles I've written on the subject:

    An introduction to CPS, with examples in JScript:

    http://blogs.msdn.com/b/ericlippert/archive/2005/08/08/recursion-part-four-continuation-passing-style.aspx

    http://blogs.msdn.com/b/ericlippert/archive/2005/08/11/recursion-part-five-more-on-cps.aspx

    http://blogs.msdn.com/b/ericlippert/archive/2005/08/15/recursion-part-six-making-cps-work.aspx

    Here are a dozen articles that start by doing a deeper dive into CPS, and then explain how this all works with the coming "async" feature. Start from the bottom:

    http://blogs.msdn.com/b/ericlippert/archive/tags/async/

    Languages that support continuation passing style often have a magic control flow primitive called "call with current continuation", or "call/cc" for short. In this Stack Overflow question, I explain the trivial difference between "await" and "call/cc":

    How could the new async feature in c# 5.0 be implemented with call/cc?

    To get your hands on the official "documentation" (a bunch of white papers), and a preview release of C# and VB's new "async await" feature, plus a forum for support Q&A, go to:

    http://msdn.com/vstudio/async

  • 2020-11-30 04:18

    There is more to this than you think.

    In C it is entirely possible to have a program rewrite the call stack. Indeed, that technique is the very basis of a style of exploit known as return oriented programming.

    I've also written code in one language which gave you direct control over the call stack. You could pop off the function that called yours, and push some other one in its place. You could duplicate the item on the top of the call stack, so the rest of the code in the calling function would get executed twice, and a bunch of other interesting things. In fact, direct manipulation of the call stack was the primary control structure provided by this language. (Challenge: can anybody identify the language from this description?)

    It did clearly show that the call stack indicates where you are going, not where you have been.
