Where are local variables actually allocated within the CLR?

Submitted by 别来无恙 on 2021-02-07 08:28:40

Question


I'm just getting into the CLR and IL, and one thing confuses me.

I have the following C# code:

int x = 1;
object obj = x;
int y = (int)obj;

And the IL disassembly for it:

      // Code size       18 (0x12)
  .maxstack  1
  .locals init ([0] int32 x,
           [1] object obj,
           [2] int32 y)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0
  IL_0003:  ldloc.0
  IL_0004:  box        [mscorlib]System.Int32
  IL_0009:  stloc.1
  IL_000a:  ldloc.1
  IL_000b:  unbox.any  [mscorlib]System.Int32
  IL_0010:  stloc.2
  IL_0011:  ret

So the ldloc.0 instruction "Loads the local variable at index 0 onto stack." But where are the locals really stored, and where are they loaded from? I thought there were two places where memory can be allocated, the thread stack and the heap, and that variables should be stored on the stack.

Now I suppose that the stack in question is just an "evaluation stack", while memory allocation for variables is an implementation detail that depends on the platform and the JIT compiler. So we can actually split the memory used by our program into the evaluation stack, the managed heap, and the memory allocated for locals.

Is this true? Or is there some other mechanism here?


Answer 1:


You are severely conflating many things which are logically different:

  • Just because a variable is a local variable in C# does not mean that it is on the short term storage pool in IL. Local variables in C# can correspond to short term storage, long term storage, or evaluation stack in the corresponding IL.
  • Short term storage and evaluation stack in IL can correspond to stack or register storage in the jitted machine code.

When compiling C# to IL, the C# compiler makes locals members of a closure class -- they go on the long term storage pool -- when the lifetime of the local is possibly longer than the activation of the method. (Or when the activation of the method is broken up into little pieces, like it is in an async method.)

If the locals have short lifetimes then the compiler's optimizer chooses whether they go on the short term pool or the evaluation stack; the compiler calls the latter "ephemeral" locals. The algorithm for deciding when to make a local into an ephemeral is interesting; see the compiler source code for details.

The jitter then must decide whether to make short term pool variables and evaluation stack variables into stack locations or registers; it does so again using a complex optimization algorithm that varies depending on register availability and so on.

Finally, of course the C# compiler and the jitter are both free to reify an unread local as nothing at all; storage that is never read from need not actually be allocated.
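The point about lifetimes can be observed from plain C#. The following is a minimal sketch, not taken from the original answer; the names ClosureDemo and MakeCounter are hypothetical. The local count is declared inside MakeCounter, yet it keeps its value after MakeCounter has returned, which is only possible if it lives in long term storage rather than in the method's activation.

```csharp
using System;

public static class ClosureDemo
{
    // 'count' is a C# local, but the returned lambda captures it, so its
    // lifetime may be longer than this activation of MakeCounter. The C#
    // compiler therefore makes it a field of a generated closure class.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }

    public static void Main()
    {
        Func<int> next = MakeCounter();
        // MakeCounter has already returned, yet 'count' is still alive.
        Console.WriteLine(next()); // prints 1
        Console.WriteLine(next()); // prints 2
    }
}
```

Each call to MakeCounter creates a fresh closure object, so two counters made this way do not share state.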




Answer 2:


You wrote: "And IL disassemble for this"

That's unoptimised code, typically produced by debug builds. Optimised code, typically produced by release builds, would more likely be something like:

  // Code size       13 (0xD)
  .maxstack  1
  IL_0000:  ldc.i4.1
  IL_0001:  box        [mscorlib]System.Int32
  IL_0006:  unbox.any  [mscorlib]System.Int32
  IL_000B:  pop
  IL_000C:  ret

The two most obvious differences between your version and mine are:

  1. Mine doesn't have the nop instruction. That "do nothing" instruction serves no purpose in the running code, but it does give a point to hang a breakpoint on in the IL if you put a breakpoint on the opening { of the C# you compiled from.
  2. Yours does some busy-work storing and loading copies of variables that mine doesn't bother with.

(This isn't the most optimal version possible, incidentally).

It's important to consider that not only does the mapping of C# locals to IL locals vary according to the type of build, but the same also applies at the jitting stage.

There are even bigger differences when it comes to something like:

public static void Stuff()
{
    int x = 2;
    Func<int> f = () => x * 2;
}

Here x and f are both locals in the C#, but in the IL there's actually a heap object with a field and a method.
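As a rough illustration of what that heap object looks like, here is a hand-written approximation of the lowering. It is a sketch only: the class name StuffClosure and the method name Body are hypothetical stand-ins (the compiler generates its own unspeakable names), and Stuff returns the delegate here purely so the result can be observed.

```csharp
using System;

public static class LoweringSketch
{
    // Hypothetical stand-in for the compiler-generated closure class.
    private sealed class StuffClosure
    {
        public int x;                // the captured "local" is now a field on the heap
        public int Body() => x * 2;  // the lambda body becomes an instance method
    }

    public static Func<int> Stuff()
    {
        var closure = new StuffClosure(); // heap allocation replaces the local x
        closure.x = 2;                    // int x = 2;
        Func<int> f = closure.Body;       // Func<int> f = () => x * 2;
        return f;
    }

    public static void Main()
    {
        Console.WriteLine(Stuff()()); // prints 4
    }
}
```

In this form the only true local left in Stuff is the reference to the closure object; x itself has no IL local slot at all.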

"Local" means different things depending on the context, both as an adjective and as a noun.

In C#, "local" covers method arguments and the local variables within methods, as both a noun and an adjective. They are generally allocated on "the stack", for several meanings of "the stack" that we'll come to later (though for reference types the stack-allocated variable merely refers to a heap-allocated object). But not always: captured locals, and locals within yielding or awaiting methods that are maintained between calls (and sometimes ones that aren't), are two examples. Most of the time we don't need to think much about this, but the few times we do tend to lead to an overemphasis on the concept in how we talk about it.

In IL, "local" as a noun refers to a set of strongly typed locations initialised at the start of the method. As an adjective it refers both to those locals and to the locations on the stack that we push to and pop from (in IL we do need to think about the stack, a lot). Both are local locations, insofar as they can be thought of as "nearby", but only the former are generally referred to as locals when we're talking CIL. (If we were talking theory more generally we might call all of them locals, or not, depending on the point of view we were talking theory from.)
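The IL locals in the noun sense can be listed at run time through reflection. The sketch below assumes a runtime where MethodBase.GetMethodBody is available; the class name LocalsInspector is hypothetical. No output is shown because it depends on the build: a debug build would be expected to list slots matching the .locals declaration in the question, while an optimised build may list none.

```csharp
using System;
using System.Reflection;

public static class LocalsInspector
{
    // The method from the question, kept here so the sketch is self-contained.
    public static void DoStuff()
    {
        int x = 1;
        object obj = x;
        int y = (int)obj;
    }

    // Returns the IL-level description of DoStuff.
    public static MethodBody GetBody()
    {
        MethodInfo method = typeof(LocalsInspector).GetMethod(nameof(DoStuff));
        return method.GetMethodBody();
    }

    public static void Main()
    {
        MethodBody body = GetBody();
        Console.WriteLine("maxstack: " + body.MaxStackSize);
        Console.WriteLine("init locals: " + body.InitLocals);
        // Each entry is one strongly typed slot from the .locals declaration.
        foreach (LocalVariableInfo local in body.LocalVariables)
        {
            Console.WriteLine("[" + local.LocalIndex + "] " + local.LocalType);
        }
    }
}
```

Note that this reports only the IL view. It says nothing about whether the jitter later puts a given slot in stack memory, in a register, or nowhere.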

You asked: "But where does the locals really stored and where does they loaded from."

It depends on what you really mean by "really". But consider why we use stacks in the first place: they are a handy (but not the only) way to implement methods calling methods. You put some values on the stack, along with information about where you are now, and move into that method. Then, when it is done, you have any return value on the stack and can do the next thing.

And the IL locals are a chunk of space after the arguments and before the chunk of stack you are working with, just as the layout of the IL reflects: arguments, locals, pushing and popping.

And that's sort of how things work in the actual machine too, as we can most easily see when it doesn't work:

public static void Overflow()
{
    Overflow();
    Overflow();
}

Call that and we get a StackOverflowException (I have it call itself twice because then tail-call optimisation can't turn that exception into just never returning, which is just about possible). Which means an actual real chunk of memory being used as a stack got all used up.

And it won't be surprising that the actual stack and the IL stack have a definite relationship and so the method arguments, locals (in the IL noun sense) and the values pushed and popped can all relate to values stored in that chunk of memory.

But they can also be implemented as registers in the CPU, so a local might never be in memory at all.

And they might also not even be there. Consider my release-build version of your code. Actually, let's make it a full C# method:

public static void DoStuff()
{
    int x = 1;
    object obj = x;
    int y = (int)obj;
}

Now let's have something call it:

public static void CallDoStuff()
{
    DoStuff();
}

So the compiler has turned DoStuff() into the code at the top of this answer, along with turning CallDoStuff() into:

call DoStuff
ret

We run our application and come to where CallDoStuff() is first called, so the jitter has to compile it. It very likely sees that DoStuff() is very small and (along with a few other factors weighing on this decision) doesn't produce a function call at all but inlines all of those instructions into the code it's producing for CallDoStuff. Then it may see that the unboxed int (y) isn't used so it can leave that out, which means it can leave boxing the int out, which means it can leave producing the int out, which means we don't need any actual code for x, obj and y at all.

In which case the answer at that level as to where the values "really" are, is "nowhere".
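Whether a call gets inlined is the jitter's decision, but C# does let you state a preference. The sketch below uses the real MethodImplOptions.NoInlining and MethodImplOptions.AggressiveInlining flags; the class and method names are hypothetical. Both methods return the same value either way; the attributes only influence whether the box and unbox can be folded into the caller and possibly optimised away, and AggressiveInlining is a hint rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class InliningHints
{
    // Asks the jitter to keep this as a real call, so the body stays
    // a separate piece of machine code.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int RoundTripKept(int value)
    {
        object boxed = value;
        return (int)boxed;
    }

    // Asks the jitter to inline this where it can; once inlined, the
    // jitter is free to simplify or remove the intermediate values.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int RoundTripInlinable(int value)
    {
        object boxed = value;
        return (int)boxed;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripKept(1));      // prints 1
        Console.WriteLine(RoundTripInlinable(1)); // prints 1
    }
}
```

Confirming what the jitter actually did requires looking at the generated machine code, for example in a debugger's disassembly view.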



Source: https://stackoverflow.com/questions/47520657/where-does-local-variables-actually-allocated-within-clr
