How does the C# garbage collector find objects whose only reference is an interior pointer?

感动是毒 2020-12-14 03:09

In C#, ref and out params are, as far as I know, passed by passing only the raw address of the relevant value. That address may be an interior poin

3 Answers
  • 2020-12-14 03:42

    Your code compiles to

        IL_0001: newobj instance void Foo::.ctor()
        IL_0006: ldflda int32 Foo::'field'
        IL_000b: call void Foo::Increment(int32&)
    

    AFAIK, the ldflda instruction pushes a managed pointer (int32&) to the field, and the GC treats that pointer as a reference to the object containing the field for as long as the address is live on the stack (here, until the call completes).

  • 2020-12-14 03:57

    The garbage collector will have a fast way to find the start of an object from a managed interior pointer. From there it can mark the object as "referenced" during the mark phase, just as it would for an ordinary reference.

    I don't have the code for the Microsoft collector, but it probably uses something similar to Go's span table, which gives a fast lookup over "spans" of memory keyed on the most significant bits of the pointer (how many bits depends on how large you choose the spans to be). Because each span holds objects of a single size, one division finds the header of the object you are pointing into, so it's pretty much an O(1) operation. The Microsoft heap will obviously be different, since it allocates objects sequentially without regard for object size, but it will have some sort of O(1) lookup structure.

    https://github.com/puppeh/gcc-6502/blob/master/libgo/runtime/mgc0.c

    // Otherwise consult span table to find beginning.
    // (Manually inlined copy of MHeap_LookupMaybe.)
    k = (uintptr)obj>>PageShift;
    x = k;
    x -= (uintptr)runtime_mheap.arena_start>>PageShift;
    s = runtime_mheap.spans[x];
    if(s == nil || k < s->start || (const byte*)obj >= s->limit || s->state != MSpanInUse)
        return false;
    p = (byte*)((uintptr)s->start<<PageShift);
    if(s->sizeclass == 0) {
        obj = p;
    } else {
        uintptr size = s->elemsize;
        int32 i = ((const byte*)obj - p)/size;
        obj = p+i*size;
    }
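For a heap that lays out mixed-size objects back to back, the division trick in the excerpt above does not apply. One possible alternative is a coarse side table that remembers, for every fixed-size block of the heap, which object covers the block's first byte; the lookup then walks forward object by object. The sketch below is a toy under stated assumptions, not the actual .NET or Go code: `Heap`, `heap_alloc`, `find_object_start`, the `covering` table and the 256-byte block size are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy heap: objects are allocated sequentially, and each one
   begins with a size_t header holding its total size in bytes. */
enum { HEAP_BYTES = 4096, BLOCK_BYTES = 256, NUM_BLOCKS = HEAP_BYTES / BLOCK_BYTES };
#define NO_OBJECT ((size_t)-1)

typedef struct {
    unsigned char mem[HEAP_BYTES];
    size_t used;                  /* bump pointer: bytes handed out so far */
    size_t covering[NUM_BLOCKS];  /* offset of the object covering each block's first byte */
} Heap;

static void heap_init(Heap *h) { h->used = 0; }

static size_t obj_size(const Heap *h, size_t off) {
    size_t s;
    memcpy(&s, h->mem + off, sizeof s);   /* memcpy avoids alignment and aliasing issues */
    return s;
}

/* Allocate a header plus payload_bytes; returns the payload address or NULL. */
static void *heap_alloc(Heap *h, size_t payload_bytes) {
    if (payload_bytes > HEAP_BYTES) return NULL;
    size_t total = (sizeof(size_t) + payload_bytes + 15u) & ~(size_t)15u;
    if (total > HEAP_BYTES - h->used) return NULL;
    size_t off = h->used;
    memcpy(h->mem + off, &total, sizeof total);
    /* Record this object for every block whose first byte it covers. */
    for (size_t b = (off + BLOCK_BYTES - 1) / BLOCK_BYTES;
         b < NUM_BLOCKS && b * BLOCK_BYTES < off + total; b++)
        h->covering[b] = off;
    h->used += total;
    return h->mem + off + sizeof(size_t);
}

/* Map any address inside the used part of the heap to the offset of the
   object that contains it, or NO_OBJECT if the address is not in the heap. */
static size_t find_object_start(const Heap *h, const void *interior) {
    uintptr_t base = (uintptr_t)h->mem, addr = (uintptr_t)interior;
    if (addr < base || addr - base >= h->used) return NO_OBJECT;
    size_t off = (size_t)(addr - base);
    size_t cur = h->covering[off / BLOCK_BYTES];   /* O(1) jump near the target */
    assert(cur <= off);
    while (cur + obj_size(h, cur) <= off)          /* short walk within one block */
        cur += obj_size(h, cur);
    return cur;
}
```

The table lookup is constant time, and the walk that follows is bounded by the number of objects that start inside a single block, so smaller blocks trade memory for a shorter walk.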
    

    Note that the .NET garbage collector is a compacting collector, so managed/interior pointers need to be updated whenever the object they point into is moved during a garbage collection cycle. The GC knows which stack slots hold interior pointers in each stack frame from the GC information the JIT emits when it compiles the method.
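Updating an interior pointer after a move comes down to preserving its offset from the start of the object. A minimal sketch, assuming the collector already knows the object's old address, new address and size; `relocate_interior` is a hypothetical helper, not a runtime API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Move an object of `size` bytes from old_base to new_base and return the
   relocated value of one interior pointer into it. */
static void *relocate_interior(const void *old_base, void *new_base,
                               size_t size, const void *interior) {
    const unsigned char *ob = old_base;
    const unsigned char *ip = interior;
    assert(ip >= ob && ip < ob + size);      /* must point inside the object */
    size_t offset = (size_t)(ip - ob);       /* distance from the object start */
    memmove(new_base, old_base, size);       /* memmove tolerates overlapping ranges */
    return (unsigned char *)new_base + offset;
}
```

A real collector would apply the same offset arithmetic to every stack slot and register that the JIT's GC information flags as holding an interior pointer.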

  • 2020-12-14 04:04

    The garbage collector works in three basic steps:

    1. Mark all objects that are still alive.
    2. Collect the objects that are not marked as alive.
    3. Compact the memory.

    Your concern is step 1: How does the GC figure out that it shouldn't collect objects behind ref and out params?

    When the GC performs a collection, it starts with a state where none of the objects is considered alive. It then starts from the root references and marks the objects they point to as alive. Root references include all references on the stack, in CPU registers and in static fields. Then the GC goes recursively into the marked objects and marks every object they reference as alive. This is repeated until no reachable object is left unmarked. The result of this operation is the set of objects reachable from the roots, i.e. the live object graph.
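The marking described above can be sketched as a worklist traversal. This is a toy model under assumptions: `Obj`, `MAX_REFS`, `MAX_OBJECTS` and `mark_from_roots` are invented for illustration, and a real collector finds references through type metadata rather than a fixed array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_REFS = 4, MAX_OBJECTS = 64 };

typedef struct Obj {
    bool marked;
    struct Obj *refs[MAX_REFS];   /* outgoing references, NULL when unused */
} Obj;

/* Mark everything reachable from the roots and return how many objects were
   marked. Uses an explicit stack instead of recursion; assumes the graph
   holds at most MAX_OBJECTS objects. */
static size_t mark_from_roots(Obj **roots, size_t nroots) {
    Obj *stack[MAX_OBJECTS];
    size_t top = 0, count = 0;

    for (size_t i = 0; i < nroots; i++) {
        if (roots[i] && !roots[i]->marked) {
            roots[i]->marked = true;          /* mark on push, so each object is pushed once */
            assert(top < MAX_OBJECTS);
            stack[top++] = roots[i];
        }
    }
    while (top > 0) {
        Obj *o = stack[--top];
        count++;
        for (size_t j = 0; j < MAX_REFS; j++) {
            Obj *child = o->refs[j];
            if (child && !child->marked) {
                child->marked = true;
                assert(top < MAX_OBJECTS);
                stack[top++] = child;
            }
        }
    }
    return count;
}
```

Because an object is marked before it is pushed, a cycle is visited only once, and a cycle that no root reaches is never visited at all.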

    A ref or out parameter is a managed pointer held on the stack (or in a register), so the GC will mark the object it points into as alive, because the stack is a root for the object graph.

    At the end of the process, objects that are referenced only by other unreachable objects are not marked, because there is no path from the root references that would reach them. This takes care of all circular references, too. These objects are considered dead and will be collected in the next step. Objects with a finalizer are queued for finalization first and reclaimed in a later collection, and there is no guarantee about when, or even whether, the finalizer runs.

    At the end, the GC will move all live objects to a contiguous area of memory at the beginning of the heap. The rest of the memory will be filled with zeroes. That simplifies the process of creating new objects, since their memory can always be allocated at the end of the heap and all fields already have the default values.
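The compaction step can be sketched as sliding every live object down over the dead ones and zeroing what is left. This is a toy under assumptions: `Header`, `Heap`, `heap_alloc` and `compact` are hypothetical names, the `live` flag stands in for the mark bit, and the sketch deliberately leaves out the pointer fix-up a real collector must do for every reference to a moved object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { HEAP_BYTES = 1024 };
#define NO_SPACE ((size_t)-1)

typedef struct { size_t size; bool live; } Header;   /* size includes the header */
typedef struct { unsigned char mem[HEAP_BYTES]; size_t used; } Heap;

/* Append an object at the end of the used region; returns its header offset. */
static size_t heap_alloc(Heap *h, size_t payload_bytes, bool live) {
    if (payload_bytes > HEAP_BYTES - sizeof(Header)) return NO_SPACE;
    Header hd = { sizeof(Header) + payload_bytes, live };
    if (hd.size > HEAP_BYTES - h->used) return NO_SPACE;
    size_t off = h->used;
    memcpy(h->mem + off, &hd, sizeof hd);
    h->used += hd.size;
    return off;
}

/* Slide live objects to the start of the heap, zero the freed tail,
   and return the new number of bytes in use. */
static size_t compact(Heap *h) {
    size_t src = 0, dst = 0;
    while (src < h->used) {
        Header hd;
        memcpy(&hd, h->mem + src, sizeof hd);
        if (hd.live) {
            if (dst != src) memmove(h->mem + dst, h->mem + src, hd.size);
            dst += hd.size;
        }
        src += hd.size;
    }
    memset(h->mem + dst, 0, h->used - dst);   /* the next allocation sees zeroed memory */
    h->used = dst;
    return dst;
}
```

After `compact` returns, `h->used` again marks the end of the heap, so the next allocation lands directly behind the last surviving object.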

    It is true that the GC needs some time to do all of this, but it still does it reasonably fast, due to some optimizations. One of the optimizations is to separate the heap into generations. All newly allocated objects are generation 0. Objects that survive a collection are promoted to generation 1, and survivors of a generation 1 collection are promoted to generation 2, which is the highest. Higher generations are only collected if collecting lower generations does not free up enough memory. So, no, the GC does not always have to scan the entire heap.

    You have to consider that, while a collection takes some time, allocating new objects (which happens much more often than a garbage collection) is much faster than in other implementations, where the heap looks more like Swiss cheese and you need some time to find a hole big enough for the new object (which you still need to initialize).
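The difference in allocation cost can be sketched by putting the two strategies side by side. Both allocators below are illustrative assumptions (`BumpArena`, `bump_alloc`, `Hole` and `first_fit` are made-up names); alignment and the splitting of holes are ignored to keep the comparison short.

```c
#include <assert.h>
#include <stddef.h>

/* Compacted heap: allocation is one bounds check and one addition. */
typedef struct { unsigned char *next, *limit; } BumpArena;

static void *bump_alloc(BumpArena *a, size_t n) {
    if (n > (size_t)(a->limit - a->next)) return NULL;
    void *p = a->next;
    a->next += n;
    return p;
}

/* Fragmented heap: walk the list of holes until one is large enough. */
typedef struct Hole { size_t size; struct Hole *next; } Hole;

static Hole *first_fit(Hole *list, size_t n, size_t *holes_visited) {
    *holes_visited = 0;
    for (Hole *h = list; h != NULL; h = h->next) {
        ++*holes_visited;
        if (h->size >= n) return h;
    }
    return NULL;
}
```

`bump_alloc` does the same amount of work for every request, while the cost of `first_fit` grows with the number of holes that are too small, which is the search the answer refers to.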
