How do garbage collectors know about references on the stack frame?

后端 未结 3 1168
走了就别回头了
走了就别回头了 2021-02-04 13:37

What techniques do modern garbage collectors (as in CLR, JVM) use to tell which heap objects are referenced from the stack?

Specificall

相关标签:
3条回答
  • 2021-02-04 13:47

    In Java (and likely in the CLR although I know its internals less well), the bytecode is typed with object vs primitive information. As a result, there are data structures in the bytecode that describe which variables in each stack frame are objects and which are primitives. When the GC needs to scan the root set, it uses these StackMapTables to differentiate between references and non-references.

    CLR and Java have to have some mechanism like this because they are exact collectors. There are conservative collectors like the boehm collector that treat every offset on the stack as a possible pointer. They look to see if the value (when treated as a pointer) is an offset into the heap, and if so, they mark it as alive.

    0 讨论(0)
  • 2021-02-04 13:58

    Interesting documentation on this topic posted up by the .Net team shortly after the they made CoreCLR open source: Stack Walking

    0 讨论(0)
  • 2021-02-04 14:08

    Take a look at this Artima article from August 1996, Java's Garbage-Collected Heap; especially page 2.

    Any garbage collection algorithm must do two basic things. First, it must detect garbage objects. Second, it must reclaim the heap space used by the garbage objects and make it available to the program. Garbage detection is ordinarily accomplished by defining a set of roots and determining reachability from the roots. An object is reachable if there is some path of references from the roots by which the executing program can access the object. The roots are always accessible to the program. Any objects that are reachable from the roots are considered live. Objects that are not reachable are considered garbage, because they can no longer affect the future course of program execution.

    In a JVM the root set is implementation dependent but would always include any object references in the local variables. In the JVM, all objects reside on the heap. The local variables reside on the Java stack, and each thread of execution has its own stack. Each local variable is either an object reference or a primitive type, such as int, char, or float. Therefore the roots of any JVM garbage-collected heap will include every object reference on every thread's stack. Another source of roots are any object references, such as strings, in the constant pool of loaded classes. The constant pool of a loaded class may refer to strings stored on the heap, such as the class name, superclass name, superinterface names, field names, field signatures, method names, and method signatures.

    Any object referred to by a root is reachable and is therefore a live object. Additionally, any objects referred to by a live object are also reachable. The program is able to access any reachable objects, so these objects must remain on the heap. Any objects that are not reachable can be garbage collected because there is no way for the program to access them.

    The article continues to explore different garbage collection strategies, including reference counting collectors, tracing collectors, compacting collectors and copying collectors.


    Though this article is old, it still applies today; not much has really changed. There have been performance improvements to the different collection strategies, but no new major advancements.

    The Oracle HotSpot JVM, for example, has a new Garbage-First Garbage Collector which is a copying collector with performance tweaks for multi-core processors and large heap sizes (see this answer for more on the G1 Garbage Collector).

    0 讨论(0)
提交回复
热议问题