Question
I heard an associate say:
JVM garbage collection times increase exponentially with JVM size. This is because the tree of references is a function of the number of objects allocated, and it gets exponentially harder to traverse the tree as the number of objects grows.
This sounded right.
I heard another associate say:
JVM garbage collection on the same machine is linear. An 8 GB JVM split into two 4 GB JVMs on the same machine (via microservices) will have the same garbage collection durations, because the same OS is slowing you down for the same number of objects.
This didn't seem right - as the trees of objects on the two smaller JVMs should be shallower and easier to traverse.
My question is: do JVM collection times increase exponentially with JVM RAM size?
Assumption: Oracle JVM used.
Answer 1:
While Holger's explanation is correct, I would like to add a slightly different aspect to it.
The time a GC takes is directly proportional to the number of objects in the live set. This is easily demonstrated. Assume we have two applications with heaps of the same size. In the first heap we allocate 10 objects of 100 MB each, and in the second 10 million objects of 100 bytes each. At the next GC, half of the objects in each application are unreachable (dead) and can be collected.
It is self-evident that tracing the graph with more objects will take longer.
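A minimal sketch of that comparison, assuming roughly 1 GB of live data in both cases and using the time around `System.gc()` as a rough proxy for a full collection (the class name, sizes, and flag below are illustrative; exact numbers depend on the chosen collector):

```java
// Rough sketch: compare a forced full GC over a few large live objects
// vs. many small ones. Run with the same heap for both cases, e.g.:
//   java -Xmx4g LiveSetDemo         (10 objects of ~100 MB)
//   java -Xmx4g LiveSetDemo small   (10 million objects of ~100 bytes)
import java.util.ArrayList;
import java.util.List;

public class LiveSetDemo {
    public static void main(String[] args) {
        List<Object> live = new ArrayList<>();
        boolean manySmall = args.length > 0 && args[0].equals("small");

        if (manySmall) {
            for (int i = 0; i < 10_000_000; i++) {
                live.add(new byte[100]);               // 10 million ~100-byte objects
            }
        } else {
            for (int i = 0; i < 10; i++) {
                live.add(new byte[100 * 1024 * 1024]); // 10 objects of 100 MB
            }
        }

        long start = System.nanoTime();
        System.gc();                                   // request a full collection
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("GC over " + live.size() + " live objects took ~" + millis + " ms");
    }
}
```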
(As an aside, I remember reading a measurement of 'shallow and wide' vs. 'deep and narrow' object graphs that found no perceptible difference, but I can't remember where. @Holger: if you have a source, I would love to read it.)
Note that following established Java coding practices will in fact ensure that the live set is small. The JVM expects you to code that way and goes to pretty great lengths to help keep the live set small, escape analysis being just one trick up HotSpot's sleeve.
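As an illustration of the escape-analysis point, here is a hedged sketch (the `Point` class and the loop are made up for the example): the allocation inside the loop never escapes the method, so HotSpot can usually scalar-replace it and it never contributes to the live set; whether that actually happens depends on JIT warm-up and inlining decisions.

```java
// Sketch of a non-escaping allocation: escape analysis can typically
// scalar-replace the Point below, so it never reaches the heap.
public class EscapeDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);   // never escapes this method
            total += p.x + p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(100_000_000)); // enough iterations for the JIT to kick in
    }
}
```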
So, in short: NO
Answer 2:
There is no such simple dependency.
First of all, considering the “garbage collection” as a function over object references obviously refers only to the marking phase, ignoring the costs of allocation, deallocation, or copying/moving objects. The marking costs depend on the number of live references that have to be traversed; neither dead objects nor unused memory has any impact on them. Therefore, just giving more RAM to the same application doesn’t necessarily change the garbage collection performance at all.
There is a tendency to use whatever amount of RAM you give the JVM, so providing more RAM may make garbage collection cycles less frequent, while each cycle perhaps needs more time to mark all live objects. But since more time between garbage collections raises the chance of objects becoming unreachable, the marking costs usually don’t scale by the same factor as the time between the collections.
It’s easy to prove that it is actually the other way round in practice. Just take an arbitrary Java application and reduce the available memory down to the point where it barely runs without encountering an OutOfMemoryError. You will see how providing less RAM makes it slower, dramatically slower the closer you get to that point. On the other hand, there is actually no need for a proof that giving an application so much RAM that it never needs a garbage collection during its lifetime has the smallest costs.
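One way to observe this, as a sketch (class name, sizes, and duration are arbitrary, and `-Xlog:gc` requires JDK 9 or later): keep a fixed live set while constantly allocating short-lived garbage, then run the same program with progressively smaller `-Xmx` values and watch collections become more frequent in the GC log.

```java
// Sketch: a fixed live set plus constant allocation churn. Run e.g.:
//   java -Xmx512m -Xlog:gc HeapPressureDemo
//   java -Xmx96m  -Xlog:gc HeapPressureDemo
// The smaller the headroom above the live set, the more frequent the collections.
import java.util.ArrayList;
import java.util.List;

public class HeapPressureDemo {
    public static void main(String[] args) {
        List<byte[]> liveSet = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            liveSet.add(new byte[64 * 1024]);      // ~32 MB that stays reachable
        }
        long end = System.currentTimeMillis() + 10_000;  // churn for ~10 seconds
        long churned = 0;
        while (System.currentTimeMillis() < end) {
            byte[] garbage = new byte[64 * 1024];  // becomes unreachable immediately
            churned += garbage.length;
        }
        System.out.println("Churned ~" + churned / (1024 * 1024) + " MB; live set still "
                + liveSet.size() + " blocks");
    }
}
```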
When we look at the marking phase only, without considering how often it happens, and only consider how it scales with the number of live references, there is still no reason why it should be exponential. Object references may form an arbitrary graph, which is rarely a tree. Further, the garbage collector doesn’t need to traverse every object reference. It only needs to traverse references to objects it has not encountered before (guess why it is called “marking”), which implies that the number of references that need to be traversed is identical to the number of live objects. There might be some cost to find out that a reference doesn’t need to be traversed, but this is still a linear overhead.
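A sketch of the marking idea in plain Java (the `ObjNode` type is a stand-in, not the JVM's internal object representation): every object is marked at most once and already-marked objects are skipped, so the work is linear in the number of live objects and references.

```java
// Minimal mark-phase sketch: trace from the roots, marking each object once.
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class MarkSketch {
    static final class ObjNode {
        final List<ObjNode> references;            // outgoing references (graph may contain cycles)
        ObjNode(List<ObjNode> references) { this.references = references; }
    }

    static Set<ObjNode> mark(List<ObjNode> roots) {
        Set<ObjNode> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<ObjNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            ObjNode current = pending.pop();
            if (!marked.add(current)) continue;    // already marked: no further traversal
            pending.addAll(current.references);    // each live object is discovered exactly once
        }
        return marked;                             // everything not in this set is garbage
    }

    public static void main(String[] args) {
        ObjNode leaf = new ObjNode(List.of());
        ObjNode a = new ObjNode(List.of(leaf));
        ObjNode b = new ObjNode(List.of(leaf, a)); // leaf is shared, but marked only once
        System.out.println("live objects: " + mark(List.of(a, b)).size()); // prints 3
    }
}
```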
JVMs like HotSpot (it’s not a property of Sun anymore) use generational garbage collection and card marking to traverse only the references of new objects and of old objects whose memory section (card) has changed, instead of all live objects. Since both changing old objects and creating new objects require CPU time, the cost doesn’t directly scale with the available RAM.
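The card-marking idea can be sketched conceptually like this (an illustration only, not HotSpot's actual data structures; the 512-byte card size matches a common default but is an assumption here): the old generation is divided into fixed-size cards, a write barrier dirties the card of any old object whose reference field is updated, and a young collection scans only the dirty cards for old-to-young references.

```java
// Conceptual card-table sketch, not the JVM's real implementation.
public class CardTableSketch {
    static final int CARD_SIZE = 512;                      // bytes covered per card (assumed)
    final byte[] cards;

    CardTableSketch(long oldGenBytes) {
        cards = new byte[(int) (oldGenBytes / CARD_SIZE) + 1];
    }

    // Write barrier: conceptually invoked on every reference store into the old generation.
    void onReferenceStore(long oldObjectOffset) {
        cards[(int) (oldObjectOffset / CARD_SIZE)] = 1;    // mark the containing card dirty
    }

    // A young collection only scans dirty cards for old-to-young references.
    int dirtyCardCount() {
        int dirty = 0;
        for (byte c : cards) if (c == 1) dirty++;
        return dirty;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(64L * 1024 * 1024); // 64 MB "old gen"
        table.onReferenceStore(12_345);                                  // simulate one reference store
        System.out.println("dirty cards: " + table.dirtyCardCount());    // prints 1
    }
}
```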
Source: https://stackoverflow.com/questions/48779855/does-jvm-collection-times-increase-exponentially-with-jvm-ram-size