Java G1 garbage collection in production

非 Y 不嫁゛ 提交于 2019-11-26 23:35:08

It sounds like the point of G1 is to have smaller pause times, even to the point where it has the ability to specify a maximum pause time target.

Garbage collection isn't just a simple "Hey, it's full, let's move everything at once and start over" deal any more--it's fantastically complex, multi-level, background threaded system. It can do much of its maintenance in the background with no pauses at all, and it also uses knowledge of the system's expected patterns at runtime to help--like assuming most objects die right after being created, etc.

I would say GC pause times are going to continue to improve, not worsen, with future releases.

EDIT:

in re-reading it occurred to me that I use Java daily--Eclipse, Azureus, and the apps I develop, and it's been a LONG TIME since I saw a pause. Not a significant pause, but I mean any pause at all.

I've seen pauses when I right-click on windows explorer or (occasionally) when I hook up certain USB hardware, but with Java---none at all.

Is GC still an issue with anyone?

I've been testing it out with a heavy application: 60-70GB allocated to heap, with 20-50GB in use at any time. With these sorts of applications, it's an understatement to say that your mileage may vary. I'm running JDK 1.6_22 on Linux. The minor versions are important-- before about 1.6_20, there were bugs in G1 that caused random NullPointerExceptions.

I've found that it is very good at keeping within the pause target you give it most of the time. The default appears to be a 100ms (0.1 second) pause, and I've been telling it to do half that (-XX:MaxGCPauseMillis=50). However, once it gets really low on memory, it panics and does a full stop-the-world garbage collection. With 65GB, that takes between 30 seconds and 2 minutes. (The number of CPUs probably doesn't make a difference; it's probably limited by the bus speed.)

Compared with CMS (which is not the default server GC, but it should be for web servers and other real-time applications), typical pauses are much more predictable and can be made much shorter. So far I'm having better luck with CMS for the huge pauses, but that may be random; I'm seeing them only a few times every 24 hours. I'm not sure which one will be more appropriate in my production environment at the moment, but probably G1. If Oracle keeps tuning it, I suspect G1 will ultimately be the clear winner.

If you're not having a problem with the existing garbage collectors, there's no reason to consider G1 right now. If you are running a low-latency application, such as a GUI application, G1 is probably the right choice, with MaxGCPauseMillis set really low. If you're running a batch-mode application, G1 doesn't buy you anything.

Although I have not tested G1 in production, I thought I would comment that GCs are already problematic for cases without "humongous" heaps. Specifically services with just, say, 2 or 4 gigs can be severely impacted by GC. Young generation GCs are usually not problematic as they finish in single-digit milliseconds (or at most double-digit). But old-generation collections are much more problematic as they take multiple seconds with old-gen sizes of 1 gig or above.

Now: in theory CMS can help a lot there, as it can run most of its operation concurrently. However, over time there will be cases where it can not do this and has to fall back to "stop the world" collection. And when that happens (after, say, 1 hour -- not often, but still too often), well, hold on to your f***ing hats. It can take a minute or more. This is especially problematic for services that try to limit maximum latency; instead of it taking, say, 25 milliseconds to serve a request it now takes ten second or more. To add injury to insult clients will then often time out the request and retry, leading to further problems (aka "shit storm").

This is one area where G1 was hoped to help a lot. I worked for a big company that offers cloud services for storage and message dispatching; and we could not use CMS since although much of the time it worked better than parallel varieties, it had these meltdowns. So for about an hour things were nice; and then stuff hit the fan... and because service was based on clusters, when one node got in trouble, others typically followed (since GC-induced timeouts lead to other nodes believe node had crashed, leading to re-routes).

I don't think GC is that much of a problem for apps, and perhaps even non-clustered services are less often affected. But more and more systems are clustered (esp. thanks to NoSQL data stores) and heap sizes are growing. OldGen GCs are super-linearly related to heap size (meaning that doubling heap size more than doubles GC time, assuming size of live data set also doubles).

Azul's CTO, Gil Tene, has a nice overview of the problems associated with Garbage Collection and a review of various solutions in his Understanding Java Garbage Collection and What You Can Do about It presentation, and there's additional detail in this article: http://www.infoq.com/articles/azul_gc_in_detail.

Azul's C4 Garbage Collector in our Zing JVM is both parallel and concurrent, and uses the same GC mechanism for both the new and old generations, working concurrently and compacting in both cases. Most importantly, C4 has no stop-the-world fall back. All compaction is performed concurrently with the running application. We have customers running very large (hundreds of GBytes) with worse case GC pause times of <10 msec, and depending on the application often times less than 1-2 msec.

The problem with CMS and G1 is that at some point Java heap memory must be compacted, and both of those garbage collectors stop-the-world/STW (i.e. pause the application) to perform compaction. So while CMS and G1 can push out STW pauses, they don't eliminate them. Azul's C4, however, does completely eliminate STW pauses and that's why Zing has such low GC pauses even for gigantic heap sizes.

And to correct a statement made in an earlier answer, Zing does not require any changes to the Operating System. It runs just like any other JVM on unmodified Linux distros.

We are already using G1GC, from almost two years. Its doing great in our mission critical transaction processing system, and It proved to be a great support w.r.t high throughput, low pauses, concurrency and optimized heavy memory management.

We are using following JVM settings:

-server -Xms512m -Xmx3076m -XX:NewRatio=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -XX:+AggressiveOpts -XX:+UnlockExperimentalVMOptions -XX:MaxGCPauseMillis=400 -XX:GCPauseIntervalMillis=8000 -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime

Updated

-d64 -server -Xss4m -Xms1024m -Xmx4096m -XX:NewRatio=50 -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:+HeapDumpOnOutOfMemoryError -XX:-DisableExplicitGC -XX:+AggressiveOpts -Xnoclassgc -XX:+UseNUMA -XX:+UseFastAccessorMethods -XX:ReservedCodeCacheSize=48m -XX:+UseStringCache -XX:+UseStringDeduplication -XX:MaxGCPauseMillis=400 -XX:GCPauseIntervalMillis=8000

The G1 collector reduces the impact of full collections. If you have an application where you have already reduced the need for full collections, the Concurrent map Sweep collector is just as good and in my experience has shorter minor collection times.

It seems like G1 starting JDK7u4 is finally officially supported, see the RN for JDK7u4 http://www.oracle.com/technetwork/java/javase/7u4-relnotes-1575007.html.

From our testing still for big JVMs tuned CMS still acts better than G1 but I guess it will grow better.

CMS can lead to slowly degraded performance even if you are running it without accumulating tenured objects. This is because of memory fragmentation which G1 supposedly avoids.

The myth about G1 available only with paid support is just that, a myth. Sun and now Oracle have clarified this on the JDK page.

G1 GC is supposed to work better. But if setting -XX:MaxGCPauseMillis too aggressively, garbage will be collecting too slowly. And that's why full GC triggered in David Leppik's example.

I have just implemented G1 Garbage Collector in our Terracotta Big Memory project. While working on different types of collectors G1 gave us the best results with less than 600ms respond time.

You can find the test results (26 in total) here

Hope it helps.

Ravindra babu

Recently I have been moved from

CMS to G1GC with 4G heap & 8 core processor on servers with JDK 1.7.45.

(JDK 1.8.x G1GC is preferred over 1.7 but due to some limitations, I have to stick to 1.7.45 version)

I have configured below key parameters and kept all other parameters to default values.

-XX:G1HeapRegionSize=n, XX:MaxGCPauseMillis=m, -XX:ParallelGCThreads=n, 
-XX:ConcGCThreads=n apart from -Xms and -Xmx

If you want to fine tune these parameters, have a look at this oracle article.

Key observations:

  1. Memory usage is consistent with G1GC unlike high & lows with CMS
  2. Max GC pause time is less compared to CMS
  3. Time spent in Garbage collection is little bit high in G1GC compared to CMS.
  4. Number of major collections are almost negligible compared to CMS
  5. Number of minor collections are on higher end compared to CMS

But still I am happy that Max GC pause time is less than as of CMS. I have set Max GC pause time as 1.5 seconds and this value has not been crossed yet.

Related SE question:

Java 7 (JDK 7) garbage collection and documentation on G1

I have recently migrated part of Twicsy to a new server with 128GB RAM and decided to use 1.7. I started off using all the same memory settings as I used with 1.6 (I have several instances running doing various things, anywhere from 500mb of heap up to 15GB, and now a new one with 40GB) and that did not work out well at all. 1.7 seems to use more heap than 1.6, and I experienced a lot of issues over the first few days. I luckily had plenty of RAM to work with and bumped up the RAM for most of my processes, but still was having some issues. My normal MO was to use a very small minimum heap size of 16m, even with a max heap of several gigabytes, then turn on incremental GC. This kept the pauses at a minimum. That doesn't work now though, and I had to increase the minimum size to about what I expected to use on average in the heap, and that has worked out very well. I still have incremental GC turned on, but I will be trying it without. No pauses whatsoever now, and things seem to be running very fast. So, I think the moral of the story is don't expect your memory settings to translate perfectly from 1.6 to 1.7.

G1 makes the application a lot more agile: the latancy of the application will raise - the app can be named as "soft-real-time". This is done by replacing two kinds of GC runs (small minor ones and one big on Tenured Gen) to equal-sized small ones.

For more details look at this: http://geekroom.de/java/java-expertise-g1-fur-java-7/

I'm working with Java, for small and large Heap, and the question of the GC and Full GC appears every day, as the constraints may be more strict than others : in certain environment, 0.1 second of scavenger GC or Full GC, kill simply the fonctionnalité, and have fine grained configuration and capability is important (CMS, iCMS, others ... the target is here to have the best possible response time with the nearly real time treatment (here the real time treatment is often 25 ms ), so, basically, any improvements in GC ergonomy ans heuristique are welcome !

I use G1GC on Java 8 and also with Groovy (also Java 8), and I am doing various kinds of workloads, and overally G1GC works like this:

  • The memory usage is very low, e.g. 100MB instead of 500MB compared to default Java settings

  • The response time is consistent and very low

  • The performance between default settings and G1GC is 20% slowdown when using G1GC in worst case scenario (without tuning, single-threaded application). It's not much considering good response time and low memory usage.

  • When running from Tomcat which is multi-threaded, the overall performance is 30% better and memory usage is much lower as well response times are much lower.

So in overall, when using really various workloads, G1GC is very good collector for Java 8 for multi-threaded applications, and even for single-threaded there are some benefits.

It's not suggested to use java8 w/ G1GC for float point calculation with hotspot-like JVM. It's dangerous for application integrity & accuracy.

https://bugs.openjdk.java.net/browse/JDK-8148175

JDK-8165766

JDK-8186112

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!