Reducing JVM pause time > 1 second using UseConcMarkSweepGC

后端 未结 8 1954
名媛妹妹
名媛妹妹 2020-12-10 04:56

I\'m running a memory intensive app on a machine with 16Gb of RAM, and an 8-core processor, and Java 1.6 all running on CentOS release 5.2 (Final). Exact JVM details are:

相关标签:
8条回答
  • 2020-12-10 05:22

    Some places to start looking:

    • https://visualvm.dev.java.net/
    • http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstat.html
    • http://www.javaperformancetuning.com/tools/gcviewer/index.shtml

    Also I'd run the code through a profiler.. I like the one in NetBeans but there are other ones as well. You can view the gc behaviour in real time. The Visual VM does that as well... but I haven't run it yet (been looking for a reason to... but haven't had the time or the need yet).

    0 讨论(0)
  • 2020-12-10 05:23

    Turns out that part of the heap was getting swapped out to disk, so that garbage collection had to pull a bunch of data off the disk back into memory.

    I resolved this by setting Linux's "swappiness" parameter to 0 (so that it wouldn't swap data out to disk).

    0 讨论(0)
  • 2020-12-10 05:23

    Here are some things I have found which could be significant.

    • JSON-RPC can generate a lot of objects. Not as much as XML-RPC, but still something to watch for. In any case you do appear to be generating as much at 100 MB of objects per second which means your GC is running a high percentage of the time and is likely to be adding to your random latency. Even though the GC is concurrent, your hardware/OS is very likely to exhibit non-ideal random latency under load.
    • Have a look at your memory bank architecture. On Linux the command is numactl --hardware. If your VM is being split across more than one memory bank this will increase your GC times significantly. (It will also slow down your application as these accesses can be significantly less efficient) The harder you work the memory subsystem the more likely the OS will have to shift memory around (Often in large amounts) and you get dramatic pauses as a result (100 ms is not surprising). Don't forget your OS does more than just run your app.
    • Consider compacting/reducing the memory consumption of your cache. If you are using multiple GB of cache it is worth looking at ways to cut memory consumption further than you have already.
    • I suggest you profile your app with memory allocation tracing AND cpu sampling on at the same time. This can yield very different results and often points to the cause of these sort of problems.

    Using these approaches, the latency of an RPC call can be reduced to below 200 micro-second and the GC times reduced to 1-3 ms effecting less than 1/300 of calls.

    0 讨论(0)
  • 2020-12-10 05:24

    First, check out the Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning documentation, if you haven't already done so. This documentation says:

    the concurrent collector does most of its tracing and sweeping work with the application threads still running, so only brief pauses are seen by the application threads. However, if the concurrent collector is unable to finish reclaiming the unreachable objects before the tenured generation fills up, or if an allocation cannot be satisfied with the available free space blocks in the tenured generation, then the application is paused and the collection is completed with all the application threads stopped. The inability to complete a collection concurrently is referred to as concurrent mode failure and indicates the need to adjust the concurrent collector parameters.

    and a little bit later on...

    The concurrent collector pauses an application twice during a concurrent collection cycle.

    I notice that those GCs don't seem to be freeing very much memory. Perhaps many of your objects are long lived? You may wish to tune the generation sizes and other GC parameters. 10 Gig is a huge heap by many standards, and I would naively expect GC to take longer with such a huge heap. Still, 1 second is a very long pause time and indicates either something is wrong (your program is generating a large number of unneeded objects or is generating difficult-to-reclaim objects, or something else) or you just need to tune the GC.

    Usually, I would tell someone that if they have to tune GC then they have other problems they need to fix first. But with an application of this size, I think you fall into the territory of "needing to understand GC much more than the average programmer."

    As others have said, you need to profile your application to see where the bottleneck is. Is your PermGen too large for the space allocated to it? Are you creating unnecessary objects? jconsole works to at least show a minimum of information about the VM. It's a starting point. As others have indicated however, you very likely need more advanced tools than this.

    Good luck.

    0 讨论(0)
  • 2020-12-10 05:28

    I'd also suggest GCViewer and a profiler.

    0 讨论(0)
  • 2020-12-10 05:29

    A few things that I hope can help:

    I've never had a lot of luck with the ConcurrentCollector, in theory it sacrifices throughput for the gain of reduced latency, but I've found better luck with the throughput collector for both throughput and latency (with tuning, and for my apps).

    Your Cache of Soft References is a little bit of a dangerous idea for Generational Collectors, and is probably one reason why your young generation collections arn't collecting too much garbage.

    If I'm not mistaken, no matter how short-lived an Object is, if it gets put into the cache (which for sure made it into the Tenured Generation), it will be alive until a FullGC takes place, even if no other references to it exist!

    What this means is your objects that live in the young gen that are put into the cache are now copied multiple times, kept alive, keeping their references alive, and generally slowing down the youngGen GC.

    Its kind of paradoxical how caching can reduce object allocation but increaes GC time.

    You may also want to try adjusting your survivor ratio, it may be too small overflowing even more 'young' objects into the tenured generation.

    0 讨论(0)
提交回复
热议问题