Java process's memory grows indefinitely, but MemoryMXBean reports stable heap and non-heap size


Question


I am working with a team developing a Java GUI application running on a 1GB Linux target system.

We have a problem where the memory used by our Java process grows indefinitely, until Linux finally kills the Java process.

Our heap memory is healthy and stable (we have profiled the heap extensively). We also used MemoryMXBean to monitor the application's non-heap memory usage, since we believed the problem might lie there. However, what we see is that the reported heap size plus the reported non-heap size stays stable.
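For reference, here is a minimal sketch of how such numbers can be read via MemoryMXBean (a hypothetical periodic logger, not our actual monitoring code):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryLogger {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        while (true) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
            // "Committed" is the memory currently guaranteed to be available to the JVM for that pool
            System.out.printf("heap committed: %d MB, non-heap committed: %d MB%n",
                    heap.getCommitted() / (1024 * 1024),
                    nonHeap.getCommitted() / (1024 * 1024));
            Thread.sleep(60_000);
        }
    }
}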

Here is an example of how the numbers might look when running the application on our target system with 1 GB RAM (heap and non-heap as reported by MemoryMXBean; total memory used by the Java process as monitored with Linux's top command, resident memory):

At startup:

  • 200 MB heap committed
  • 40 MB non-heap committed
  • 320 MB used by Java process

After 1 day:

  • 200 MB heap committed
  • 40 MB non-heap committed
  • 360 MB used by Java process

After 2 days:

  • 200 MB heap committed
  • 40 MB non-heap committed
  • 400 MB used by Java process

The numbers above are just a "cleaner" representation of how our system performs, but they are fairly accurate and close to reality. As you can see, the trend is clear. After the application has been running for a couple of weeks, the Linux system starts having problems because it is running out of system memory. Things start slowing down. After a few more hours the Java process is killed.

After months of profiling and trying to make sense of this, we are still at a loss. I feel it is hard to find information about this problem, as most discussions end up explaining the heap or the other non-heap memory pools (like Metaspace).

My questions are as follows:

  1. If you break it down, what does the memory used by a Java process include (in addition to the heap and non-heap memory pools)?

  2. What other potential sources of memory leaks are there (native code? JVM overhead?), and which ones are, in general, the most likely culprits?

  3. How can one monitor/profile this memory? Everything outside the heap and non-heap pools is currently somewhat of a black box for us.

Any help would be greatly appreciated.


Answer 1:


I'll try to partially answer your question.

The basic strategy I try to stick to in such situations is to monitor the max/used/peak values for every available memory pool, open files, sockets, buffer pools, number of threads, etc. There might be a leak of socket connections, open files, or threads that you would otherwise miss.
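As an illustration, here is a minimal sketch of that kind of monitoring (the output format is just illustrative; the open file descriptor count assumes a Unix-like platform that exposes com.sun.management.UnixOperatingSystemMXBean):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class ResourceMonitor {
    public static void dumpStats() {
        // Max/used/peak for every memory pool (heap and non-heap)
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("pool %s: used=%d peak=%d max=%d%n",
                    pool.getName(),
                    pool.getUsage().getUsed(),
                    pool.getPeakUsage().getUsed(),
                    pool.getUsage().getMax());
        }
        // Direct and mapped NIO buffer pools
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("buffer pool %s: count=%d used=%d%n",
                    pool.getName(), pool.getCount(), pool.getMemoryUsed());
        }
        // Live thread count
        System.out.printf("threads: %d%n",
                ManagementFactory.getThreadMXBean().getThreadCount());
        // Open file descriptors (Unix-like platforms only)
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            System.out.printf("open file descriptors: %d%n",
                    ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount());
        }
    }
}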

In your case it looks like you really do have a native memory leak, which is quite nasty and hard to find.

You may try to profile the memory. Take a look at the GC roots and find out which of them are JNI global references. That may help you discover which classes are not being collected. For example, this is a common problem in AWT, which may require explicit component disposal.

To inspect the JVM's internal memory usage (which does not belong to the heap/off-heap memory pools), -XX:NativeMemoryTracking is very handy. Note that it has to be enabled when the JVM is started, e.g. with -XX:NativeMemoryTracking=summary. It allows you to track thread stack sizes, GC/compiler overheads and much more. The greatest thing about it is that you can create a baseline at any point in time and then track memory diffs since the baseline was made:

# jcmd <pid> VM.native_memory baseline
# jcmd <pid> VM.native_memory summary.diff scale=MB

Total:  reserved=664624KB  -20610KB, committed=254344KB -20610KB
...

You can also use the JMX com.sun.management:type=DiagnosticCommand MBean and its vmNativeMemory operation to generate these reports.
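For example, a minimal sketch of invoking that operation in-process (assuming NMT is enabled and the DiagnosticCommand MBean is registered on the platform MBeanServer, as it is on HotSpot JVMs):

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class NativeMemoryReport {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.sun.management:type=DiagnosticCommand");
        // Roughly equivalent to "jcmd <pid> VM.native_memory summary" for the current JVM
        String report = (String) server.invoke(
                name,
                "vmNativeMemory",
                new Object[] { new String[] { "summary" } },
                new String[] { String[].class.getName() });
        System.out.println(report);
    }
}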

And... You can go deeper and inspect pmap -x <pid> and/or procfs content.




Answer 2:


We finally seem to have identified the root cause of the problem we had. This answer describes what specifically caused that problem, as it may well be of use to others.

TLDR:

The problem was caused by a bug in the JDK which is now fixed and will be released with JDK 8u152. Link to the bug report

The whole story:

We continued to monitor our application's memory performance after I first posted this question, and -XX:NativeMemoryTracking, suggested by vsminkov, helped greatly with narrowing down and pinpointing the area of memory that was leaking.

What we found was that the "Thread - Arenas" area was growing indefinitely. As this kind of leak was something we were pretty sure we hadn't experienced earlier, we started testing with earlier versions of Java to see whether it had been introduced at some specific point.

After going back to Java 8u73 the leak wasn't there, and although being forced to use an older JDK version wasn't optimal, at least we had a way to work around the problem for the time being.

Some weeks later, while running on update 73, we noticed that the application was still leaking, and once again we started searching for a culprit. We found that the problem was now located in the "Class - malloc" area.

At this point we were almost certain the leak was not our application's fault, and we were considering contacting Oracle to report the issue as a potential bug, but then a colleague of mine stumbled across this bug report on the JDK HotSpot compiler: Link to the bug report

The bug description is very similar to what we saw. According to what is written in the report, the memory leak has been present since the Java 8 release, and after testing with an early release of JDK 8u152 we are now fairly certain the leak is fixed. After 5 days of running, our application's memory footprint now seems close to 100% stable. The Class malloc area still grows slightly, but it's now going up at a rate of about 100 KB a day (compared to several MB earlier), and having tested for only 5 days I can't say for sure it won't stabilize eventually.

I can't say for certain, but it seems likely that the issues we had with the Class malloc and Thread arenas growing were related. At any rate, both problems are gone in update 152. Unfortunately, the update doesn't seem to be scheduled for official release until late 2017, but our testing with the early release seems promising so far.



Source: https://stackoverflow.com/questions/39117716/java-processs-memory-grows-indefinitely-but-memorymxbean-reports-stable-heap-a
