How to detect the cause of an OutOfMemoryError?

慢半拍i 2021-02-08 11:44

I have received a complaint that a server application of mine crashes under high load.
It is a web app running in Tomcat 5.
I see the thread dumps and I see that th

5 Answers
  • 2021-02-08 11:58

    Fix

    I followed the IBM technote "java.lang.OutOfMemoryError while creating new threads", specifically using the 'ulimit' command to raise the limit from its default of 1024.

    Symptom

    [2/25/15 12:47:34:629 EST] 00000049 SystemErr     R java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
    [2/25/15 12:47:34:630 EST] 00000049 SystemErr     R     at java.lang.Thread.startImpl(Native Method)
    [2/25/15 12:47:34:630 EST] 00000049 SystemErr     R     at java.lang.Thread.start(Thread.java:936)
    [2/25/15 12:47:34:630 EST] 00000049 SystemErr     R     at org.eclipse.osgi.framework.internal.core.InternalSystemBundle.stop(InternalSystemBundle.java:251)
    [2/25/15 12:47:34:630 EST] 00000049 SystemErr     R     at com.ibm.ws.runtime.component.RuntimeBundleActivator.shutdownEclipse(RuntimeBundleActivator.java:54)
    [2/25/15 12:47:34:630 EST] 00000049 SystemErr     R     at com.ibm.ws.runtime.component.ServerCollaborator$ShutdownHook$1.run(ServerCollaborator.java:878)
    [2/25/15 12:47:34:630 EST] 00000049 SystemErr     R     at com.ibm.ws.security.auth.ContextManagerImpl.runAs(ContextManagerImpl.java:5459)
    [2/25/15 12:47:34:631 EST] 00000049 SystemErr     R     at com.ibm.ws.security.auth.ContextManagerImpl.runAsSystem(ContextManagerImpl.java:5585)
    [2/25/15 12:47:34:631 EST] 00000049 SystemErr     R     at com.ibm.ws.runtime.component.ServerCollaborator$ShutdownHook.run(ServerCollaborator.java:850)
    [2/25/15 12:47:34:631 EST] 00000049 SystemErr     R     at com.ibm.ws.runtime.component.ServerCollaborator$StopAction.alarm(ServerCollaborator.java:809)
    [2/25/15 12:47:34:631 EST] 00000049 SystemErr     R     at com.ibm.ejs.util.am._Alarm.run(_Alarm.java:133)
    [2/25/15 12:47:34:631 EST] 00000049 SystemErr     R     at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1815)
    

    Environment

    CentOS 6.6 (64-bit), IBM WAS 8.5.0.2 (64-bit)

    References

    • Guidelines for setting ulimits (WebSphere Application Server)
  • 2021-02-08 11:59

    According to this post:

    There are two possible causes of the java.lang.OutOfMemoryError: Failed to create a thread message:

    • There are too many threads running and the system has run out of internal resources to create new threads.
    • The system has run out of native memory to use for the new thread. Threads require a native memory for internal JVM structures, a Java™ stack, and a native stack.

    So this error may well be unrelated to heap memory; it may simply mean that too many threads are being created.

    EDIT:

    As you've got 695 threads, you would need 695 times the stack size as memory. Considering this post on thread limits, I suspect that you are trying to create too many threads for the available virtual memory space.
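
To put a rough number on the "695 times the stack size" estimate, here is a minimal sketch. The thread count comes from the question; the stack sizes are assumptions, since the real value depends on the platform default and on any -Xss setting. The class and method names are made up for illustration.

```java
// Rough estimate of the address space reserved for thread stacks.
// Assumption: every thread uses the same stack size (the real value
// depends on the platform default and on -Xss).
final class ThreadStackEstimate {

    // Total bytes reserved for stacks: threads * stack size per thread.
    static long stackBytes(int threads, long stackSizeBytes) {
        return Math.multiplyExact((long) threads, stackSizeBytes);
    }

    public static void main(String[] args) {
        int threads = 695;                      // thread count reported in the question
        long[] stackSizesKb = {256, 512, 1024}; // assumed candidate stack sizes

        for (long kb : stackSizesKb) {
            double mb = stackBytes(threads, kb * 1024) / (1024.0 * 1024.0);
            System.out.printf("%d threads x %d KB stack = %.1f MB%n", threads, kb, mb);
        }
    }
}
```

This only counts stacks. Each thread also needs native memory for internal JVM structures, as the quoted list says, so treat the result as a lower bound.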

  • 2021-02-08 12:13

    A similar message on IBM WebSphere shows this line

    "Failed to create a thread: retVal"

    as indicative of a native OOM, which means the process ran out of native memory (memory outside the Java heap) while trying to create a thread.

    The IBM link above has a series of steps - some of which are IBM specific. Have a look.

    Things to check from a native memory usage perspective:

    • Maximum Java heap settings
    • JDBC driver
    • JNI code or Native libraries
    • Garbage collection of unused classes. Ensure that -Xnoclassgc is not set.
    • Thread Pool settings (fixed size thread pools)
    • Too many classloaders etc, but these are not very common.
    • Number of classes/classloaders from javacores.

    Another thing you could look at is the PermGen space: how large is it?

    This link http://www.codingthearchitecture.com/2008/01/14/jvm_lies_the_outofmemory_myth.html suggests

    Increasing the heap allocation actually exacerbates this problem! It decreases the headroom the compiler, and other native components, have to play with. So the solution to my problem was: 1. reduce the heap allocated to the JVM. 2. remove the memory leaks caused by native objects not being freed in a timely fashion.

    Also, have you configured a value for maxThreads in server.xml? The default is 200, but your app seems to have 695 threads.
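
To see where the 695 threads come from, one option is to group the live threads by name. The sketch below assumes that pooled threads share a name prefix followed by a number (something like "http-8080-12"; the exact naming in your Tomcat version is an assumption). It has to run inside the Tomcat JVM, for example from a diagnostic servlet or JSP, because run standalone it only sees its own threads. The class and method names are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Counts live threads in the current JVM, grouped by name prefix.
final class ThreadCensus {

    // Strip a trailing number (and any separator before it) so that
    // threads from the same pool fall into one group.
    static String groupKey(String threadName) {
        return threadName.replaceAll("[-#\\s]*\\d+$", "");
    }

    // Map from group name to number of live threads in that group.
    static Map<String, Integer> countByGroup() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(groupKey(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int total = 0;
        for (Map.Entry<String, Integer> e : countByGroup().entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
            total += e.getValue();
        }
        System.out.println(total + "\ttotal");
    }
}
```

If one group is far above maxThreads, those threads are probably created by the application (or a library) rather than by the connector's pool.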

  • 2021-02-08 12:15

    You should start the JVM with the -XX:+HeapDumpOnOutOfMemoryError flag. This will produce a heap dump when the OutOfMemoryError is generated.
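
To confirm that the flag actually reached the JVM, a small check like the following may help. It is only a sketch and assumes a HotSpot-based JVM, since it uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; on other JVMs (IBM J9, for instance) this bean may not be available. The class and method names are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Reads the current value of -XX:+/-HeapDumpOnOutOfMemoryError.
// Assumption: HotSpot JVM.
final class HeapDumpFlagCheck {

    // Returns "true" or "false" as reported by the JVM.
    static String heapDumpOnOomValue() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = " + heapDumpOnOomValue());
    }
}
```

Like any in-process check, it reports on the JVM it runs in, so for Tomcat it would need to be called from inside the webapp.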

    Then, as @Steve said, you can use a tool like MAT to analyze the dump and see which objects are allocated and what is keeping references to them. This will usually give you some insight into why your JVM is exhausting its memory.

  • 2021-02-08 12:18

    I know what you mean, it can be confusing to find somewhere to begin.

    Have a look at Eclipse Memory Analyzer (MAT). It can acquire a heap dump (a memory snapshot) of your running program and save it to a file, which you can reopen and analyze.

    The browser for this file outlines all the objects created by the program very neatly, and you can look into various levels to find if something is suspicious.


    Appending my comments to answer...

    Right when your executable webapp crashes, dump it to MAT. MAT will tell you what object is being created a bunch of times. If it's a custom object, and it often is, it's easy to find. If not, you can see its parent, amputate it, and dribble down from there (sorry for the graphic example, I'm not entirely focused on SO at the moment :).

    Oh, and I forgot to mention, you can run the program several times under several conditions, and make a dump each time. Then, you can analyze each dump for the trend.


    But in my case what should I use? I have a web app running in Tomcat.

    Sorry, missed this too. If I'm not mistaken, MAT dumps the JVM process, so as long as the VM is running on your box you can dump its process and see what's going on.


    Another comment mutated into partial solution...

    This is becoming more difficult than it actually is. Seriously, it's pretty easy, after you run MAT once or twice to get the hang of things. Run your app until the thing crashes. Dump it. Change something. Run, crash, dump. Repeat. Then, open the dumps in MAT, and compare what looks suspicious.

    The trickiest part when I was learning this was finding the process ID to dump - which is still not too mind-numbing.
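
For finding the process ID, here is one possible sketch. It assumes Java 9 or later (for ProcessHandle) and that Tomcat was started through its usual main class, org.apache.catalina.startup.Bootstrap, so that the name shows up in the command line. Command lines of processes owned by other users may not be visible. The class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Lists PIDs of processes whose command line contains a marker string.
final class FindJvmPid {

    // True if the command line is known and contains the marker.
    static boolean matches(Optional<String> commandLine, String marker) {
        return commandLine.map(c -> c.contains(marker)).orElse(false);
    }

    // PIDs of all visible matching processes, excluding this one.
    static List<Long> pidsMatching(String marker) {
        long self = ProcessHandle.current().pid();
        return ProcessHandle.allProcesses()
                .filter(p -> p.pid() != self)
                .filter(p -> matches(p.info().commandLine(), marker))
                .map(ProcessHandle::pid)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Default marker: Tomcat's main class (an assumption about how it was started).
        String marker = args.length > 0 ? args[0] : "org.apache.catalina.startup.Bootstrap";
        System.out.println("Matching PIDs: " + pidsMatching(marker));
    }
}
```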
