Getting the Java thread id and stack trace of run-away Java thread

Submitted by 本小妞迷上赌 on 2019-11-27 13:00:59

Question


On my busiest production installation, I occasionally get a single thread that seems to be stuck in an infinite loop. Despite much research and debugging, I have not managed to identify the culprit, though it seems like it should be possible. Here are the gory details:

Current debugging notes:

1) ps -eL | grep 18975 shows me the Linux thread (LWP) id of the problem child thread, 19269

$ ps -eL | grep 18975
...
PID   LWP   TTY          TIME CMD
18975 18994 ?        00:00:05 java
18975 19268 ?        00:00:00 java
18975 19269 ?        05:16:49 java
18975 19271 ?        00:01:22 java
18975 19273 ?        00:00:00 java
...

2) jstack -l 18975 says there are no deadlocks; jstack -m 18975 does not work.

3) jstack -l 18975 does give me the stack trace for all my threads (~400). Example thread stack (not the problem thread):

"http-342.877.573.944-8080-360" daemon prio=10 tid=0x0000002adaba9c00 nid=0x754c in Object.wait() [0x00000000595bc000..0x00000000595bccb0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on  (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:416)
        - locked  (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:442)
        at java.lang.Thread.run(Thread.java:619)

4) The ps -eL output's thread ID does not match the output from jstack, or at least I cannot see it. (jstack documentation is a bit sparse.)

5) There is no heavy I/O, memory usage, or other corresponding activity to give me a clue.

Platform:

  • Java 6
  • Tomcat 6
  • RHEL 4 (64-bit)

Does anybody know how I can make the connection from the Linux ps output to my problem child Java thread? So close, yet so far...


Answer 1:


It looks like the nid in the jstack output is the Linux LWP id.

"http-342.877.573.944-8080-360" daemon prio=10 tid=0x0000002adaba9c00 nid=0x754c in Object.wait() [0x00000000595bc000..0x00000000595bccb0]

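Convert the nid to decimal and you have the LWP id. In your case 0x754c is 30028. That LWP is not shown in your ps output, but it was probably one of the lines you omitted to save space. Conversely, the busy LWP 19269 is 0x4b45 in hex, so the thread to look for in the jstack output is the one with nid=0x4b45.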

Here's a little Perl snippet you can pipe the output of jstack through:

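#!/usr/bin/perl
use strict;
use warnings;

# Reads jstack output on stdin and adds the decimal LWP id next to each
# hex nid, so the lines can be matched against the LWP column of ps -eL.
while (<>) {
    if (/nid=(0x[[:xdigit:]]+)/) {
        my $lwp = hex($1);        # e.g. 0x754c -> 30028
        s/nid=/lwp=$lwp nid=/;
    }
    print;
}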
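The reverse lookup can also be scripted. The Java sketch below starts from an LWP id as reported by ps (19269 in the question), formats it the way jstack prints a nid, and pulls the matching thread entry out of a saved dump. The class name `NidLookup` and its method names are made up for illustration, and the code assumes the usual jstack layout in which thread entries are separated by blank lines.

```java
import java.util.regex.Pattern;

class NidLookup {

    /** Formats a decimal LWP id the way jstack prints it, e.g. 30028 -> "nid=0x754c". */
    static String lwpToNid(long lwp) {
        return "nid=0x" + Long.toHexString(lwp);
    }

    /** Parses the hex value of a jstack nid, e.g. "0x754c" -> 30028. */
    static long nidToLwp(String hexNid) {
        return Long.decode(hexNid);
    }

    /**
     * Returns the dump entry (header line plus stack frames) whose header
     * carries the nid of the given LWP, or null if no entry matches.
     * Assumption: entries in the dump are separated by blank lines.
     */
    static String findThread(String dump, long lwp) {
        // Word boundaries keep 0x4b45 from matching a longer id such as 0x4b45a.
        Pattern nid = Pattern.compile("\\b" + Pattern.quote(lwpToNid(lwp)) + "\\b");
        for (String entry : dump.split("\\n\\s*\\n")) {
            if (nid.matcher(entry).find()) {
                return entry.trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The busy LWP from the ps output in the question.
        System.out.println(lwpToNid(19269)); // prints nid=0x4b45
    }
}
```

With the full jstack output read into a string, `findThread(dump, 19269)` would return the stack of the thread that ps shows burning CPU, provided the dump was taken from the same process.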



Answer 2:


You can use JConsole to view the thread's stack trace.

If you're using JDK 1.6.0_07 or above, you can also use VisualVM.

Both tools provide a nice view of all the running threads in an application. VisualVM is quite a bit nicer, but either way, seeing all the threads should help you track down the run-away thread.

Check for threads that are always in the RUNNABLE state. When we had a run-away thread, its stack trace changed constantly from one sample to the next, so we could tell which methods the loop was calling and track it down.
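The same check can be scripted from inside the JVM with the standard java.lang.management API. The sketch below ranks RUNNABLE threads by accumulated CPU time and prints the top of each stack. The class name `HotThreads` and the frame limit are illustrative assumptions, and the code would have to run inside the affected JVM (for example from a diagnostic page), which is not something the answer above describes. The ids it reports are Java thread ids, not the nid/LWP values shown by jstack and ps.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class HotThreads {

    /** One sampled thread: its accumulated CPU time and a snapshot of its state. */
    private static final class Row {
        final long cpuNanos;
        final ThreadInfo info;
        Row(long cpuNanos, ThreadInfo info) { this.cpuNanos = cpuNanos; this.info = info; }
    }

    /** Returns up to `limit` RUNNABLE threads, highest CPU time first, with up to `maxFrames` frames each. */
    static List<String> busiest(int limit, int maxFrames) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<String>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return report; // CPU accounting is unavailable on this JVM
        }
        List<Row> rows = new ArrayList<Row>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds(), maxFrames)) {
            // Entries are null for threads that died between the two calls.
            if (info == null || info.getThreadState() != Thread.State.RUNNABLE) continue;
            long cpu = mx.getThreadCpuTime(info.getThreadId());
            if (cpu >= 0) rows.add(new Row(cpu, info));
        }
        // Threads waiting in native I/O are also RUNNABLE, so sort by CPU time
        // to bring a spinning thread to the top.
        Collections.sort(rows, new Comparator<Row>() {
            public int compare(Row a, Row b) {
                return a.cpuNanos < b.cpuNanos ? 1 : (a.cpuNanos > b.cpuNanos ? -1 : 0);
            }
        });
        for (int i = 0; i < Math.min(limit, rows.size()); i++) {
            Row r = rows.get(i);
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(r.info.getThreadName()).append("\" id=")
              .append(r.info.getThreadId())
              .append(" cpu=").append(r.cpuNanos / 1000000L).append("ms");
            for (StackTraceElement frame : r.info.getStackTrace()) {
                sb.append("\n        at ").append(frame);
            }
            report.add(sb.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        for (String entry : busiest(5, 8)) System.out.println(entry);
    }
}
```

Calling `busiest` a few seconds apart and comparing the output is the scripted equivalent of watching the stack change in JConsole: a thread whose CPU time keeps climbing while its frames move around inside the same few methods is the likely loop.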




Answer 3:


Nice, useful answers!

On Linux, use ps -efL; the -L option shows the LWPs. As a side note, the header
"http-342.877.573.944-8080-360" daemon prio=10 reads as: the thread name (as set by the JVM or the application), the daemon flag, and the Java thread priority. By default a new thread inherits both the daemon flag and the priority from the thread that created it.




Answer 4:


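From memory, if you press CTRL-BREAK on the console (that is the Windows key combination; on Linux, Ctrl-\ or kill -QUIT <pid> does the same), the JVM prints a dump of the current threads and their stack traces to its standard output.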

From memory (I'm not sure whether this is an IntelliJ IDEA feature or default Java behavior), the dump also tells you which threads are deadlocked and which objects they are waiting on. You should be able to redirect the output to a file and just grep it for "deadlock".

JConsole, VisualVM, or other profilers such as JProfiler will also show you the threads and their stacks; however, if you don't want to use any external tool, I think CTRL-BREAK will give you what you're looking for.
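The deadlock check that the thread dump performs is also available programmatically through ThreadMXBean. The following is a minimal sketch, assuming it runs inside the JVM being examined; the class name `DeadlockCheck` is made up for illustration. For the case in the question it would be expected to report nothing, since jstack already found no deadlocks and the problem thread is spinning rather than blocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class DeadlockCheck {

    /** Returns one line per deadlocked thread; the list is empty when no deadlock is found. */
    static List<String> deadlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks but is
        // not supported on every JVM; fall back to the monitor-only check.
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        List<String> out = new ArrayList<String>();
        if (ids == null) {
            return out; // both methods return null when nothing is deadlocked
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) continue; // thread ended in the meantime
            out.add("\"" + info.getThreadName() + "\" waiting on " + info.getLockName()
                    + " held by \"" + info.getLockOwnerName() + "\"");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> found = deadlockedThreads();
        if (found.isEmpty()) {
            System.out.println("No deadlocked threads found.");
        }
        for (String line : found) {
            System.out.println(line);
        }
    }
}
```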




Answer 5:


On Solaris (Sun):

Note that prstat by default shows the number of lightweight processes (NLWP) in each process, not the LWPID.

To see information for all the lightweight processes of a particular user, use the -L option.

prstat -L -v -u weblogic

Now take the LWPID, convert it to hex, and match it against the nid in the thread dump.



Source: https://stackoverflow.com/questions/222108/getting-the-java-thread-id-and-stack-trace-of-run-away-java-thread