Our server hung because of a SIGSEGV fault:
A fatal error has been detected by the Java Runtime Environment:
SIGSEGV (0xb) at pc=0x00007ff5c7195aa
There is one tricky situation in JNI code: if the native code blocks the SIGSEGV signal, for example because it blocks all signals (a common approach in threaded C code to ensure that only the main thread processes signals), AND it then calls back into the Java VM (a callback), the result can be a seemingly random SIGSEGV that aborts the process.
And there is virtually nothing wrong: the SIGSEGV is actually triggered by the Java VM itself in order to detect certain conditions in memory (it acts as a memory barrier, etc.), and the VM expects to handle that signal. Unfortunately, when SIGSEGV is blocked, the 'standard' SIGSEGV reaction is triggered instead, so the VM process crashes.
Signal -- Description
SIGSEGV, SIGBUS, SIGFPE, SIGPIPE, SIGILL -- Used in the implementation for implicit null check, and so forth.
SIGQUIT -- Thread dump support: to dump Java stack traces at the standard error stream. (Optional.)
SIGTERM, SIGINT, SIGHUP -- Used to support the shutdown hook mechanism (java.lang.Runtime.addShutdownHook) when the VM is terminated abnormally. (Optional.)
SIGUSR1 -- Used in the implementation of the java.lang.Thread.interrupt method. (Configurable.) Not used starting with Solaris 10 OS. Reserved on Linux.
SIGUSR2 -- Used internally. (Configurable.) Not used starting with Solaris 10 OS.
SIGABRT -- The HotSpot VM does not handle this signal. Instead it calls the abort function after fatal error handling. If an application uses this signal then it should terminate the process to preserve the expected semantics.
The fatal error log indicates that the crash was in a native library, so there might be a bug in native code or JNI library code. The crash could of course be caused by something else, but analysis of the library and any core file or crash dump is a good starting place.
In this case a SIGSEGV occurred with a thread executing in the library libdtagentcore.so. In some cases a bug in a native library manifests itself as a crash in Java VM code, for example when a JavaThread fails while in the _thread_in_vm state (meaning that it is executing in Java VM code).
It is telling you that an error occurred in code loaded from libdtagentcore.so. More specifically, it happened in a function named restrict, at offset 0x506f6. The first offset mentioned (0xb7aaa) is the offset within the library itself. If the library was built with debugging symbols (-g), you can look up the code that caused the exception; on Linux, something along the lines of:
addr2line -e libdtagentcore.so -C -f 0xb7aaa
In case this is read by someone on Windows, see https://community.oracle.com/blogs/kohsuke/2009/02/19/crash-course-jvm-crash-analysis
More details in https://www.youtube.com/watch?v=jd6dJa7tSNU