Question
I'm trying to debug a file descriptor leak in a Java webapp running in Jetty 7.0.1 on Linux.
The app had been happily running for a month or so when requests started to fail due to too many open files, and Jetty had to be restarted.
java.io.IOException: Cannot run program [external program]: java.io.IOException: error=24, Too many open files
at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
at java.lang.Runtime.exec(Runtime.java:593)
at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:58)
at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:246)
At first I thought the issue was with the code that launches the external program, but it's using commons-exec and I don't see anything wrong with it:
    CommandLine command = new CommandLine("/path/to/command")
        .addArgument("...");
    ByteArrayOutputStream errorBuffer = new ByteArrayOutputStream();
    Executor executor = new DefaultExecutor();
    executor.setWatchdog(new ExecuteWatchdog(PROCESS_TIMEOUT));
    executor.setStreamHandler(new PumpStreamHandler(null, errorBuffer));
    try {
        executor.execute(command);
    } catch (ExecuteException executeException) {
        if (executeException.getExitValue() == EXIT_CODE_TIMEOUT) {
            throw new MyCommandException("timeout");
        } else {
            throw new MyCommandException(errorBuffer.toString("UTF-8"));
        }
    }
Listing open files on the server I can see a high number of FIFOs:
# lsof -u jetty
...
java 524 jetty 218w FIFO 0,6 0t0 19404236 pipe
java 524 jetty 219r FIFO 0,6 0t0 19404008 pipe
java 524 jetty 220r FIFO 0,6 0t0 19404237 pipe
java 524 jetty 222r FIFO 0,6 0t0 19404238 pipe
When Jetty starts there are just 10 FIFOs; after a few days there are hundreds of them.
I know it's a bit vague at this stage, but do you have any suggestions on where to look next, or how to get more detailed info about those file descriptors?
Answer 1:
Your external program is not behaving properly. Investigate why.
Answer 2:
The problem comes from your Java application (or a library you are using).
First, read the subprocess's entire output, and do it promptly (search for StreamGobbler).
Javadoc says:
The parent process uses these streams to feed input to and get output from the subprocess. Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, and even deadlock.
Second, call waitFor() to wait for your process to terminate, then close its input, output and error streams.
Finally, call destroy() on your Process.
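The steps above can be sketched with the plain java.lang.Process API. This is a minimal illustration rather than a drop-in fix for the commons-exec code in the question: the class name ProcessRunner and its run method are hypothetical, Java 8 or later is assumed, and stderr is merged into stdout so that a single reader can drain everything.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: runs a command, drains its output, then releases all pipes.
final class ProcessRunner {

    static String run(String... command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout: one stream to drain
        Process process = builder.start();
        try {
            // We send no input, so close the child's stdin pipe right away.
            process.getOutputStream().close();

            // Step 1: read the entire output so the child never blocks on a full pipe.
            ByteArrayOutputStream captured = new ByteArrayOutputStream();
            InputStream out = process.getInputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = out.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }

            // Step 2: wait for the process to terminate.
            process.waitFor();
            return captured.toString("UTF-8");
        } finally {
            // Step 3: close all three streams, even on failure.
            closeQuietly(process.getInputStream());
            closeQuietly(process.getErrorStream());
            closeQuietly(process.getOutputStream());
            // Step 4: destroy the process (harmless if it has already exited).
            process.destroy();
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails.
        }
    }
}
```

A call such as ProcessRunner.run("/path/to/command", "...") returns the captured output as a string. Doing the cleanup in a finally block means the pipes are released even when reading or waiting throws.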
My sources:
- http://stuffthathappens.com/blog/2007/11/28/crash-boom-too-many-open-files/
- http://www.javaworld.com/javaworld/jw-12-2000/jw-1229-traps.html?page=4
- http://kylecartmell.com/?p=9
Answer 3:
As you are running on Linux I suspect you are running out of file descriptors. Check out ulimit. Here is an article that describes the problem: http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
Answer 4:
Don't know the nature of your app, but I have seen this error manifested multiple times because of a connection pool leak, so that would be worth checking out. On Linux, socket connections consume file descriptors as well as file system files. Just a thought.
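One way to tell a socket leak from a pipe leak is to count the JVM's own descriptors by kind. The sketch below assumes Linux (it reads /proc/self/fd) and Java 8 or later; the class name FdCensus and its methods are hypothetical, and the "pipe:" and "socket:" prefixes are the link targets the Linux kernel reports for those descriptor types.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups this JVM's open file descriptors by kind (Linux only).
final class FdCensus {

    static Map<String, Integer> countByKind() throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc/self/fd"))) {
            for (Path fd : fds) {
                String target;
                try {
                    // Each entry is a symlink such as "pipe:[19404236]" or "socket:[12345]".
                    target = Files.readSymbolicLink(fd).toString();
                } catch (IOException e) {
                    continue; // descriptor was closed between listing and reading
                }
                counts.merge(classify(target), 1, Integer::sum);
            }
        }
        return counts;
    }

    static String classify(String target) {
        if (target.startsWith("socket:")) return "socket";
        if (target.startsWith("pipe:")) return "pipe";
        if (target.startsWith("anon_inode:")) return "anon_inode";
        return "file";
    }
}
```

Logging the result of FdCensus.countByKind() periodically shows which category grows over time. A rising "socket" count would point to a connection leak, while a rising "pipe" count matches the FIFOs in the lsof output above.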
Answer 5:
Aside from looking into root causes such as file leaks, if you have a legitimate need to increase the "open files" limit and want the change to persist across reboots, consider editing
/etc/security/limits.conf
by adding something like this
jetty soft nofile 2048
jetty hard nofile 4096
where "jetty" is the username in this case. For more details on limits.conf, see http://linux.die.net/man/5/limits.conf
Log off, log in again, and run
ulimit -n
to verify that the change has taken place. New processes by this user should now comply with this change. This link seems to describe how to apply the limit on already running processes but I have not tried it.
The default limit of 1024 can be too low for large Java applications.
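The limit can also be checked from inside the JVM, which confirms what the running Jetty process actually received. This is a small sketch, assuming a HotSpot/OpenJDK runtime on a Unix-like system where the com.sun.management.UnixOperatingSystemMXBean interface is available; the class name FdLimitCheck is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper: reports the JVM's open and maximum file descriptor counts.
final class FdLimitCheck {

    static String describe() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return "open=" + unix.getOpenFileDescriptorCount()
                    + " max=" + unix.getMaxFileDescriptorCount();
        }
        // Not a Unix-like platform, or the JVM does not expose this bean.
        return "unavailable";
    }
}
```

Logging FdLimitCheck.describe() at startup and at intervals shows both the effective limit and how close the process is to reaching it.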
Answer 6:
You can handle the fds yourself. The exec in Java returns a Process object. Periodically check whether the process is still running. Once it has completed, close the process's STDERR, STDIN, and STDOUT streams (e.g. proc.getErrorStream().close()). That will mitigate the leaks.
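The polling approach might look like the following sketch. The class name ProcessReaper and the poll interval parameter are hypothetical. Note one assumption: nothing here reads the child's output, so it only suits commands that write very little, otherwise the child can block on a full pipe and never exit.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper: polls a Process until it exits, then closes its three streams.
final class ProcessReaper {

    // exitValue() throws IllegalThreadStateException while the process is still running.
    static boolean hasExited(Process process) {
        try {
            process.exitValue();
            return true;
        } catch (IllegalThreadStateException stillRunning) {
            return false;
        }
    }

    static int awaitAndClose(Process process, long pollMillis) throws InterruptedException {
        try {
            while (!hasExited(process)) {
                Thread.sleep(pollMillis);
            }
            return process.exitValue();
        } finally {
            // Release the pipes whether we finished normally or were interrupted.
            closeQuietly(process.getErrorStream());
            closeQuietly(process.getInputStream());
            closeQuietly(process.getOutputStream());
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close fails.
        }
    }
}
```

Calling ProcessReaper.awaitAndClose(proc, 100) on a Process returned by Runtime.exec or ProcessBuilder.start returns the exit code once the command finishes.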
Answer 7:
This problem occurs when you write to many files simultaneously and your operating system enforces a fixed limit on open files. On Linux, you can increase that limit.
https://www.tecmint.com/increase-set-open-file-limits-in-linux/
How do I change the number of open files limit in Linux?
Source: https://stackoverflow.com/questions/2044672/ioexception-too-many-open-files