For some days I have searched for a working solution to this error:
Error: EMFILE, too many open files
It seems that many people have the same problem. The usual answer involves increasing the number of file descriptors. So, I've tried this:
sysctl -w kern.maxfiles=20480
The default value is 10240. This is a little strange in my eyes, because the number of files I'm handling in the directory is under 10240. Even stranger, I still receive the same error after I've increased the number of file descriptors.
Second question:
After a number of searches I found a workaround for the "too many open files" problem:
var requestBatches = {};
function batchingReadFile(filename, callback) {
// First check to see if there is already a batch
if (requestBatches.hasOwnProperty(filename)) {
requestBatches[filename].push(callback);
return;
}
// Otherwise start a new one and make a real request
var batch = requestBatches[filename] = [callback];
FS.readFile(filename, onRealRead);
// Flush out the batch on complete
function onRealRead() {
delete requestBatches[filename];
for (var i = 0, l = batch.length; i < l; i++) {
batch[i].apply(null, arguments);
}
}
}
function printFile(file){
console.log(file);
}
var dir = "/Users/xaver/Downloads/xaver/xxx/xxx/";
var files = fs.readdirSync(dir);
for (var i = 0; i < files.length; i++) {
  var filename = dir + files[i];
  console.log(filename);
  batchingReadFile(filename, printFile);
}
Unfortunately I still receive the same error. What is wrong with this code?
One last question (I'm new to JavaScript and Node): I'm in the process of developing a web application with a lot of requests for about 5000 daily users. I have many years of experience programming in other languages like Python and Java, so originally I planned to develop this application with Django or the Play framework. Then I discovered Node, and I must say that the idea of a non-blocking I/O model is really nice, seductive, and most of all very fast!
But what kind of problems should I expect with node? Is it a production proven web server? What are your experiences?
For when graceful-fs doesn't work... or you just want to understand where the leak is coming from. Follow this process.
(e.g. graceful-fs isn't gonna fix your wagon if your issue is with sockets.)
From My Blog Article: http://www.blakerobertson.com/devlog/2014/1/11/how-to-determine-whats-causing-error-connect-emfile-nodejs.html
How To Isolate
This command will output the number of open handles for nodejs processes:
lsof -i -n -P | grep nodejs
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
nodejs 12211 root 1012u IPv4 151317015 0t0 TCP 10.101.42.209:40371->54.236.3.170:80 (ESTABLISHED)
nodejs 12211 root 1013u IPv4 151279902 0t0 TCP 10.101.42.209:43656->54.236.3.172:80 (ESTABLISHED)
nodejs 12211 root 1014u IPv4 151317016 0t0 TCP 10.101.42.209:34450->54.236.3.168:80 (ESTABLISHED)
nodejs 12211 root 1015u IPv4 151289728 0t0 TCP 10.101.42.209:52691->54.236.3.173:80 (ESTABLISHED)
nodejs 12211 root 1016u IPv4 151305607 0t0 TCP 10.101.42.209:47707->54.236.3.172:80 (ESTABLISHED)
nodejs 12211 root 1017u IPv4 151289730 0t0 TCP 10.101.42.209:45423->54.236.3.171:80 (ESTABLISHED)
nodejs 12211 root 1018u IPv4 151289731 0t0 TCP 10.101.42.209:36090->54.236.3.170:80 (ESTABLISHED)
nodejs 12211 root 1019u IPv4 151314874 0t0 TCP 10.101.42.209:49176->54.236.3.172:80 (ESTABLISHED)
nodejs 12211 root 1020u IPv4 151289768 0t0 TCP 10.101.42.209:45427->54.236.3.171:80 (ESTABLISHED)
nodejs 12211 root 1021u IPv4 151289769 0t0 TCP 10.101.42.209:36094->54.236.3.170:80 (ESTABLISHED)
nodejs 12211 root 1022u IPv4 151279903 0t0 TCP 10.101.42.209:43836->54.236.3.171:80 (ESTABLISHED)
nodejs 12211 root 1023u IPv4 151281403 0t0 TCP 10.101.42.209:43930->54.236.3.172:80 (ESTABLISHED)
....
Notice the: 1023u (last line) - that's the 1024th file handle which is the default maximum.
Now, look at the last column. That indicates which resource is open. You'll probably see a number of lines all with the same resource name. Hopefully, that now tells you where to look in your code for the leak.
If you run multiple node processes, first look up which process has pid 12211. That'll tell you which process is leaking.
In my case above, I noticed that there were a bunch of very similar IP addresses. They were all 54.236.3.###
By doing IP address lookups, I was able to determine that in my case it was PubNub related.
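Spotting the repeated address by eye gets tedious with a thousand rows, so here is a minimal sketch that tallies connections per remote host. The helper name countByRemoteHost is hypothetical (it is not part of the original answer), and the sample text is a shortened copy of the lsof output shown above.

```javascript
// Hypothetical helper (not from the original answer): tally connections in
// `lsof -i -n -P` output by remote IP address.
function countByRemoteHost(lsofOutput) {
  var counts = {};
  lsofOutput.split('\n').forEach(function (line) {
    // The NAME column looks like "local:port->remote:port (ESTABLISHED)".
    var match = /->([^\s]+):(\d+)/.exec(line);
    if (!match) {
      return; // header line, listening socket, or unrelated row
    }
    var host = match[1];
    counts[host] = (counts[host] || 0) + 1;
  });
  return counts;
}

// Shortened sample taken from the lsof listing above.
var sample = [
  'COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME',
  'nodejs 12211 root 1012u IPv4 151317015 0t0 TCP 10.101.42.209:40371->54.236.3.170:80 (ESTABLISHED)',
  'nodejs 12211 root 1013u IPv4 151279902 0t0 TCP 10.101.42.209:43656->54.236.3.172:80 (ESTABLISHED)',
  'nodejs 12211 root 1018u IPv4 151289731 0t0 TCP 10.101.42.209:36090->54.236.3.170:80 (ESTABLISHED)'
].join('\n');

console.log(countByRemoteHost(sample));
```

In practice you would feed it the real lsof text instead of the sample; the host with the largest count is usually the place to start looking.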
Command Reference
Use this syntax to determine how many open handles a process has open...
To get a count of open files for a certain pid
I used this command to test the number of files that were opened after doing various events in my app.
lsof -i -n -P | grep "8465" | wc -l
# lsof -i -n -P | grep "nodejs.*8465" | wc -l
28
# lsof -i -n -P | grep "nodejs.*8465" | wc -l
31
# lsof -i -n -P | grep "nodejs.*8465" | wc -l
34
What is your process limit?
ulimit -a
The line you want will look like this:
open files (-n) 1024
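If you would rather check the limit from inside a script, one possible sketch is to ask a shell for it. The helper name getOpenFileLimit is hypothetical, and this assumes a POSIX shell is available. The value is what the spawned shell inherits from the node process, which may differ from what your login shell reports.

```javascript
var childProcess = require('child_process');

// Hypothetical helper: ask a shell for the soft "open files" limit that a
// child of this process inherits. Assumes a POSIX shell; returns null if the
// command is unavailable (for example on Windows).
function getOpenFileLimit() {
  try {
    // execSync runs the command through a shell, where ulimit is a builtin.
    return childProcess.execSync('ulimit -n', { encoding: 'utf8' }).trim();
  } catch (err) {
    return null;
  }
}

console.log('open files limit:', getOpenFileLimit());
```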
Permanently change the limit:
- tested on Ubuntu 14.04, nodejs v. 7.9
If you expect to open many connections (websockets is a good example), you can permanently increase the limit:
file: /etc/pam.d/common-session (add to the end)
session required pam_limits.so
file: /etc/security/limits.conf (add to the end, or edit if already exists)
root soft nofile 40000
root hard nofile 100000
restart your nodejs and logout/login from ssh.
- this may not work for older Node.js versions; in that case you'll need to restart the server
- use the name of the user that runs node instead of root if your node runs with a different uid.
Using the graceful-fs module by Isaac Schlueter (node.js maintainer) is probably the most appropriate solution. It does incremental back-off if EMFILE is encountered. It can be used as a drop-in replacement for the built-in fs module.
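To make the back-off idea concrete, here is a minimal sketch of retrying a read when EMFILE comes back. This is not graceful-fs's actual implementation. The helper name readFileWithBackoff, the doubling delay, and the 1000 ms cap are all assumptions chosen for illustration.

```javascript
var fs = require('fs');

// Hypothetical helper: retry a read when the OS reports EMFILE, waiting a
// little longer before each attempt. `readFn` defaults to fs.readFile and is
// injectable only so the sketch can be exercised without exhausting real
// descriptors.
function readFileWithBackoff(filename, callback, readFn, delayMs) {
  readFn = readFn || fs.readFile;
  delayMs = delayMs || 1;
  readFn(filename, function (err, data) {
    if (err && err.code === 'EMFILE') {
      // Too many descriptors are open right now: try again later,
      // doubling the wait up to an assumed cap of one second.
      setTimeout(function () {
        readFileWithBackoff(filename, callback, readFn, Math.min(delayMs * 2, 1000));
      }, delayMs);
      return;
    }
    // Success, or an error other than EMFILE: hand it to the caller.
    callback(err, data);
  });
}
```

You would call it like fs.readFile with a filename and a callback. Errors other than EMFILE are passed straight through, so a missing file still fails immediately.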
I ran into this problem today, and finding no good solutions for it, I created a module to address it. I was inspired by @fbartho's snippet, but wanted to avoid overwriting the fs module.
The module I wrote is Filequeue, and you use it just like fs:
var Filequeue = require('filequeue');
var fq = new Filequeue(200); // max number of files to open at once
fq.readdir('/Users/xaver/Downloads/xaver/xxx/xxx/', function(err, files) {
if(err) {
throw err;
}
files.forEach(function(file) {
fq.readFile('/Users/xaver/Downloads/xaver/xxx/xxx/' + file, function(err, data) {
// do something here
});
});
});
You're reading too many files. Node reads files asynchronously, so it will be reading all the files at once. So you're probably reaching the 10240 limit.
See if this works:
var fs = require('fs')
var events = require('events')
var util = require('util')
var path = require('path')
var FsPool = module.exports = function(dir) {
events.EventEmitter.call(this)
this.dir = dir;
this.files = [];
this.active = [];
this.threads = 1;
this.on('run', this.runQuta.bind(this))
};
// So will act like an event emitter
util.inherits(FsPool, events.EventEmitter);
FsPool.prototype.runQuta = function() {
if(this.files.length === 0 && this.active.length === 0) {
return this.emit('done');
}
if(this.active.length < this.threads) {
var name = this.files.shift()
this.active.push(name)
var fileName = path.join(this.dir, name);
var self = this;
fs.stat(fileName, function(err, stats) {
if(err)
throw err;
if(stats.isFile()) {
fs.readFile(fileName, function(err, data) {
if(err)
throw err;
self.active.splice(self.active.indexOf(name), 1)
self.emit('file', name, data);
self.emit('run');
});
} else {
self.active.splice(self.active.indexOf(name), 1)
self.emit('dir', name);
self.emit('run');
}
});
}
return this
};
FsPool.prototype.init = function() {
var dir = this.dir;
var self = this;
fs.readdir(dir, function(err, files) {
if(err)
throw err;
self.files = files
self.emit('run');
})
return this
};
var fsPool = new FsPool(__dirname)
fsPool.on('file', function(fileName, fileData) {
console.log('file name: ' + fileName)
console.log('file data: ', fileData.toString('utf8'))
})
fsPool.on('dir', function(dirName) {
console.log('dir name: ' + dirName)
})
fsPool.on('done', function() {
console.log('done')
});
fsPool.init()
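The same idea of capping how many reads are in flight can also be sketched with promises instead of an event emitter. This is an alternative illustration, not the answer's method. The names mapWithLimit and readDirLimited are hypothetical, and the sketch assumes a Node.js version that provides fs.promises.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: run `worker(item)` for every item, keeping at most
// `limit` calls in flight. Resolves with the results in input order.
function mapWithLimit(items, limit, worker) {
  var results = new Array(items.length);
  var next = 0;

  function runOne() {
    if (next >= items.length) {
      return Promise.resolve();
    }
    var index = next++;
    return new Promise(function (resolve) {
      resolve(worker(items[index]));
    }).then(function (value) {
      results[index] = value;
      return runOne(); // pick up the next item once this one is finished
    });
  }

  // Start up to `limit` lanes; each lane processes items one after another.
  var lanes = [];
  for (var i = 0; i < Math.min(limit, items.length); i++) {
    lanes.push(runOne());
  }
  return Promise.all(lanes).then(function () {
    return results;
  });
}

// Read every regular file in `dir`, at most `limit` at a time.
function readDirLimited(dir, limit) {
  return fs.promises.readdir(dir, { withFileTypes: true }).then(function (entries) {
    var names = entries
      .filter(function (entry) { return entry.isFile(); })
      .map(function (entry) { return path.join(dir, entry.name); });
    return mapWithLimit(names, limit, function (name) {
      return fs.promises.readFile(name);
    });
  });
}
```

Calling readDirLimited(someDir, 10) resolves with an array of buffers. If any read fails, the returned promise rejects and no further files are started in that lane.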
I just finished writing a little snippet of code to solve this problem myself. All of the other solutions appear way too heavyweight and require you to change your program structure.
This solution just stalls any fs.readFile or fs.writeFile calls so that there are no more than a set number in flight at any given time.
// Queuing reads and writes, so your nodejs script doesn't overwhelm system limits catastrophically
global.maxFilesInFlight = 100; // Set this value to some number safeish for your system
var fs = require('fs');
var origRead = fs.readFile;
var origWrite = fs.writeFile;
var activeCount = 0;
var pending = [];
var wrapCallback = function(cb){
return function(){
activeCount--;
cb.apply(this,Array.prototype.slice.call(arguments));
if (activeCount < global.maxFilesInFlight && pending.length){
console.log("Processing Pending read/write");
pending.shift()();
}
};
};
fs.readFile = function(){
var args = Array.prototype.slice.call(arguments);
if (activeCount < global.maxFilesInFlight){
if (args[1] instanceof Function){
args[1] = wrapCallback(args[1]);
} else if (args[2] instanceof Function) {
args[2] = wrapCallback(args[2]);
}
activeCount++;
origRead.apply(fs,args);
} else {
console.log("Delaying read:",args[0]);
pending.push(function(){
fs.readFile.apply(fs,args);
});
}
};
fs.writeFile = function(){
var args = Array.prototype.slice.call(arguments);
if (activeCount < global.maxFilesInFlight){
if (args[1] instanceof Function){
args[1] = wrapCallback(args[1]);
} else if (args[2] instanceof Function) {
args[2] = wrapCallback(args[2]);
}
activeCount++;
origWrite.apply(fs,args);
} else {
console.log("Delaying write:",args[0]);
pending.push(function(){
fs.writeFile.apply(fs,args);
});
}
};
Like all of us, you are another victim of asynchronous I/O. With asynchronous calls, if you loop over a lot of files, Node.js opens a file descriptor for each file to read and then waits until each one is closed.
A file descriptor remains open until resources are available on your server to read the file. Even if your files are small and reading or updating them is fast, it still takes some time, and in the meantime your loop keeps opening new file descriptors. So if you have too many files, the limit is soon reached and you get a beautiful EMFILE.
One solution is to create a queue to avoid this effect.
Thanks to the people who wrote Async, there is a very useful function for that: async.queue. You create a new queue with a concurrency limit and then add filenames to the queue.
Note: If you have to open many files, it is a good idea to keep track of which files are currently open so you don't reopen them indefinitely.
const fs = require('fs')
const async = require("async")
var q = async.queue(function(task, callback) {
console.log(task.filename);
fs.readFile(task.filename,"utf-8",function (err, data_read) {
callback(err,task.filename,data_read);
}
);
}, 4);
var files = [1,2,3,4,5,6,7,8,9,10]
for (var file in files) {
q.push({filename:file+".txt"}, function (err,filename,res) {
console.log(filename + " read");
});
}
You can see that the queue starts each file (the console.log of the filename) only when the number of running tasks is under the limit you set previously.
async.queue learns that a slot is free through a callback. This callback is called only once the file has been read and any work you need to do with it is finished (see the fs.readFile call).
So you cannot be overwhelmed by file descriptors.
> node ./queue.js
0.txt
1.txt
2.txt
0.txt read
3.txt
3.txt read
4.txt
2.txt read
5.txt
4.txt read
6.txt
5.txt read
7.txt
1.txt read (bigger file than the others)
8.txt
6.txt read
9.txt
7.txt read
8.txt read
9.txt read
I am not sure whether this will help anyone, but I started working on a big project with a lot of dependencies, which threw the same error. My colleague suggested that I install watchman using brew, and that fixed the problem for me.
brew update
brew install watchman
Edit on 26 June 2019: Github link to watchman
With bagpipe, you just need to change
FS.readFile(filename, onRealRead);
=>
var bagpipe = new Bagpipe(10);
bagpipe.push(FS.readFile, filename, onRealRead);
Bagpipe helps you limit the number of parallel calls. More details: https://github.com/JacksonTian/bagpipe
I had the same problem when running the nodemon command, so I reduced the number of files open in Sublime Text and the error disappeared.
cwait is a general solution for limiting concurrent executions of any functions that return promises.
In your case the code could be something like:
var Promise = require('bluebird');
var cwait = require('cwait');
// Allow max. 10 concurrent file reads.
var queue = new cwait.TaskQueue(Promise, 10);
var read = queue.wrap(Promise.promisify(batchingReadFile));
Promise.map(files, function(filename) {
console.log(filename);
return(read(filename));
})
Building on @blak3r's answer, here's a bit of shorthand I use, in case it helps others diagnose:
If you're trying to debug a Node.js script that is running out of file descriptors, here's a line that gives you the output of lsof for the node process in question:
openFiles = child_process.execSync(`lsof -p ${process.pid}`);
This synchronously runs lsof filtered by the currently running Node.js process and returns the results as a buffer.
Then use console.log(openFiles.toString()) to convert the buffer to a string and log the results.
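Where lsof is not installed, a rough alternative on Linux is to list the process's own descriptor directory. This is a sketch under that assumption. The helper name countOpenFds is hypothetical, and /proc/self/fd does not exist on macOS or Windows, in which case the function returns null.

```javascript
var fs = require('fs');

// Hypothetical helper: count this process's open descriptors by listing
// /proc/self/fd. Assumes Linux; returns null where that directory does not
// exist (macOS, Windows).
function countOpenFds() {
  try {
    // Listing the directory briefly opens a descriptor of its own, so the
    // count may be one higher than the steady-state value.
    return fs.readdirSync('/proc/self/fd').length;
  } catch (err) {
    return null;
  }
}

console.log('open descriptors:', countOpenFds());
```

Logging this value before and after a suspect operation shows whether the count keeps growing, which is the same check as the repeated lsof counts earlier in the thread.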
Source: https://stackoverflow.com/questions/8965606/node-and-error-emfile-too-many-open-files