Question
I have a ZombieJS Node server on Heroku scraping data from the internet. The server code is called from a for loop on the client side. Each iteration of the loop makes a server call, which triggers a Zombie scrape. Sometimes the server crashes with the error below. It only happens when there is more than one iteration of the for loop.
How can I make the code robust enough to handle multiple simultaneous client calls, each with its own for loop?
Code:
var express = require('express');
var app = express();
var Browser = require('zombie'); // tried changing var to const; no difference
var assert = require('assert');
app.set('port', (process.env.PORT || 5000));
var printMessage = function() { console.log("Node app running on " + app.get('port')); };
var getAbc = function(response, input)
{
    var browser = new Browser();
    browser.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0';
    browser.runScripts = true;
    var url = "http://www.google.com/ncr";
    browser.visit(url, function() {
        browser.fill('q', input).pressButton('Google Search', function(){
            // parsing number of results from browser object
            response.writeHead(200, {'Content-Type': 'text/plain'});
            response.end(numberOfSearchResults);
        });
    });
}
var handleXyz = function(request, response)
{
    getAbc(response, request.query.input);
}
app.listen(app.get('port'), printMessage);
app.post('/xyz', handleXyz);
Error:
assert.js:86
throw new assert.AssertionError({
^
No open window with an HTML document
at Browser.field (/app/node_modules/zombie/lib/index.js:811:7)
at Browser.fill (/app/node_modules/zombie/lib/index.js:903:24)
at /app/cfv1.js:42:11
at done (/app/node_modules/zombie/lib/eventloop.js:589:9)
at timeout (/app/node_modules/zombie/lib/eventloop.js:594:33)
at Timer.listOnTimeout (timers.js:119:15)
I have a similar project using HorsemanJS/PhantomJS which fails in a similar way (I'm stuck on that too!); see "NodeJS server can't handle multiple users".
Answer 1:
In general, I think you should be careful about, or simply avoid, generating a lot of unsolicited requests to remote servers. Many sites will throttle you and/or start rejecting connections. That said, I believe I found the source of the issue in this particular case.
I tested the code snippet, and in this case Google resets the connection if you make too many requests. When the connection is reset, an assertion inside Zombie ends up failing.
The error I get when the connection is reset:
zombie TypeError: read ECONNRESET
at zombie/lib/pipeline.js:89:15
at tryCatcher (zombie/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (zombie/node_modules/bluebird/js/release/promise.js:497:31)
at Promise._settlePromise (zombie/node_modules/bluebird/js/release/promise.js:555:18)
at Promise._settlePromise0 (zombie/node_modules/bluebird/js/release/promise.js:600:10)
at Promise._settlePromises (zombie/node_modules/bluebird/js/release/promise.js:679:18)
at Async._drainQueue (zombie/node_modules/bluebird/js/release/async.js:125:16)
at Async._drainQueues (zombie/node_modules/bluebird/js/release/async.js:135:10)
at Immediate.Async.drainQueues [as _onImmediate] (zombie/node_modules/bluebird/js/release/async.js:16:14)
at processImmediate [as _immediateCallback] (timers.js:383:17)
I get your original error further down, but the underlying cause is the connection reset shown above. When that happens, document.documentElement ends up being a falsy value, which in turn makes this assertion in the field function in zombie/lib/index.js fail:
assert(this.document && this.document.documentElement, 'No open window with an HTML document');
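Since fill() throws when no document is open, one way to keep a reset connection from taking the whole server down is to check each step before moving on. The following is only a sketch: searchSafely and done are hypothetical names, and it assumes that Zombie passes load failures such as ECONNRESET to the visit and pressButton callbacks as their first argument.

```javascript
// Sketch: guard each Zombie step so a failed page load is reported to the
// caller instead of surfacing as an uncaught AssertionError.
// `browser` is a Zombie Browser (or any object with the same
// visit/fill/pressButton shape); `done` is a Node-style callback.
function searchSafely(browser, url, input, done) {
  browser.visit(url, function (visitError) {
    // Assumption: a load failure (e.g. ECONNRESET) arrives here as an error.
    if (visitError) {
      return done(visitError);
    }
    try {
      // fill() throws synchronously when there is no open HTML document.
      browser.fill('q', input).pressButton('Google Search', function (pressError) {
        done(pressError || null, browser);
      });
    } catch (fillError) {
      done(fillError);
    }
  });
}
```

In getAbc, the done callback could then answer with an error status (for example 502) when it receives an error, and write the parsed result otherwise.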
I think the easiest solution is to handle the error on the client end and try to recover gracefully.
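As a rough illustration of recovering on the client, a failed call can be retried after a growing delay. This is a minimal sketch, not code from the question: retryWithBackoff, maxAttempts and baseDelayMs are hypothetical names, and operation stands for any function that returns a Promise, such as a wrapper around the POST to /xyz.

```javascript
// Sketch: retry an async operation with exponential backoff.
// `operation` is any function returning a Promise.
function retryWithBackoff(operation, maxAttempts, baseDelayMs) {
  function attempt(n) {
    // Promise.resolve().then(...) also turns a synchronous throw into a rejection.
    return Promise.resolve().then(operation).catch(function (error) {
      if (n >= maxAttempts) {
        throw error; // out of attempts: surface the last error
      }
      var delay = baseDelayMs * Math.pow(2, n - 1); // base, 2x base, 4x base, ...
      return new Promise(function (resolve) {
        setTimeout(resolve, delay);
      }).then(function () {
        return attempt(n + 1);
      });
    });
  }
  return attempt(1);
}
```

Waiting for each iteration of the loop to finish before starting the next one, rather than firing them all at once, should also keep the request rate down.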
Answer 2:
I see you are creating a new Browser instance for each call. My guess is that the previous Browser is still closing, or hasn't yet been handled by the garbage collector, when the next call tries to open another. Try moving the instantiation of the Browser outside of getAbc().
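If a single Browser is shared this way, overlapping requests would presumably interfere with each other, since they would all drive the same window. The scrapes would then need to run one at a time. A minimal sketch of such a queue follows; createSerialQueue and enqueue are hypothetical names, not part of Zombie or Express.

```javascript
// Sketch: run async tasks one at a time, in arrival order.
// Each `task` is a function returning a Promise.
function createSerialQueue() {
  var tail = Promise.resolve();
  return function enqueue(task) {
    // Start `task` only after everything queued before it has settled.
    var result = tail.then(function () {
      return task();
    });
    // Swallow the failure on the internal chain so later tasks still run;
    // the caller still sees the rejection through `result`.
    tail = result.catch(function () {});
    return result;
  };
}
```

Each request handler would then wrap its scrape in a task and pass it to enqueue, answering the client when the returned Promise settles.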
Source: https://stackoverflow.com/questions/35563187/zombiejs-intermittently-crashes-when-called-repeatedly-from-a-for-loop