I have a stream that I process by listening for the data, error, and end events, and I call a function to process each data event in the first stream. Naturally, the function processing the data calls other callbacks, making it asynchronous. So how do I start executing more code once the data in the stream has been processed? Listening for the end event on the stream does NOT mean the asynchronous data processing functions have finished.
How can I ensure that the stream data processing functions are finished when I execute my next statement?
Here is an example:
function updateAccountStream (accountStream, callThisOnlyAfterAllAccountsAreMigrated) {
    var self = this;
    var promises = [];
    accountStream
        .on('data', function (account) {
            migrateAccount.bind(self)(account, finishMigration);
        })
        .on('error', function (err) {
            return console.log(err);
        })
        .on('end', function () {
            console.log("Finished updating account stream (but finishMigration is still running!!!)");
            callThisOnlyAfterAllAccountsAreMigrated(); // finishMigration is still running!
        });
}
var migrateAccount = function (oldAccount, callback) {
    executeSomeAction(oldAccount, function (err, newAccount) {
        if (err) return console.log("error received:", err);
        return callback(newAccount);
    });
};

var finishMigration = function (newAccount) {
    // some code that is executed asynchronously...
};
How do I ensure that callThisOnlyAfterAllAccountsAreMigrated is called AFTER the stream has been processed?
Can this be done with promises? Can it be done with through streams? I am working with Node.js, so referencing other npm modules could be helpful.
As you said, listening for the end event on the stream is useless on its own. The stream doesn't know or care what you're doing with the data in your data handler, so you would need to write some code to keep track of your own migrateAccount state.
If it were me, I would rewrite this whole section. If you use the readable event with .read() on your stream, you can read as many items at a time as you feel like dealing with. If that's one, no problem. If it's 30, great. The reason you do this is so that you won't overrun whatever is doing work with the data coming from the stream. As it is right now, if accountStream is fast, your application will undoubtedly crash at some point.
When you read an item from a stream and start work, take the promise you get back (use Bluebird or similar) and throw it into an array. When the promise is resolved, remove it from the array. When the stream ends, attach a .done() handler to .all() (basically making one big promise out of every promise still in the array).
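Here is a minimal sketch of that idea, using native Promises rather than Bluebird; migrateAccountAsync is a hypothetical promise-returning wrapper around the question's migrateAccount, and pruning settled promises from the array is left out for brevity:
function updateAccountStream (accountStream, callThisOnlyAfterAllAccountsAreMigrated) {
    var pending = [];
    accountStream
        .on('readable', function () {
            var account;
            // Pull items off the stream; read fewer per pass if you want to
            // throttle how much work is in flight at once.
            while ((account = accountStream.read()) !== null) {
                pending.push(migrateAccountAsync(account));
            }
        })
        .on('error', function (err) {
            console.log(err);
        })
        .on('end', function () {
            // One big promise over every migration still running.
            Promise.all(pending).then(callThisOnlyAfterAllAccountsAreMigrated);
        });
}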
You could also use a simple counter for jobs in progress.
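Roughly, assuming processAccount is a hypothetical function that calls its done callback only once all asynchronous work for that account (including finishMigration) has completed:
function updateAccountStream (accountStream, callThisOnlyAfterAllAccountsAreMigrated) {
    var inFlight = 0;
    var ended = false;

    function maybeFinish () {
        // Finish only when the stream has ended AND no jobs are still running.
        if (ended && inFlight === 0) callThisOnlyAfterAllAccountsAreMigrated();
    }

    accountStream
        .on('data', function (account) {
            inFlight++;
            processAccount(account, function () {
                inFlight--;
                maybeFinish();
            });
        })
        .on('error', function (err) {
            console.log(err);
        })
        .on('end', function () {
            ended = true;
            maybeFinish();
        });
}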
Using a through stream (the npm through2 module), I solved this problem with the following code, which controls the asynchronous behaviour:
var through = require('through2').obj;

function updateAccountStream (accountStream, callThisOnlyAfterAllAccountsAreMigrated) {
    var self = this;
    var promises = [];
    accountStream.pipe(through(function (account, _, next) {
            migrateAccount.bind(self)(account, finishMigration, next);
        }))
        .on('data', function (account) {
            // empty handler, but it puts the stream into flowing mode so 'end' fires
        })
        .on('error', function (err) {
            return console.log(err);
        })
        .on('end', function () {
            console.log("Finished updating account stream");
            callThisOnlyAfterAllAccountsAreMigrated();
        });
}
var migrateAccount = function (oldAccount, callback, next) {
    executeSomeAction(oldAccount, function (err, newAccount) {
        if (err) return console.log("error received:", err);
        return callback(newAccount, next);
    });
};

var finishMigration = function (newAccount, next) {
    // some code that is executed asynchronously, but using the 'next' callback when migration is finished...
};
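For illustration, finishMigration might look something like this; saveNewAccount is a hypothetical asynchronous operation, and the essential point is that next is only called once the migration work has actually completed:
var finishMigration = function (newAccount, next) {
    // saveNewAccount is a hypothetical async operation; any async work fits,
    // as long as 'next' is invoked when it completes so through2 moves on to
    // the next account (and eventually lets 'end' fire).
    saveNewAccount(newAccount, function (err) {
        if (err) console.log("error received:", err);
        next();
    });
};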
It is a lot easier when you handle streams via promises.
Copied from here, an example that uses the spex library:
var spex = require('spex')(Promise);
var fs = require('fs');

var rs = fs.createReadStream('values.txt');

function receiver(index, data, delay) {
    return new Promise(function (resolve) {
        console.log("RECEIVED:", index, data, delay);
        resolve(); // ok to read the next data;
    });
}

spex.stream.read(rs, receiver)
    .then(function (data) {
        // streaming successfully finished;
        console.log("DATA:", data);
    }, function (reason) {
        // streaming has failed;
        console.log("REASON:", reason);
    });
Source: https://stackoverflow.com/questions/30465092/how-to-ensure-asynchronous-code-is-executed-after-a-stream-is-finished-processin