I have a function in my nodejs application called get_source_at. It takes a uri as an argument and its purpose is to return the source code from that uri. My problem is that
You should avoid synchronous requests. If you want something like synchronous control flow, you can use async.
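var async = require("async");

async.waterfall([
  function (callback) {
    // get_source_at must itself be asynchronous and call back with (err, data)
    get_source_at(uri, callback);
  },
  function (data, callback) {
    // process is your own function here, not Node's global process object
    process(data, callback);
  }
], function (err, result) {
  if (err) return console.error(err);
  console.log(result);
});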
process is guaranteed to run only after get_source_at has delivered its data.
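The ordering guarantee can be illustrated with a simplified, hand-rolled version of the waterfall idea. This is only a sketch of the pattern, not the async library's actual implementation; waterfall, processSource and the stubbed get_source_at below are hypothetical names, and setImmediate stands in for a real network call.

```javascript
// Simplified sketch of the waterfall pattern (NOT the async library's code).
// Each task receives the previous task's results plus a callback.
function waterfall(tasks, done) {
  var index = 0;
  function next(err) {
    var results = Array.prototype.slice.call(arguments, 1);
    // Stop on the first error, or when every task has run.
    if (err || index === tasks.length) {
      return done.apply(null, [err || null].concat(results));
    }
    var task = tasks[index++];
    task.apply(null, results.concat(next));
  }
  next(null);
}

// Hypothetical asynchronous stubs so the sketch runs without a network.
function get_source_at(uri, callback) {
  setImmediate(function () { callback(null, 'source of ' + uri); });
}
function processSource(data, callback) {
  setImmediate(function () { callback(null, data.toUpperCase()); });
}

var outcome;
waterfall([
  function (callback) { get_source_at('http://example.com', callback); },
  function (data, callback) { processSource(data, callback); }
], function (err, result) {
  // Runs last, once both steps have called back.
  outcome = { err: err, result: result };
});
```

The second step cannot start early because it is only ever invoked from inside the first step's callback.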
OK, first of all: to keep that code asynchronous, you can simply place the relevant code inside the callback of the request function. It will then run after the request has finished, without blocking your application from handling other tasks. If you need this in several places, I would advise you to check out "Synchronous request in Node.js", which outlines various methods to streamline this and discusses several control flow libraries.
I have to have a way to make sure I have the source code from a uri before continuing the control flow of my application - so if that's not by making the function synchronous, how can it be done?
Given this entry point to your application:
function app(body) {
// Doing lots of rad stuff
}
You kick it off by fetching the body:
request({ uri: uri }, function (error, response, body) {
  if (error) return console.error(error);
  // Start application
  app(body);
});
This is something you will have to get used to when programming for Node.js (and JavaScript in general). There are control flow modules like async (which I, too, recommend), but you still need to become comfortable with continuation-passing style, as it's called.
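To make the continuation-passing idea concrete, here is a minimal sketch that needs no network access. fakeRequest is a hypothetical stand-in for request (it is not a real API) that hands back a canned body on a later turn of the event loop, and the events array only records the order in which things happen.

```javascript
// Hypothetical stand-in for request(): calls back asynchronously with
// (error, response, body), the same argument order used in the answer.
function fakeRequest(options, callback) {
  setImmediate(function () {
    callback(null, { statusCode: 200 }, '<html>source of ' + options.uri + '</html>');
  });
}

var events = [];

// Entry point: everything that needs the body lives in here.
function app(body) {
  events.push('app started with ' + body.length + ' characters');
}

fakeRequest({ uri: 'http://example.com' }, function (error, response, body) {
  if (error) return console.error(error);
  app(body);
});

// This line runs before app(), because the callback has not fired yet.
events.push('request sent');
```

The rest of the program is the continuation: instead of waiting for a return value, you hand the next step to the function that produces the data.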
This is a better way of using deasync.
var request = require("request")
var deasync = require("deasync")
var getHtml = deasync(function (url, cb) {
var userAgent = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"}
request({
url: url,
headers: userAgent
},
function (err, resp, body) {
if (err) { return cb(err, null) }
cb(null, body)
})
})
var title = /<title>(.*?)<\/title>/
var myTitle = getHtml("http://www.yahoo.com").match(title)[1]
console.log(myTitle)
Please refer to the deasync documentation: you can use deasync(function (n params, cb) {}) to wrap any function whose callback cb comes back with (err, data). Functions like fs.readFile() can therefore be wrapped with deasync directly. Functions like request, whose callback does not have the cb(err, data) shape, need a small wrapper: write your own function (named or anonymous) with a custom cb(err, data) callback, as I have done in the code above. This way you can make almost any async function behave synchronously by waiting for cb(err, data) to come back on a different JavaScript layer (as the documentation says). Also make sure that every path out of the function you are wrapping ends in a cb(err, data) call; otherwise your program will block.
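The adapter idea can be sketched on its own, without deasync, so that it runs with no extra packages. fetchPage is a hypothetical function with a three-argument callback (it is not part of request or deasync), and getBody is an assumed name for the wrapper that exposes the cb(err, data) shape. Handing getBody to deasync would then be the step described above.

```javascript
// Hypothetical function with a non-standard callback: (error, response, body).
function fetchPage(url, callback) {
  setImmediate(function () {
    if (url === '') return callback(new Error('empty url'));
    callback(null, { statusCode: 200 }, '<title>Example</title>');
  });
}

// Adapter exposing the cb(err, data) shape.
// Every branch ends in exactly one cb call, so a caller never waits forever.
function getBody(url, cb) {
  fetchPage(url, function (err, response, body) {
    if (err) return cb(err);
    if (response.statusCode !== 200) {
      return cb(new Error('HTTP ' + response.statusCode));
    }
    cb(null, body);
  });
}

// Record what comes back for a good and a bad URL.
var seen = {};
getBody('http://example.com', function (err, data) {
  seen.good = { err: err, data: data };
});
getBody('', function (err, data) {
  seen.bad = { err: err, data: data };
});
```

The early return on each error branch is what prevents the callback from firing twice.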
Hope it helps someone out there!
Update:
Don't do synchronous requests this way. Use async/await to write promise-based code that looks synchronous. You can use the request-promise-native npm module to avoid wrapping the request module with promises yourself.
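As a rough illustration of the async/await style, the sketch below uses Node's built-in util.promisify on a hypothetical callback-style function, getSourceAt, which returns a canned page instead of making a real HTTP call. With request-promise-native you would await that module's promise instead; the stub is only there so the example runs without any npm packages.

```javascript
const { promisify } = require('util');

// Hypothetical callback-style fetcher standing in for a real HTTP call.
function getSourceAt(uri, callback) {
  setImmediate(() => callback(null, '<title>Example</title>'));
}

// promisify turns a function taking (args..., cb(err, data)) into one
// that returns a promise.
const getSourceAtAsync = promisify(getSourceAt);

async function main() {
  // Reads like synchronous code, but does not block the event loop.
  const body = await getSourceAtAsync('http://example.com');
  const match = body.match(/<title>(.*?)<\/title>/);
  return match ? match[1] : null;
}

const titlePromise = main();
titlePromise.then((title) => console.log(title));
```

Errors surface as rejected promises, so a try/catch around the await (or a .catch on the returned promise) takes the place of the err argument.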
Having a simple blocking function is a great boon for interactive development! The sync function (defined below) can synchronize any promise, dramatically cutting down on the amount of syntax needed to play with an API and learn it. For example, here's how to use it with the puppeteer library for headless Chrome:
var browser = sync(puppeteer.connect({ browserWSEndpoint: "ws://some-endpoint"}));
var pages = sync(browser.pages())
pages.length
1
var page = pages[0]
sync(page.goto('https://duckduckgo.com', {waitUntil: 'networkidle2'}))
sync(page.pdf({path: 'webpage.pdf', format: 'A4'}))
The best part is, each one of these lines can be tweaked until it does what you want, without having to re-run or re-type all of the previous lines each time you want to test it. This works because you have direct access to the browser and pages variables from the top level.
Here's how it works:
const deasync = require("deasync");
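// Resolve with callback(null, result); pass rejections on as callback(error)
const sync = deasync((promise, callback) => promise.then(result => callback(null, result), error => callback(error)));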
It uses the deasync package mentioned in other answers. deasync wraps the anonymous function: it appends callback as the last argument and blocks until callback has been called. callback receives the error condition as its first argument (if any), and the result as its second (if any).
You can do it with deasync:
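var request = require('request');
var deasync = require('deasync');

function get_source_at(uri) {
  var source, failure, done = false;
  request({ uri: uri }, function (error, response, body) {
    failure = error;
    source = body;
    done = true;
    console.log(body);
  });
  // Pump the event loop until the callback has fired; checking a flag rather
  // than the body avoids looping forever when the request fails.
  while (!done) {
    deasync.runLoopOnce();
  }
  if (failure) throw failure;
  return source;
}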