How do I check to see if a URL exists without pulling it down? I use the following code, but it downloads the whole file. I just need to check that it exists.
Thanks! Here it is, encapsulated in a function (updated on 5/30/17 with the require outside):
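var http = require('http'),
    url = require('url');

exports.checkUrlExists = function (Url, callback) {
    var parsed = url.parse(Url);
    var options = {
        method: 'HEAD',
        host: parsed.hostname,       // hostname, unlike host, excludes any ":port" suffix
        port: parsed.port || 80,     // plain http only; https URLs need the https module
        path: parsed.path || '/'     // pathname plus query string
    };
    var req = http.request(options, function (r) {
        callback(r.statusCode === 200);
    });
    // Without this handler, a DNS or connection failure throws an uncaught error
    req.on('error', function () { callback(false); });
    req.end();
};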
It's very quick (I get about 50 ms, but it will depend on your connection and the server speed). Note that it's also quite basic, i.e. it won't handle redirects very well...
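To make the redirect caveat concrete, here is a minimal sketch of the same HEAD check that also picks `http` or `https` from the URL and follows `Location` headers. The limit of 5 redirects and the choice to treat any 2xx as "exists" are my assumptions, not part of the function above, and the optional `base` parameter is only there so that relative redirects can be resolved.

```javascript
const http = require('http');
const https = require('https');

// Sketch only: HEAD request that follows up to `maxRedirects` redirects.
// `base` is used internally to resolve relative Location headers.
function checkUrlExists(target, callback, maxRedirects = 5, base) {
    let parsed;
    try {
        parsed = new URL(target, base);
    } catch (err) {
        return callback(false); // not a usable URL
    }
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        return callback(false);
    }
    const client = parsed.protocol === 'https:' ? https : http;

    const req = client.request(parsed, { method: 'HEAD' }, (res) => {
        res.resume(); // a HEAD response has no body, but this releases the socket
        const status = res.statusCode;
        if (status >= 300 && status < 400 && res.headers.location) {
            if (maxRedirects <= 0) return callback(false); // probably a redirect loop
            return checkUrlExists(res.headers.location, callback, maxRedirects - 1, parsed);
        }
        callback(status >= 200 && status < 300);
    });
    req.on('error', () => callback(false)); // DNS failure, refused connection, etc.
    req.end();
}
```

It is called the same way as the original, e.g. `checkUrlExists('https://example.com/some/page', exists => console.log(exists))`, where the URL is just a placeholder.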
I see in your code that you are already using the request library, so just:
const request = require('request');
request.head('http://...', (error, res) => {
const exists = !error && res.statusCode === 200;
});
The request module is now deprecated, as @schlicki pointed out. One of the alternatives in the link he posted is got:
const got = require('got');
(async () => {
try {
const response = await got('https://www.nodesource.com/');
console.log(response.body);
//=> '<!doctype html> ...'
} catch (error) {
console.log(error.response.body);
//=> 'Internal server error ...'
}
})();
But with this method, you will get the whole HTML page in response.body. In addition, got may have many more features than you need. That's why I wanted to add another alternative I found to the list. As I was already using the portscanner library, I could use it for the same purpose without downloading the content of the website. You may need to check port 443 as well if the website works with https:
var portscanner = require('portscanner')
// Checks the status of a single port
portscanner.checkPortStatus(80, 'www.google.es', function(error, status) {
// Status is 'open' if currently in use or 'closed' if available
console.log(status)
})
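Keep in mind that an open port only tells you that something is listening on that host; it says nothing about whether a particular path exists. If you want to cover both 80 and 443 without an extra dependency, here is a rough sketch using Node's built-in `net` module instead of portscanner. The helper names and the 2-second timeout are my own choices for illustration.

```javascript
const net = require('net');

// Sketch only: resolves true if a TCP connection to host:port succeeds.
function isPortOpen(port, host, timeoutMs = 2000) {
    return new Promise((resolve) => {
        const socket = net.connect({ port, host });
        const finish = (open) => {
            socket.destroy(); // we only wanted the handshake, not a conversation
            resolve(open);
        };
        socket.setTimeout(timeoutMs);
        socket.once('connect', () => finish(true));
        socket.once('timeout', () => finish(false));
        socket.once('error', () => finish(false));
    });
}

// Sketch only: true if the host accepts connections on port 80 or port 443.
async function hostIsListening(host) {
    const results = await Promise.all([80, 443].map((port) => isPortOpen(port, host)));
    return results.some(Boolean);
}
```

Usage would be something like `hostIsListening('www.google.es').then(console.log)`.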
Anyway, the closest approach is the url-exist module, as @Richie Bendall explains in his post. I just wanted to add another alternative.
My awaitable async ES6 solution, doing a HEAD request:
// options for the http request
let options = {
host: 'google.de',
//port: 80, optional
//path: '/' optional
}
const http = require('http');
// creating a promise (all promises can be awaited; the await itself has to run inside an async function)
let isOk = await new Promise(resolve => {
    // trigger the request ('HEAD' or 'GET' - you should check if you get the expected result for a HEAD request first (curl))
    // then trigger the callback
    http.request({method:'HEAD', host:options.host, port:options.port, path: options.path}, result =>
        resolve(result.statusCode >= 200 && result.statusCode < 400)
    ).on('error', () => resolve(false)).end(); // resolve(false), because an Error object passed to resolve would be truthy
});
// check if the result was NOT ok
if (!isOk)
console.error('could not get: ' + options.host);
else
console.info('url exists: ' + options.host);
If you're using axios, you can fetch the head like:
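const axios = require('axios');

const checkUrl = async (url) => {
    try {
        await axios.head(url);
        return true;
    } catch (error) {
        // error.response is undefined when no response arrived (DNS failure, refused connection)
        if (!error.response || error.response.status >= 400) {
            return false;
        }
        throw error; // anything else is unexpected, so surface it instead of returning undefined
    }
}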
You may want to customise the status code range for your requirements e.g. 401 (Unauthorized) could still mean a URL exists but you don't have access.
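As a rough illustration of that customisation, the status code can be mapped to more than a plain true/false. The function name and the categories below are my own invention, and which codes count as "exists" is a policy decision you should adjust for your case.

```javascript
// Sketch only: maps an HTTP status code to a coarse answer.
function classifyStatus(status) {
    if (status >= 200 && status < 400) return 'exists';
    // The server answered for this URL, but access is denied
    if (status === 401 || status === 403) return 'restricted';
    if (status === 404 || status === 410) return 'missing';
    // e.g. 429 or 5xx: the server could not give a clear answer
    return 'unknown';
}
```

With the axios version, the value to pass in from the catch branch would be `error.response.status`, after checking that `error.response` is defined.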