I would like to match just the root of a URL and not the whole URL from a text string. Given:
http://www.youtube.co
If you end up on this page and you are looking for the best REGEX of URLS try this one:
^(?:https?:)?(?:\/\/)?([^\/\?]+)
https://regex101.com/r/pX5dL9/1
It works for urls without http:// , with http, with https, with just // and dont grab the path and query path as well.
Good Luck
Was looking for a solution to this problem today. None of the above answers seemed to satisfy. I wanted a solution that could be a one liner, no conditional logic and nothing that had to be wrapped in a function.
Here's what I came up with, seems to work really well:
hostname="http://www.example.com:1234" hostname.split("//").slice(-1)[0].split(":")[0].split('.').slice(-2).join('.') // gives "example.com"
May look complicated at first glance, but it works pretty simply; the key is using 'slice(-n)' in a couple of places where the good part has to be pulled from the end of the split array (and [0] to get from the front of the split array).
Each of these tests return "example.com":
"http://example.com".split("//").slice(-1)[0].split(":")[0].split('.').slice(-2).join('.') "http://example.com:1234".split("//").slice(-1)[0].split(":")[0].split('.').slice(-2).join('.') "http://www.example.com:1234".split("//").slice(-1)[0].split(":")[0].split('.').slice(-2).join('.') "http://foo.www.example.com:1234".split("//").slice(-1)[0].split(":")[0].split('.').slice(-2).join('.')
There is no need to parse the string, just pass your URL as an argument to URL constructor:
var url = 'http://www.youtube.com/watch?v=ClkQA2Lb_iE';
var hostname = (new URL(url)).hostname;
assert(hostname === 'www.youtube.com');
I tried to use the Given solutions, the Chosen one was an overkill for my purpose and "Creating a element" one messes up for me.
It's not ready for Port in URL yet. I hope someone finds it useful
function parseURL(url){
parsed_url = {}
if ( url == null || url.length == 0 )
return parsed_url;
protocol_i = url.indexOf('://');
parsed_url.protocol = url.substr(0,protocol_i);
remaining_url = url.substr(protocol_i + 3, url.length);
domain_i = remaining_url.indexOf('/');
domain_i = domain_i == -1 ? remaining_url.length - 1 : domain_i;
parsed_url.domain = remaining_url.substr(0, domain_i);
parsed_url.path = domain_i == -1 || domain_i + 1 == remaining_url.length ? null : remaining_url.substr(domain_i + 1, remaining_url.length);
domain_parts = parsed_url.domain.split('.');
switch ( domain_parts.length ){
case 2:
parsed_url.subdomain = null;
parsed_url.host = domain_parts[0];
parsed_url.tld = domain_parts[1];
break;
case 3:
parsed_url.subdomain = domain_parts[0];
parsed_url.host = domain_parts[1];
parsed_url.tld = domain_parts[2];
break;
case 4:
parsed_url.subdomain = domain_parts[0];
parsed_url.host = domain_parts[1];
parsed_url.tld = domain_parts[2] + '.' + domain_parts[3];
break;
}
parsed_url.parent_domain = parsed_url.host + '.' + parsed_url.tld;
return parsed_url;
}
Running this:
parseURL('https://www.facebook.com/100003379429021_356001651189146');
Result:
Object {
domain : "www.facebook.com",
host : "facebook",
path : "100003379429021_356001651189146",
protocol : "https",
subdomain : "www",
tld : "com"
}
Here's the jQuery one-liner:
$('<a>').attr('href', url).prop('hostname');
import URL from 'url';
const pathname = URL.parse(url).path;
console.log(url.replace(pathname, ''));
this takes care of both the protocol.