I 'borrowed' a regex from this website: http://daringfireball.net/2010/07/improved_regex_for_matching_urls. It is almost complete, but I also want it to match a bare domain such as example.com
I know
Please check if
var reg=/\b((?:[a-z][\w-]+:(?:\/*)|(?:www\d{0,3}[.])|[a-z0-9.\-]+[.][a-z]{2,4}\/{0,1})(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))*(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi;
suits your needs. In effect, the www(number) prefix is now optional (zero or one occurrence), because the slash after the {2,4} suffix was made optional with \/{0,1}. Sorry about my first answer; I had not noticed the text.
Add an alternation operator (|) after the {2,4}\/, i.e.
var reg=/\b((?:[a-z][\w-]+:(?:\/*)|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/|)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi;
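As a quick sanity check, the modified pattern can be run against the bare domain from the question and against an ordinary URL. This is only a sketch: the regex is copied verbatim from above, while the `findUrls` helper name and the sample strings are assumptions made for illustration.

```javascript
// Modified pattern from the answer above, copied verbatim.
var reg = /\b((?:[a-z][\w-]+:(?:\/*)|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/|)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi;

// Hypothetical helper: return every match of `reg` in `text` as an array.
// String.prototype.match with the g flag returns all full matches, or null.
function findUrls(text) {
  return text.match(reg) || [];
}

console.log(findUrls('example.com'));             // ['example.com']
console.log(findUrls('http://example.com/page')); // ['http://example.com/page']
```

Keep test inputs short: the nested quantifiers in this pattern can backtrack heavily on long strings that end in punctuation.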
There's something you should understand about this. The first non-capturing group, (?: … ), looks for "indicators" of URLs. One indicator, for example, is www (followed by up to 3 digits and a dot). You, however, are asking for a way to identify URLs without any indicator at all. So what we've done above is add a clause, "or an empty match," as a "valid" indicator. The consequence is that your regular expression is now less selective: all sorts of strings, not only example.com but also filename.txt, 3.141593, and omg...really, are identified as URLs! Your only other (readily available) option is to be more selective about suffixes, e.g. require specific suffixes (com|org|net), but this takes away from the generality of the original regex, which doesn't specify any suffixes at all.
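The trade-off described above can be seen side by side in a small sketch. The permissive pattern is the one from this answer, copied verbatim. The `bareDomain` pattern, the `SUFFIXES` list, and the two helper functions are assumptions made for illustration: a deliberately simplified suffix allow-list, not a drop-in replacement for the original regex.

```javascript
// Permissive pattern from this answer (empty alternative added), copied verbatim.
var reg = /\b((?:[a-z][\w-]+:(?:\/*)|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/|)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi;

// Assumption: a short, illustrative allow-list. A real suffix list is far longer.
var SUFFIXES = ['com', 'org', 'net'];

// Simplified sketch: one or more dot-terminated labels followed by an allowed suffix.
var bareDomain = new RegExp('\\b(?:[a-z0-9-]+\\.)+(?:' + SUFFIXES.join('|') + ')\\b', 'gi');

// Hypothetical helpers. match() is used instead of test() because a regex
// with the g flag keeps state in lastIndex between test() calls.
function matchesPermissive(s) { return s.match(reg) !== null; }
function matchesSuffixList(s) { return s.match(bareDomain) !== null; }

['example.com', 'filename.txt', '3.141593', 'omg...really'].forEach(function (s) {
  console.log(s, 'permissive:', matchesPermissive(s), 'suffix list:', matchesSuffixList(s));
});
```

The permissive pattern accepts all four strings, while the suffix list accepts only example.com, at the cost of rejecting any domain whose suffix is not in the list.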
In other words, you are probably faced with a limitation of logic, not a limitation of regex-writing skills or the regex language itself.