I want a solution that validates only domain names, not full URLs. The following examples show what I'm looking for:
domain.com -> true
domain.net -> true
do
In my case, a domain name is considered valid if its format is stackoverflow.com or xxx.stackoverflow.com.
So, in addition to what the other answers do, I have also added a check for a leading "www.".
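function isValidDomainName($domain) {
    // gethostbyname() returns its input unchanged when the lookup fails,
    // so this only passes if the name resolved to an IP address.
    if (filter_var(gethostbyname($domain), FILTER_VALIDATE_IP)) {
        // Reject names starting with "www." (the dot is escaped so it matches literally).
        return (preg_match('/^www\./', $domain)) ? FALSE : TRUE;
    }
    return FALSE;
}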
You can test the function with this code:
$domain = array("http://www.domain.com", "http://www.domain.com/folder", "http://domain.com", "www.domain.com", "domain.com/subfolder", "domain.com", "sub.domain.com");
foreach ($domain as $v) {
    echo isValidDomainName($v) ? "{$v} is valid<br>" : "{$v} is invalid<br>";
}
How about:
^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$
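For anyone wanting to try that pattern from PHP, here is a minimal sketch using preg_match(). The function name matchesSimpleDomainPattern is made up for illustration. Note that the {2,6} limit on the last label means longer TLDs are rejected.

```php
<?php
// Hypothetical helper (name invented for this example): wraps the pattern
// above in preg_match() and returns a plain boolean.
function matchesSimpleDomainPattern(string $domain): bool
{
    // '/' is used as the delimiter; the pattern itself contains no slash.
    return preg_match('/^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$/', $domain) === 1;
}

var_dump(matchesSimpleDomainPattern('domain.com'));        // bool(true)
var_dump(matchesSimpleDomainPattern('sub.domain.co.uk'));  // bool(true)
var_dump(matchesSimpleDomainPattern('http://domain.com')); // bool(false)
var_dump(matchesSimpleDomainPattern('good.photography'));  // bool(false), TLD is longer than 6 letters
```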
Please try this expression:
^(http[s]?\:\/\/)?((\w+)\.)?(([\w-]+)?)(\.[\w-]+){1,2}$
Remember, regexes can only check to see if something is well formed. "www.idonotexistbecauseiammadeuponthespot.com" is well-formed, but doesn't actually exist... at the time of writing. ;) Furthermore, certain free web hosting providers (like Tripod) allow underscores in subdomains. This is clearly a violation of the RFCs, yet it sometimes works.
Do you want to check if the domain exists? Try dns_get_record instead of (just) a regex.
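A rough sketch of what a dns_get_record() check could look like. The function name domainHasDnsRecords and the optional $resolver parameter are assumptions added here so the lookup can be swapped out (for example in tests); only A records are queried to keep the example small.

```php
<?php
// Hypothetical helper: reports whether a hostname has at least one A record.
// $resolver is an illustrative extra, not part of any PHP API; when omitted,
// PHP's own dns_get_record() is used.
function domainHasDnsRecords(string $domain, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? static function (string $host) {
        // @ suppresses the warning dns_get_record() can raise on lookup failure.
        return @dns_get_record($host, DNS_A);
    };

    $records = $resolver($domain);

    // dns_get_record() returns false on failure and an array otherwise.
    return is_array($records) && count($records) > 0;
}

// Usage with a stub resolver, so no network access is needed:
$stub = static function (string $host) {
    return $host === 'example.com'
        ? array(array('host' => 'example.com', 'type' => 'A'))
        : false;
};
var_dump(domainHasDnsRecords('example.com', $stub));     // bool(true)
var_dump(domainHasDnsRecords('no-such.invalid', $stub)); // bool(false)
```

In real use you would call domainHasDnsRecords($domain) without the second argument. Keep in mind that this performs a network lookup, so it is slower than a regex and can fail for reasons unrelated to the domain itself.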
I made a function to validate the domain name without any regex.
<?php
function validDomain($domain) {
    // Ignore a trailing root dot (e.g. "google.com.").
    $domain = rtrim($domain, '.');
    // Require at least one dot that is not the first character.
    if (!mb_stripos($domain, '.')) {
        return false;
    }
    $domain = explode('.', $domain);
    $allowedChars = array('-');
    $extension = array_pop($domain);
    foreach ($domain as $value) {
        $fc = mb_substr($value, 0, 1);
        $lc = mb_substr($value, -1);
        // A label must not be empty or start/end with a hyphen.
        if (
            $value === ''
            || in_array($fc, $allowedChars)
            || in_array($lc, $allowedChars)
        ) {
            return false;
        }
        // Apart from hyphens, a label may contain only letters and digits.
        if (!ctype_alnum(str_replace($allowedChars, '', $value))) {
            return false;
        }
    }
    // The extension (TLD) must be non-empty and alphanumeric once hyphens are removed.
    if (
        !ctype_alnum(str_replace($allowedChars, '', $extension))
        || $extension === ''
    ) {
        return false;
    }
    return true;
}
$testCases = array(
'a',
'0',
'a.b',
'google.com',
'news.google.co.uk',
'xn--fsqu00a.xn--0zwm56d',
'google.com ',
'google.com.',
'goo gle.com',
'a.',
'hey.hey',
'google-.com',
'-nj--9*.vom',
' ',
'..',
'google..com',
'www.google.com',
'www.google.com/some/path/to/dir/'
);
foreach ($testCases as $testCase) {
    var_dump($testCase);
    // PHP variable names are case-sensitive, so this must be $testCase, not $TestCase.
    var_dump(validDomain($testCase));
    echo '<br /><br />';
}
?>
This code outputs:
string(1) "a" bool(false)
string(1) "0" bool(false)
string(3) "a.b" bool(true)
string(10) "google.com" bool(true)
string(17) "news.google.co.uk" bool(true)
string(23) "xn--fsqu00a.xn--0zwm56d" bool(true)
string(11) "google.com " bool(false)
string(11) "google.com." bool(true)
string(11) "goo gle.com" bool(false)
string(2) "a." bool(false)
string(7) "hey.hey" bool(true)
string(11) "google-.com" bool(false)
string(11) "-nj--9*.vom" bool(false)
string(1) " " bool(false)
string(2) ".." bool(false)
string(11) "google..com" bool(false)
string(14) "www.google.com" bool(true)
string(32) "www.google.com/some/path/to/dir/" bool(false)
I hope I have covered everything. If I missed something, please tell me and I will improve this function. :)
The accepted answer is incomplete/wrong.
The regex pattern:
should NOT validate domains such as: -domain.com, domain--.com, -domain-.-.com, domain.000, etc.
should validate domains such as: schools.k12, newTLD.clothing, good.photography, etc.
After some further research, below is the most correct, cross-language and compact pattern I could come up with:
^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$
This pattern conforms with most* of the rules defined in the specs:
Note 1: The full domain length check is not included in the regex. It should be simply checked by native methods e.g. strlen(domain) <= 253.
Note 2: This pattern works with most languages including PHP, Javascript, Python, etc...
See DEMO here (for JS, PHP, Python)
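As a hedged illustration of how the pattern and the length check from Note 1 might be combined in PHP (the function name isWellFormedDomain is invented for this sketch):

```php
<?php
// Hypothetical wrapper: applies the pattern above together with the
// overall length limit from Note 1. This is format validation only.
function isWellFormedDomain(string $domain): bool
{
    $pattern = '/^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$/';

    // The regex does not limit the total length, so that is checked separately.
    return strlen($domain) <= 253 && preg_match($pattern, $domain) === 1;
}

var_dump(isWellFormedDomain('schools.k12'));      // bool(true)
var_dump(isWellFormedDomain('good.photography')); // bool(true)
var_dump(isWellFormedDomain('-domain.com'));      // bool(false)
var_dump(isWellFormedDomain('domain--.com'));     // bool(false)
var_dump(isWellFormedDomain('domain.000'));       // bool(false)
```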
The regex above does not support IDNs.
There is no spec that says the extension (TLD) should be between 2 and 6 characters. It actually supports 63 characters. See the current TLD list here. Also, some networks do internally use custom/pseudo TLDs.
Registration authorities might impose some extra, specific rules which are not explicitly supported in this regex. For example, .CO.UK and .ORG.UK names must have at least 3 characters, but less than 23, not including the extension. These kinds of rules are non-standard and subject to change. Do not implement them if you cannot maintain them.
Regular expressions are great, but they are not the most effective or performant solution to every problem. A native URL parser should be used instead whenever possible, e.g. Python's urlparse() function or PHP's parse_url() function.
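For example, a small sketch of using parse_url() to pull the host out of a full URL before running any domain check on it. The helper name extractHost is an assumption for illustration.

```php
<?php
// Hypothetical helper: returns the host component of a URL, or null
// when parse_url() cannot find one.
function extractHost(string $input): ?string
{
    $host = parse_url($input, PHP_URL_HOST);

    // parse_url() returns null when the component is missing (for example,
    // when the input has no scheme) and false for seriously malformed URLs.
    return is_string($host) ? $host : null;
}

var_dump(extractHost('http://www.domain.com/folder')); // string(14) "www.domain.com"
var_dump(extractHost('domain.com/subfolder'));         // NULL
```

Note that an input without a scheme is treated as a path, so no host is found. Depending on your input, you may need to handle bare domain names separately.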
After all, this is just a format validation. A regex test does not confirm that a domain name is actually configured/exists! You should test the existence by making a request.
UPDATE (2019-12-21): Fixed leading hyphen with subdomains.