Is it possible to validate a domain name in PHP without using a regular expression?
For example, I want to check that a string is a valid domain:
domain-name
abcd
example
Are valid domains. These are invalid of course:
domaia@name
ab$%cd
And so on. So basically it should start with an alphanumeric character, then there may be more alnum characters plus also a hyphen. And it must end with an alnum character, too.
If it's not possible, could you suggest a regexp pattern to do this?
EDIT:
Why doesn't this work? Am I using preg_match incorrectly?
$domain = '@djkal';
$regexp = '/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/';
if (false === preg_match($regexp, $domain)) {
throw new Exception('Domain invalid');
}
<?php
function is_valid_domain_name($domain_name)
{
return (preg_match("/^([a-z\d](-*[a-z\d])*)(\.([a-z\d](-*[a-z\d])*))*$/i", $domain_name) //valid chars check
&& preg_match("/^.{1,253}$/", $domain_name) //overall length check
&& preg_match("/^[^\.]{1,63}(\.[^\.]{1,63})*$/", $domain_name) ); //length of each label
}
?>
Test cases:
is_valid_domain_name? [a] Y
is_valid_domain_name? [0] Y
is_valid_domain_name? [a.b] Y
is_valid_domain_name? [localhost] Y
is_valid_domain_name? [google.com] Y
is_valid_domain_name? [news.google.co.uk] Y
is_valid_domain_name? [xn--fsqu00a.xn--0zwm56d] Y
is_valid_domain_name? [goo gle.com] N
is_valid_domain_name? [google..com] N
is_valid_domain_name? [google.com ] N
is_valid_domain_name? [google-.com] N
is_valid_domain_name? [.google.com] N
is_valid_domain_name? [<script] N
is_valid_domain_name? [alert(] N
is_valid_domain_name? [.] N
is_valid_domain_name? [..] N
is_valid_domain_name? [ ] N
is_valid_domain_name? [-] N
is_valid_domain_name? [] N
With this you will not only be checking if the domain has a valid format, but also if it is active / has an IP address assigned to it.
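function domain_resolves($domain)
{
    // gethostbyname() returns the name unchanged when the lookup fails,
    // and that unchanged name then fails the IP filter
    return filter_var(gethostbyname($domain), FILTER_VALIDATE_IP) !== false;
}
$isActive = domain_resolves("stackoverflow.com");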
Note that this method requires the DNS entries to be active, so if you need to validate a domain string that is not in the DNS, use the regular expression method given by velcrow above.
Also, this function is not intended to validate a URL string; use FILTER_VALIDATE_URL for that. We do not use FILTER_VALIDATE_URL for a domain because a domain string is not a valid URL.
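To illustrate that last point, a bare domain fails FILTER_VALIDATE_URL because it has no scheme, while the same name with a scheme in front passes. This is only a minimal sketch, and example.com is a placeholder:

```php
<?php
// A bare domain has no scheme, so it is rejected as a URL.
$bare = filter_var('example.com', FILTER_VALIDATE_URL);
// With a scheme in front, the same host is accepted and returned unchanged.
$withScheme = filter_var('http://example.com', FILTER_VALIDATE_URL);

var_dump($bare);        // bool(false)
var_dump($withScheme);  // string(18) "http://example.com"
```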
PHP 7
// Validate a domain name
var_dump(filter_var('mandrill._domainkey.mailchimp.com', FILTER_VALIDATE_DOMAIN));
# string(33) "mandrill._domainkey.mailchimp.com"
// Validate a hostname (here, the underscore is invalid)
var_dump(filter_var('mandrill._domainkey.mailchimp.com', FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME));
# bool(false)
It is not documented here: http://www.php.net/filter.filters.validate
and a bug request for this is located here: https://bugs.php.net/bug.php?id=72013
Firstly, you should clarify whether you mean:
- individual domain name labels
- entire domain names (i.e. multiple dot-separated labels)
- host names
The reason the distinction is necessary is that a label can technically include any characters, including the NUL, @ and '.' characters. DNS is 8-bit capable and it's perfectly possible to have a zone file containing an entry reading "an\0odd\.l@bel". It's not recommended of course, not least because people would have difficulty telling a dot inside a label from those separating labels, but it is legal.
However, URLs require a host name in them, and those are governed by RFCs 952 and 1123. Valid host names are a subset of domain names. Specifically only letters, digits and hyphen are allowed. Furthermore the first and last characters cannot be a hyphen. RFC 952 didn't permit a number for the first character, but RFC 1123 subsequently relaxed that.
Hence:
- a - valid
- 0 - valid
- a- - invalid
- a-b - valid
- xn--dasdkhfsd - valid (punycode encoding of an IDN)
Off the top of my head I don't think it's possible to invalidate the a- example with a single simple regexp. The best I can come up with to check a single host label is:
if (preg_match('/^[a-z\d][a-z\d-]{0,62}$/i', $label) &&
!preg_match('/-$/', $label))
{
# label is legal within a hostname
}
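For what it's worth, the two checks can probably be folded into one pattern by making the tail of the label optional: a group that, if present, must end in a letter or digit. This is a sketch rather than a tested library routine, and the function name is_valid_host_label is made up for illustration.

```php
<?php
// Hypothetical helper: checks one host name label (no dots) against the
// letters/digits/hyphen rule of RFC 952 and RFC 1123.
function is_valid_host_label($label)
{
    // First character is alphanumeric. The optional group allows up to 61
    // alphanumerics or hyphens followed by one final alphanumeric, so the
    // total length is 1..63 and the label can never end in a hyphen.
    // \z (instead of $) also refuses a trailing newline.
    return preg_match('/^[a-z\d](?:[a-z\d-]{0,61}[a-z\d])?\z/i', $label) === 1;
}

foreach (array('a', '0', 'a-', 'a-b', 'xn--dasdkhfsd') as $label) {
    echo $label, ' - ', is_valid_host_label($label) ? 'valid' : 'invalid', "\n";
}
```

The loop prints the same verdicts as the list above: only a- is reported as invalid.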
To further complicate matters, some domain name entries (typically SRV records) use labels prefixed with an underscore, e.g. _sip._udp.example.com. These are not host names, but are legal domain names.
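If both kinds of name need to be accepted, one option is to make the leading underscore opt-in. The sketch below is an assumption about how that could look, not an established API: the function name is_valid_dns_name and its $allowUnderscore parameter are invented for this example, and it only checks syntax, not whether the name exists.

```php
<?php
// Hypothetical helper: validates a dot-separated name label by label.
// With $allowUnderscore = true, labels such as "_sip" are accepted too.
function is_valid_dns_name($name, $allowUnderscore = false)
{
    if ($name === '' || strlen($name) > 253) {
        return false;
    }
    // Optional leading underscore, then the usual host label rules.
    $prefix = $allowUnderscore ? '_?' : '';
    $pattern = '/^' . $prefix . '[a-z\d](?:[a-z\d-]{0,61}[a-z\d])?\z/i';
    foreach (explode('.', $name) as $label) {
        // The explicit length check keeps underscore labels within 63 bytes.
        if (strlen($label) > 63 || preg_match($pattern, $label) !== 1) {
            return false;
        }
    }
    return true;
}

var_dump(is_valid_dns_name('_sip._udp.example.com'));        // bool(false)
var_dump(is_valid_dns_name('_sip._udp.example.com', true));  // bool(true)
```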
Use checkdnsrr: http://php.net/manual/en/function.checkdnsrr.php
$domain = "stackoverflow.com";
checkdnsrr($domain , "A");
//returns true if has a dns A record, false otherwise
I think once you have isolated the domain name, say, using Erklan's idea:
$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName = $myParsedURL['host'];
you could use:
// A bare host name is not a valid URL, so prepend a scheme before validating
if (false === filter_var('http://' . $myDomainName, FILTER_VALIDATE_URL)) {
    // failed test
}
PHP 5's filter functions are meant for just such a purpose, I would have thought.
It does not strictly answer your question as it does not use Regex, I realise.
Here is another way without regex.
$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];
$ipAddress = gethostbyname($myDomainName);
if($ipAddress == $myDomainName)
{
echo "There is no url";
}
else
{
echo "url found";
}
A regular expression is the most effective way to validate a domain. If you're dead set on not using a regular expression (which IMO is stupid), then you could split the domain into its parts:
- www. / sub-domain
- domain name
- .extension
You would then have to check each character in some sort of a loop to see that it matches a valid domain.
Like I said, it's much more effective to use a regular expression.
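The character-by-character loop described above could look roughly like this. It is a sketch under simple assumptions (letters, digits and hyphens only, no length limits), and the function name is_valid_domain_by_loop is hypothetical:

```php
<?php
// Hypothetical helper: regex-free check of a dotted name, one character at a time.
function is_valid_domain_by_loop($domain)
{
    if ($domain === '') {
        return false;
    }
    foreach (explode('.', $domain) as $part) {
        $len = strlen($part);
        // Each part must be non-empty and must not start or end with a hyphen.
        if ($len === 0 || $part[0] === '-' || $part[$len - 1] === '-') {
            return false;
        }
        for ($i = 0; $i < $len; $i++) {
            // Accept letters and digits (as judged by ctype_alnum) plus the hyphen.
            if (!ctype_alnum($part[$i]) && $part[$i] !== '-') {
                return false;
            }
        }
    }
    return true;
}

var_dump(is_valid_domain_by_loop('www.domain-name.com'));  // bool(true)
var_dump(is_valid_domain_by_loop('ab$%cd.com'));           // bool(false)
```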
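Your regular expression is fine, but you're not using preg_match right. It returns an int (1 for a match, 0 for no match) and returns false only when an error occurs, so a strict comparison with false never catches a string that simply doesn't match. Just write if(!preg_match($regex, $string)) { ... }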
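A small sketch of the difference, reusing the pattern and the '@djkal' value from the question (the variable names $noMatch and $match are just for illustration):

```php
<?php
$regexp = '/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/';

$noMatch = preg_match($regexp, '@djkal');       // 0: the subject does not match
$match   = preg_match($regexp, 'domain-name');  // 1: the subject matches

// The test from the question compares strictly against false, so a 0 slips through.
var_dump(false === $noMatch);  // bool(false)
// Negating the result treats both 0 and false as "invalid".
var_dump(!$noMatch);           // bool(true)
```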
If you don't want to use regular expressions, you can try this:
$str = 'domain-name';
if (ctype_alnum(str_replace('-', '', $str)) && $str[0] != '-' && $str[strlen($str) - 1] != '-') {
echo "Valid domain\n";
} else {
echo "Invalid domain\n";
}
But, as said, regexps are the best tool for this.
If you want to check whether a particular domain name or IP address exists or not, you can also use checkdnsrr.
Here is the doc http://php.net/manual/en/function.checkdnsrr.php
A valid domain is for me something I'm able to register or at least something that looks like I could register it. This is the reason why I like to separate this from "localhost"-names.
Finally, I was interested in the main question of whether avoiding regex would be faster, and this is my result:
<?php
function filter_hostname($name, $domain_only=false) {
// entire hostname has a maximum of 253 ASCII characters
if (!($len = strlen($name)) || $len > 253
// .example.org and localhost- are not allowed
|| $name[0] == '.' || $name[0] == '-' || $name[ $len - 1 ] == '.' || $name[ $len - 1 ] == '-'
// a.de is the shortest possible domain name and needs one dot
|| ($domain_only && ($len < 4 || strpos($name, '.') === false))
// several combinations are not allowed
|| strpos($name, '..') !== false
|| strpos($name, '.-') !== false
|| strpos($name, '-.') !== false
// only letters, numbers, dot and hyphen are allowed
/*
// a little bit slower
|| !ctype_alnum(str_replace(array('-', '.'), '', $name))
*/
|| preg_match('/[^a-z\d.-]/i', $name)
) {
return false;
}
// each label may contain up to 63 characters
$offset = 0;
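while (($pos = strpos($name, '.', $offset)) !== false) {
if ($pos - $offset > 63) {
return false;
}
$offset = $pos + 1;
}
// the loop stops at the last dot, so check the final label separately
if ($len - $offset > 63) {
return false;
}
return $name;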
}
?>
Benchmark results compared with velcrow's function over 10000 iterations (the complete results contain many code variants; it was interesting to find the fastest):
filter_hostname($domain);// $domains: 0.43556308746338 $real_world: 0.33749794960022
is_valid_domain_name($domain);// $domains: 0.81832790374756 $real_world: 0.32248711585999
$real_world did not contain extremely long domain names, to produce more realistic results. And now I can answer your question: with ctype_alnum() it would be possible to do this without regex, but as preg_match() was faster I would prefer that.
If you don't like the fact that "local.host" is a valid domain name, use this function instead, which validates against a public TLD list. Maybe someone will find the time to combine both.
The correct answer is that you don't ... you let a unit tested tool do the work for you:
// return '' if host invalid --
private function setHostname($host = '')
{
$ret = (!empty($host)) ? $host : '';
if(filter_var('http://'.$ret.'/', FILTER_VALIDATE_URL) === false) {
$ret = '';
}
return $ret;
}
Further reading: https://www.w3schools.com/php/filter_validate_url.asp
I know that this is an old question, but it was the first answer on a Google search, so it seems relevant. I recently had this same problem. The solution in my case was to just use the Public Suffix List:
https://publicsuffix.org/learn/
The suggested language-specific libraries listed should all allow for easy validation of not just domain format, but also top-level domain validity.
If you can run shell commands, the following is the best way to determine whether a domain is registered.
This function returns false if the domain name isn't registered; otherwise it returns the domain name.
function get_domain_name($domain) {
//Step 1 - Return false if any shell sensitive chars or space/tab were found
if(escapeshellcmd($domain)!=$domain || count(explode(".", $domain))<2 || preg_match("/[\s\t]/", $domain)) {
return false;
}
//Step 2 - Get the root domain in-case of subdomain
$domain = (count(explode(".", $domain))>2 ? strtolower(explode(".", $domain)[count(explode(".", $domain))-2].".".explode(".", $domain)[count(explode(".", $domain))-1]) : strtolower($domain));
//Step 3 - Run shell command 'dig' to get SOA servers for the domain extension
$ns = shell_exec(escapeshellcmd("dig +short SOA ".escapeshellarg(explode(".", $domain)[count(explode(".", $domain))-1])));
//Step 4 - Return false if invalid extension (returns NULL), or take the first server address out of output
if($ns===NULL) {
return false;
}
$ns = (((preg_split('/\s+/', $ns)[0])[strlen(preg_split('/\s+/', $ns)[0])-1]==".") ? substr(preg_split('/\s+/', $ns)[0], 0, strlen(preg_split('/\s+/', $ns)[0])-1) : preg_split('/\s+/', $ns)[0]);
//Step 5 - Run another dig using the obtained address for our domain, and return false if returned NULL else return the domain name. This assumes an authoritative NS is assigned when a domain is registered, can be improved to filter more accurately.
$ans = shell_exec(escapeshellcmd("dig +noall +authority ".escapeshellarg("@".$ns)." ".escapeshellarg($domain)));
return (($ans===NULL) ? false : ((strpos($ans, $ns)>-1) ? false : $domain));
}
Pros
- Works on any domain, while php dns functions may fail on some domains. (my .pro domain failed on php dns)
- Works on fresh domains without any dns (like A) records
- Unicode friendly
Cons
- Usage of shell execution, probably
<?php
if(is_valid_domain('https://www.google.com')==1){
echo 'Valid';
}else{
echo 'InValid';
}
function is_valid_domain($url){
$validation = FALSE;
/*Parse URL*/
$urlparts = parse_url(filter_var($url, FILTER_SANITIZE_URL));
/*Check host exist else path assign to host*/
if(!isset($urlparts['host'])){
$urlparts['host'] = $urlparts['path'];
}
if($urlparts['host']!=''){
/*Add scheme if not found*/
if (!isset($urlparts['scheme'])){
$urlparts['scheme'] = 'http';
}
/*Validation*/
if(checkdnsrr($urlparts['host'], 'A') && in_array($urlparts['scheme'],array('http','https')) && ip2long($urlparts['host']) === FALSE){
$urlparts['host'] = preg_replace('/^www\./', '', $urlparts['host']);
$url = $urlparts['scheme'].'://'.$urlparts['host']. "/";
if (filter_var($url, FILTER_VALIDATE_URL) !== false && @get_headers($url)) {
$validation = TRUE;
}
}
}
return $validation;
}
?>
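Check the PHP function checkdnsrr:
function validate_email($email){
    // eregi() was removed in PHP 7; preg_match() with the i modifier replaces it
    $exp = "/^[a-z\'0-9]+([._-][a-z\'0-9]+)*@([a-z0-9]+([._-][a-z0-9]+))+$/i";
    if(!preg_match($exp, $email)){
        return false;
    }
    // array_pop() takes its argument by reference, so store the parts in a variable first
    $parts = explode("@", $email);
    // true only if the domain part has an MX record
    return checkdnsrr(array_pop($parts), "MX");
}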
Here is domain name validation in JavaScript:
<script>
function frmValidate() {
var val=document.frmDomin.name.value;
if (/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/.test(val)){
alert("Valid Domain Name");
return true;
} else {
alert("Enter Valid Domain Name");
document.frmDomin.name.focus(); // val is a string, so focus the form field itself
return false;
}
}
</script>
This is simple. Some PHP engines have a problem with split(). The code below will work.
<?php
$email = "vladimiroliva@ymail.com";
$domain = strtok($email, "@");
$domain = strtok("@");
if (@getmxrr($domain,$mxrecords))
echo "This ". $domain." EXIST!";
else
echo "This ". $domain." does not exist!";
?>
Source: https://stackoverflow.com/questions/1755144/how-to-validate-domain-name-in-php