How to validate domain name in PHP?

泄露秘密 提交于 2019-11-26 02:57:04

问题


Is it possible without using regular expression?

For example, I want to check that a string is a valid domain:

domain-name
abcd
example

Are valid domains. These are invalid of course:

domaia@name
ab$%cd

And so on. So basically it should start with an alphanumeric character, then there may be more alnum characters plus also a hyphen. And it must end with an alnum character, too.

If it\'s not possible, could you suggest me a regexp pattern to do this?

EDIT:

Why doesn\'t this work? Am I using preg_match incorrectly?

$domain = \'@djkal\';
$regexp = \'/^[a-zA-Z0-9][a-zA-Z0-9\\-\\_]+[a-zA-Z0-9]$/\';
if (false === preg_match($regexp, $domain)) {
    throw new Exception(\'Domain invalid\');
}

回答1:


<?php
function is_valid_domain_name($domain_name)
{
    return (preg_match("/^([a-z\d](-*[a-z\d])*)(\.([a-z\d](-*[a-z\d])*))*$/i", $domain_name) //valid chars check
            && preg_match("/^.{1,253}$/", $domain_name) //overall length check
            && preg_match("/^[^\.]{1,63}(\.[^\.]{1,63})*$/", $domain_name)   ); //length of each label
}
?>

Test cases:

is_valid_domain_name? [a]                       Y
is_valid_domain_name? [0]                       Y
is_valid_domain_name? [a.b]                     Y
is_valid_domain_name? [localhost]               Y
is_valid_domain_name? [google.com]              Y
is_valid_domain_name? [news.google.co.uk]       Y
is_valid_domain_name? [xn--fsqu00a.xn--0zwm56d] Y
is_valid_domain_name? [goo gle.com]             N
is_valid_domain_name? [google..com]             N
is_valid_domain_name? [google.com ]             N
is_valid_domain_name? [google-.com]             N
is_valid_domain_name? [.google.com]             N
is_valid_domain_name? [<script]                 N
is_valid_domain_name? [alert(]                  N
is_valid_domain_name? [.]                       N
is_valid_domain_name? [..]                      N
is_valid_domain_name? [ ]                       N
is_valid_domain_name? [-]                       N
is_valid_domain_name? []                        N



回答2:


With this you will not only be checking if the domain has a valid format, but also if it is active / has an IP address assigned to it.

$domain = "stackoverflow.com";

if(filter_var(gethostbyname($domain), FILTER_VALIDATE_IP))
{
    return TRUE;
}

Note that this method requires the DNS entries to be active so if you require a domain string to be validated without being in the DNS use the regular expression method given by velcrow above.

Also this function is not intended to validate a URL string use FILTER_VALIDATE_URL for that. We do not use FILTER_VALIDATE_URL for a domain because a domain string is not a valid URL.




回答3:


PHP 7

// Validate a domain name
var_dump(filter_var('mandrill._domainkey.mailchimp.com', FILTER_VALIDATE_DOMAIN));
# string(33) "mandrill._domainkey.mailchimp.com"

// Validate an hostname (here, the underscore is invalid)
var_dump(filter_var('mandrill._domainkey.mailchimp.com', FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME));
# bool(false)

It is not documented here: http://www.php.net/filter.filters.validate and a bug request for this is located here: https://bugs.php.net/bug.php?id=72013




回答4:


Firstly, you should clarify whether you mean:

  1. individual domain name labels
  2. entire domain names (i.e. multiple dot-separate labels)
  3. host names

The reason the distinction is necessary is that a label can technically include any characters, including the NUL, @ and '.' characters. DNS is 8-bit capable and it's perfectly possible to have a zone file containing an entry reading "an\0odd\.l@bel". It's not recommended of course, not least because people would have difficulty telling a dot inside a label from those separating labels, but it is legal.

However, URLs require a host name in them, and those are governed by RFCs 952 and 1123. Valid host names are a subset of domain names. Specifically only letters, digits and hyphen are allowed. Furthermore the first and last characters cannot be a hyphen. RFC 952 didn't permit a number for the first character, but RFC 1123 subsequently relaxed that.

Hence:

  • a - valid
  • 0 - valid
  • a- - invalid
  • a-b - valid
  • xn--dasdkhfsd - valid (punycode encoding of an IDN)

Off the top of my head I don't think it's possible to invalidate the a- example with a single simple regexp. The best I can come up with to check a single host label is:

if (preg_match('/^[a-z\d][a-z\d-]{0,62}$/i', $label) &&
   !preg_match('/-$/', $label))
{
    # label is legal within a hostname
}

To further complicate matters, some domain name entries (typically SRV records) use labels prefixed with an underscore, e.g. _sip._udp.example.com. These are not host names, but are legal domain names.




回答5:


use checkdnsrr http://php.net/manual/en/function.checkdnsrr.php

$domain = "stackoverflow.com";

checkdnsrr($domain , "A");

//returns true if has a dns A record, false otherwise



回答6:


I think once you have isolated the domain name, say, using Erklan's idea:

$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];

you could use :

if( false === filter_var( $myDomainName, FILTER_VALIDATE_URL ) ) {
// failed test

}

PHP5s Filter functions are for just such a purpose I would have thought.

It does not strictly answer your question as it does not use Regex, I realise.




回答7:


Here is another way without regex.

$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];
$ipAddress = gethostbyname($myDomainName);
if($ipAddress == $myDomainName)
{
   echo "There is no url";
}
else
{
   echo "url found";
}



回答8:


Regular expression is the most effective way of checking for a domain validation. If you're dead set on not using a Regular Expression (which IMO is stupid), then you could split each part of a domain:

  • www. / sub-domain
  • domain name
  • .extension

You would then have to check each character in some sort of a loop to see that it matches a valid domain.

Like I said, it's much more effective to use a regular expression.




回答9:


Your regular expression is fine, but you're not using preg_match right. It returns an int (0 or 1), not a boolean. Just write if(!preg_match($regex, $string)) { ... }




回答10:


If you don't want to use regular expressions, you can try this:

$str = 'domain-name';

if (ctype_alnum(str_replace('-', '', $str)) && $str[0] != '-' && $str[strlen($str) - 1] != '-') {
    echo "Valid domain\n";
} else {
    echo "Invalid domain\n";
}

but as said regexp are the best tool for this.




回答11:


If you want to check whether a particular domain name or ip address exists or not, you can also use checkdnsrr
Here is the doc http://php.net/manual/en/function.checkdnsrr.php




回答12:


A valid domain is for me something I'm able to register or at least something that looks like I could register it. This is the reason why I like to separate this from "localhost"-names.

And finally I was interested in the main question if avoiding Regex would be faster and this is my result:

<?php
function filter_hostname($name, $domain_only=false) {
    // entire hostname has a maximum of 253 ASCII characters
    if (!($len = strlen($name)) || $len > 253
    // .example.org and localhost- are not allowed
    || $name[0] == '.' || $name[0] == '-' || $name[ $len - 1 ] == '.' || $name[ $len - 1 ] == '-'
    // a.de is the shortest possible domain name and needs one dot
    || ($domain_only && ($len < 4 || strpos($name, '.') === false))
    // several combinations are not allowed
    || strpos($name, '..') !== false
    || strpos($name, '.-') !== false
    || strpos($name, '-.') !== false
    // only letters, numbers, dot and hypen are allowed
/*
    // a little bit slower
    || !ctype_alnum(str_replace(array('-', '.'), '', $name))
*/
    || preg_match('/[^a-z\d.-]/i', $name)
    ) {
        return false;
    }
    // each label may contain up to 63 characters
    $offset = 0;
    while (($pos = strpos($name, '.', $offset)) !== false) {
        if ($pos - $offset > 63) {
            return false;
        }
        $offset = $pos + 1;
    }
    return $name;
}
?>

Benchmark results compared with velcrow 's function and 10000 iterations (complete results contains many code variants. It was interesting to find the fastest.):

filter_hostname($domain);// $domains: 0.43556308746338 $real_world: 0.33749794960022
is_valid_domain_name($domain);// $domains: 0.81832790374756 $real_world: 0.32248711585999

$real_world did not contain extreme long domain names to produce better results. And now I can answer your question: With the usage of ctype_alnum() it would be possible to realize it without regex, but as preg_match() was faster I would prefer that.

If you don't like the fact that "local.host" is a valid domain name use this function instead that valids against a public tld list. Maybe someone finds the time to combine both.




回答13:


The correct answer is that you don't ... you let a unit tested tool do the work for you:

// return '' if host invalid --
private function setHostname($host = '')
{
    $ret = (!empty($host)) ? $host : '';
    if(filter_var('http://'.$ret.'/', FILTER_VALIDATE_URL) === false) {
        $ret = '';
    }
    return $ret;
}

further reading :https://www.w3schools.com/php/filter_validate_url.asp




回答14:


I know that this is an old question, but it was the first answer on a Google search, so it seems relevant. I recently had this same problem. The solution in my case was to just use the Public Suffix List:

https://publicsuffix.org/learn/

The suggested language specific libraries listed should all allow for easy validation of not just domain format, but also top level domain validity.




回答15:


If you can run shell commands, following is the best way to determine if a domain is registered.

This function returns false, if domain name isn't registered else returns domain name.

function get_domain_name($domain) { 
    //Step 1 - Return false if any shell sensitive chars or space/tab were found
    if(escapeshellcmd($domain)!=$domain || count(explode(".", $domain))<2 || preg_match("/[\s\t]/", $domain)) {
            return false;
    }

    //Step 2 - Get the root domain in-case of subdomain
    $domain = (count(explode(".", $domain))>2 ? strtolower(explode(".", $domain)[count(explode(".", $domain))-2].".".explode(".", $domain)[count(explode(".", $domain))-1]) : strtolower($domain));

    //Step 3 - Run shell command 'dig' to get SOA servers for the domain extension
    $ns = shell_exec(escapeshellcmd("dig +short SOA ".escapeshellarg(explode(".", $domain)[count(explode(".", $domain))-1]))); 

    //Step 4 - Return false if invalid extension (returns NULL), or take the first server address out of output
    if($ns===NULL) {
            return false;
    }
    $ns = (((preg_split('/\s+/', $ns)[0])[strlen(preg_split('/\s+/', $ns)[0])-1]==".") ? substr(preg_split('/\s+/', $ns)[0], 0, strlen(preg_split('/\s+/', $ns)[0])-1) : preg_split('/\s+/', $ns)[0]);

    //Step 5 - Run another dig using the obtained address for our domain, and return false if returned NULL else return the domain name. This assumes an authoritative NS is assigned when a domain is registered, can be improved to filter more accurately.
    $ans = shell_exec(escapeshellcmd("dig +noall +authority ".escapeshellarg("@".$ns)." ".escapeshellarg($domain))); 
    return (($ans===NULL) ? false : ((strpos($ans, $ns)>-1) ? false : $domain));
}

Pros

  1. Works on any domain, while php dns functions may fail on some domains. (my .pro domain failed on php dns)
  2. Works on fresh domains without any dns (like A) records
  3. Unicode friendly

Cons

  1. Usage of shell execution, probably



回答16:


<?php

if(is_valid_domain('https://www.google.com')==1){
  echo 'Valid';
}else{
   echo 'InValid';
}

 function is_valid_domain($url){

    $validation = FALSE;
    /*Parse URL*/    
    $urlparts = parse_url(filter_var($url, FILTER_SANITIZE_URL));

    /*Check host exist else path assign to host*/    
    if(!isset($urlparts['host'])){
        $urlparts['host'] = $urlparts['path'];
    }

    if($urlparts['host']!=''){
        /*Add scheme if not found*/        if (!isset($urlparts['scheme'])){
        $urlparts['scheme'] = 'http';
        }

        /*Validation*/        
    if(checkdnsrr($urlparts['host'], 'A') && in_array($urlparts['scheme'],array('http','https')) && ip2long($urlparts['host']) === FALSE){ 
        $urlparts['host'] = preg_replace('/^www\./', '', $urlparts['host']);
        $url = $urlparts['scheme'].'://'.$urlparts['host']. "/";            

            if (filter_var($url, FILTER_VALIDATE_URL) !== false && @get_headers($url)) {
                $validation = TRUE;
            }
        }
    }

    return $validation;

}
?>



回答17:


Check the php function checkdnsrr

function validate_email($email){

   $exp = "^[a-z\'0-9]+([._-][a-z\'0-9]+)*@([a-z0-9]+([._-][a-z0-9]+))+$";

   if(eregi($exp,$email)){

      if(checkdnsrr(array_pop(explode("@",$email)),"MX")){
        return true;
      }else{
        return false;
      }

   }else{

      return false;

   }   
}



回答18:


This is validation of domain name in javascript:

<script>
function frmValidate() {
 var val=document.frmDomin.name.value;
 if (/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/.test(val)){
      alert("Valid Domain Name");
      return true;
 } else {
      alert("Enter Valid Domain Name");
      val.name.focus();
      return false;
 }
}
</script>



回答19:


This is simple. Some php egnine has a problem with split(). This code below will work.

<?php
$email = "vladimiroliva@ymail.com"; 
$domain = strtok($email, "@");
$domain = strtok("@");
if (@getmxrr($domain,$mxrecords)) 
   echo "This ". $domain." EXIST!"; 
else 
   echo "This ". $domain." does not exist!"; 
?>



来源:https://stackoverflow.com/questions/1755144/how-to-validate-domain-name-in-php

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!