How to validate a domain name using regex and PHP?

失恋的感觉 2020-12-03 05:27

I want a solution that validates only domain names, not full URLs. The following examples show what I'm looking for:

domain.com -> true
domain.net -> true
do         


        
6 Answers
  • 2020-12-03 05:39

    In my case, a domain name is considered valid if its format is stackoverflow.com or xxx.stackoverflow.com.

    So, in addition to the checks in the other answers, I also added a check for the www. prefix, which is rejected.

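    function isValidDomainName($domain) {
      // gethostbyname() returns the input unchanged when the lookup fails,
      // so the result only passes FILTER_VALIDATE_IP if the name resolved.
      if (filter_var(gethostbyname($domain), FILTER_VALIDATE_IP)) {
          // Reject names starting with "www." (the dot is escaped so it is literal).
          return !preg_match('/^www\./', $domain);
      }
      return FALSE;
    }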
    

    You can test the function with this code:

        $domain = array("http://www.domain.com","http://www.domain.com/folder" ,"http://domain.com", "www.domain.com", "domain.com/subfolder", "domain.com","sub.domain.com");
        foreach ($domain as $v) {
            echo isValidDomainName($v) ? "{$v} is valid<br>" : "{$v} is invalid<br>";
        }
    
  • 2020-12-03 05:44

    How about:

    ^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$
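
    As a rough sketch of how this pattern might be used from PHP, the snippet below wraps it in preg_match(). The function name matchesSimpleDomainPattern is hypothetical and only serves the illustration; the pattern itself is the one quoted above.

```php
<?php
// Hypothetical helper; the pattern is copied from the answer above.
function matchesSimpleDomainPattern(string $domain): bool
{
    // "~" is used as the delimiter so the pattern needs no extra escaping.
    return preg_match('~^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$~', $domain) === 1;
}

var_dump(matchesSimpleDomainPattern('domain.com'));        // bool(true)
var_dump(matchesSimpleDomainPattern('sub.domain.net'));    // bool(true)
var_dump(matchesSimpleDomainPattern('http://domain.com')); // bool(false)
var_dump(matchesSimpleDomainPattern('good.photography'));  // bool(false): TLD longer than 6 letters
var_dump(matchesSimpleDomainPattern('-domain.com'));       // bool(true): leading hyphen is not rejected
```

    The last two cases show the limits of this particular pattern: it caps the TLD at six letters and does not check where hyphens appear inside a label.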
    
  • 2020-12-03 05:44

    Please try this expression:

    ^(http[s]?\:\/\/)?((\w+)\.)?(([\w-]+)?)(\.[\w-]+){1,2}$
    

    What it actually does

    • optional http/s://
    • optional www
    • any valid alphanumeric name (including - and _)
    • 1 or 2 occurrences of any valid alphanumeric name (including - and _)

    Validation Examples

    • http://www.test.com
    • test.com.mt
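
    A minimal sketch of running the expression above through preg_match() follows. The function name matchesUrlOrDomainPattern is hypothetical; the pattern is unchanged from the answer.

```php
<?php
// Hypothetical helper; the pattern is copied from the answer above.
function matchesUrlOrDomainPattern(string $input): bool
{
    // "~" is the delimiter, so the escaped slashes in the pattern stay as written.
    $pattern = '~^(http[s]?\:\/\/)?((\w+)\.)?(([\w-]+)?)(\.[\w-]+){1,2}$~';
    return preg_match($pattern, $input) === 1;
}

var_dump(matchesUrlOrDomainPattern('http://www.test.com'));        // bool(true)
var_dump(matchesUrlOrDomainPattern('test.com.mt'));                // bool(true)
var_dump(matchesUrlOrDomainPattern('http://www.test.com/folder')); // bool(false): paths are not allowed
var_dump(matchesUrlOrDomainPattern('.com'));                       // bool(true): the name group is optional
```

    Because the scheme is optional rather than forbidden, this expression accepts full URLs without a path as well as bare domain names, and the optional name group means an input such as ".com" also passes.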
  • 2020-12-03 05:45

    Remember, regexes can only check to see if something is well formed. "www.idonotexistbecauseiammadeuponthespot.com" is well-formed, but doesn't actually exist... at the time of writing. ;) Furthermore, certain free web hosting providers (like Tripod) allow underscores in subdomains. This is clearly a violation of the RFCs, yet it sometimes works.

    Do you want to check if the domain exists? Try dns_get_record instead of (just) a regex.
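
    A minimal sketch of such an existence check, assuming that "exists" means "has at least one A or AAAA record". The helper name domainResolves is hypothetical, and the result depends on the resolver available to the machine running the code.

```php
<?php
// Hypothetical helper: true when DNS returns at least one A or AAAA record.
function domainResolves(string $domain): bool
{
    // dns_get_record() returns an array of records, or false on failure;
    // "@" silences the warning PHP raises when the lookup itself fails.
    $records = @dns_get_record($domain, DNS_A | DNS_AAAA);
    return is_array($records) && count($records) > 0;
}

// ".invalid" is a reserved TLD, so this name should never resolve.
var_dump(domainResolves('nonexistent.invalid'));
```

    A format check would normally run first, so that obviously malformed input never triggers a DNS query.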

  • 2020-12-03 05:50

    I made a function to validate the domain name without any regex.

    <?php
    function validDomain($domain) {
      // Ignore a trailing root dot, e.g. "google.com."
      $domain = rtrim($domain, '.');
      // Require at least one dot that is not the first character
      if (!mb_stripos($domain, '.')) {
        return false;
      }
      $domain = explode('.', $domain);
      $allowedChars = array('-');
      $extension = array_pop($domain);
      // Each label must be non-empty, must not start or end with a hyphen,
      // and must be alphanumeric once hyphens are removed
      foreach ($domain as $value) {
        $fc = mb_substr($value, 0, 1);
        $lc = mb_substr($value, -1);
        if (
          $value === ''
          || in_array($fc, $allowedChars)
          || in_array($lc, $allowedChars)
        ) {
          return false;
        }
        if (!ctype_alnum(str_replace($allowedChars, '', $value))) {
          return false;
        }
      }
      // The extension (TLD) must be non-empty and alphanumeric apart from hyphens
      if (
        !ctype_alnum(str_replace($allowedChars, '', $extension))
        || $extension === ''
      ) {
        return false;
      }
      return true;
    }
    $testCases = array(
      'a',
      '0',
      'a.b',
      'google.com',
      'news.google.co.uk',
      'xn--fsqu00a.xn--0zwm56d',
      'google.com ',
      'google.com.',
      'goo gle.com',
      'a.',
      'hey.hey',
      'google-.com',
      '-nj--9*.vom',
      ' ',
      '..',
      'google..com',
      'www.google.com',
      'www.google.com/some/path/to/dir/'
    );
    foreach ($testCases as $testCase) {
      var_dump($testCase);
      var_dump(validDomain($testCase));
      echo '<br /><br />';
    }
    ?>
    

    This code outputs:

    string(1) "a" bool(false)

    string(1) "0" bool(false)

    string(3) "a.b" bool(true)

    string(10) "google.com" bool(true)

    string(17) "news.google.co.uk" bool(true)

    string(23) "xn--fsqu00a.xn--0zwm56d" bool(true)

    string(11) "google.com " bool(false)

    string(11) "google.com." bool(true)

    string(11) "goo gle.com" bool(false)

    string(2) "a." bool(false)

    string(7) "hey.hey" bool(true)

    string(11) "google-.com" bool(false)

    string(11) "-nj--9*.vom" bool(false)

    string(1) " " bool(false)

    string(2) ".." bool(false)

    string(11) "google..com" bool(false)

    string(14) "www.google.com" bool(true)

    string(32) "www.google.com/some/path/to/dir/" bool(false)

    I hope I have covered everything. If I missed something, please tell me and I can improve this function. :)

  • 2020-12-03 06:04

    The accepted answer is incomplete/wrong.

    The regex pattern:

    • should NOT validate domains such as:
      -domain.com, domain--.com, -domain-.-.com, domain.000, etc...

    • should validate domains such as:
      schools.k12, newTLD.clothing, good.photography, etc...

    After some further research, below is the most correct, cross-language, and compact pattern I could come up with:

    ^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$
    

    This pattern conforms with most* of the rules defined in the specs:

    • Each label/level (separated by dots) may contain up to 63 characters.
    • The full domain name may have up to 127 levels.
    • The full domain name may not exceed the length of 253 characters in its textual representation.
    • Each label can consist of letters, digits and hyphens.
    • Labels cannot start or end with a hyphen.
    • The top-level domain (extension) cannot be all-numeric.

    Note 1: The full domain length check is not included in the regex. It should be simply checked by native methods e.g. strlen(domain) <= 253.
    Note 2: This pattern works with most languages, including PHP, JavaScript, Python, etc.
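
    The pattern and the separate length check from Note 1 can be combined roughly as follows. This is only a sketch: the constant DOMAIN_PATTERN and the function isWellFormedDomain are hypothetical names, and the pattern is copied verbatim from above.

```php
<?php
// Pattern copied verbatim from the answer; "~" is used as the delimiter.
const DOMAIN_PATTERN = '~^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$~';

// Hypothetical helper: format check plus the overall length limit from Note 1.
function isWellFormedDomain(string $domain): bool
{
    // The 253-character limit is not part of the regex, so check it first.
    if (strlen($domain) > 253) {
        return false;
    }
    return preg_match(DOMAIN_PATTERN, $domain) === 1;
}

var_dump(isWellFormedDomain('schools.k12'));      // bool(true)
var_dump(isWellFormedDomain('good.photography')); // bool(true)
var_dump(isWellFormedDomain('-domain.com'));      // bool(false)
var_dump(isWellFormedDomain('domain--.com'));     // bool(false)
var_dump(isWellFormedDomain('domain.000'));       // bool(false)
// 86 short labels pass the regex alone but exceed 253 characters in total.
var_dump(isWellFormedDomain(str_repeat('ab.', 85) . 'com')); // bool(false)
```

    One detail worth knowing: as written, the lookahead (?!\d+) rejects any TLD that begins with a digit, which is slightly stricter than the "not all-numeric" rule listed above.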

    See DEMO here (for JS, PHP, Python)

    More Info:

    • The regex above does not support IDNs.

    • There is no spec that says the extension (TLD) should be between 2 and 6 characters. It actually supports 63 characters. See the current TLD list here. Also, some networks do internally use custom/pseudo TLDs.

    • Registration authorities might impose extra, specific rules which are not explicitly supported in this regex. For example, .CO.UK and .ORG.UK names must have at least 3 characters, but fewer than 23, not including the extension. These kinds of rules are non-standard and subject to change. Do not implement them if you cannot maintain them.

    • Regular expressions are great, but they are not the most effective or performant solution to every problem. A native URL parser should be used instead whenever possible, e.g. Python's urlparse() method or PHP's parse_url() method.

    • After all, this is just a format validation. A regex test does not confirm that a domain name is actually configured/exists! You should test the existence by making a request.
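
    For the parse_url() suggestion in the list above, a minimal sketch of pulling the host out of user input before validating it might look like this. The helper name extractHost is hypothetical, and prefixing a placeholder scheme is an assumption made so that bare names such as "domain.com" are parsed as hosts.

```php
<?php
// Hypothetical helper: returns the host part of the input, or null if none is found.
function extractHost(string $input): ?string
{
    // parse_url() only reports a host when a scheme (or leading "//") is present,
    // so input without a scheme is given a placeholder one first.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $input)) {
        $input = 'http://' . $input;
    }
    $host = parse_url($input, PHP_URL_HOST);
    return is_string($host) ? $host : null;
}

var_dump(extractHost('http://www.domain.com/folder')); // string(14) "www.domain.com"
var_dump(extractHost('domain.com/subfolder'));         // string(10) "domain.com"
var_dump(extractHost('sub.domain.com'));               // string(14) "sub.domain.com"
```

    The extracted host can then be passed to whichever format check is in use, which keeps scheme and path handling out of the regex.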

    Specs & References:

    • IETF: RFC1035
    • IETF: RFC1123
    • IETF: RFC2181
    • IETF: RFC952
    • Wikipedia: Domain Name System

    UPDATE (2019-12-21): Fixed leading hyphen with subdomains.
