Remove domain extension

情书的邮戳 2020-11-30 14:10

Let's say I have just-a.domain.com, just-a-domain.info and just.a-domain.net. How can I remove the extension (.com, .net, .info, ...) so that I get the result without it?

5 Answers
  • 2020-11-30 14:33

    Regex and parse_url() aren't the right solution here.

    You need a package that uses the Public Suffix List; only that way can you correctly extract domains with second- and third-level TLDs (co.uk, a.bg, b.bg, etc.). I recommend using TLD Extract.

    Here is an example:

    $extract = new LayerShifter\TLDExtract\Extract();
    
    $result = $extract->parse('just.a-domain.net');
    $result->getSubdomain(); // will return (string) 'just'
    $result->getHostname(); // will return (string) 'a-domain'
    $result->getSuffix(); // will return (string) 'net'
    $result->getRegistrableDomain(); // will return (string) 'a-domain.net'
    
  • 2020-11-30 14:36
    strrpos($str, ".")
    

    This gives you the index of the last period in the string; you can then pass that index to substr() to get the shortened string.
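
    The two calls can be combined as in the sketch below. The function name remove_last_label is made up for illustration, and the code only strips the final label, so a name ending in co.uk would lose just the uk part.

    ```php
    <?php
    // Hypothetical helper: drop everything from the last period onward.
    function remove_last_label(string $str): string
    {
        $pos = strrpos($str, '.');
        // strrpos() returns false when there is no period; leave such input untouched.
        if ($pos === false) {
            return $str;
        }
        return substr($str, 0, $pos);
    }

    echo remove_last_label('just-a.domain.com'), "\n"; // just-a.domain
    echo remove_last_label('just.a-domain.net'), "\n"; // just.a-domain
    ```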

  • 2020-11-30 14:38

    If you want to remove the part of the domain that is administered by domain name registrars, you will need a list of such suffixes, such as the Public Suffix List.

    But since walking through this list and testing each suffix against the domain name is not very efficient, use the list only to build an index like this:

    $tlds = array(
        // ac : http://en.wikipedia.org/wiki/.ac
        'ac',
        'com.ac',
        'edu.ac',
        'gov.ac',
        'net.ac',
        'mil.ac',
        'org.ac',
        // ad : http://en.wikipedia.org/wiki/.ad
        'ad',
        'nom.ad',
        // …
    );
    $tldIndex = array_flip($tlds);
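
    Instead of typing the suffixes by hand, the index could probably be built from the text of the list itself. The sketch below assumes the list's plain-text layout (one rule per line, comments starting with //) and, as a simplification, skips wildcard (*.) and exception (!) rules. The function name build_tld_index and the sample text are made up for illustration.

    ```php
    <?php
    // Hypothetical helper: build a suffix index from the text of a suffix list file.
    function build_tld_index(string $listText): array
    {
        $index = array();
        foreach (preg_split('/\R/', $listText) as $line) {
            $rule = trim($line);
            // Skip blank lines and comment lines.
            if ($rule === '' || strpos($rule, '//') === 0) {
                continue;
            }
            // Simplification: wildcard and exception rules are ignored in this sketch.
            if ($rule[0] === '*' || $rule[0] === '!') {
                continue;
            }
            $index[$rule] = true;
        }
        return $index;
    }

    // Sample text for illustration only, not the real list.
    $sample = "// ac\nac\ncom.ac\n\n// ad\nad\nnom.ad\n*.ck\n!www.ck\n";
    $tldIndex = build_tld_index($sample);
    var_dump(isset($tldIndex['com.ac'])); // bool(true)
    var_dump(isset($tldIndex['*.ck']));   // bool(false)
    ```

    The resulting array can be used with isset() in the same way as the array_flip() result.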
    

    Searching for the best match would then go like this:

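    $levels = explode('.', $domain);
    $n = count($levels);
    // Count how many trailing labels form a suffix that is present in the index.
    for ($length = 0; $length < $n; ++$length) {
        $candidate = implode('.', array_slice($levels, -($length + 1)));
        if (!isset($tldIndex[$candidate])) {
            break;
        }
    }
    // array_slice($levels, -0) would return every label, so treat "no match" separately.
    $suffix = $length > 0 ? implode('.', array_slice($levels, -$length)) : '';
    $prefix = $length > 0 ? implode('.', array_slice($levels, 0, -$length)) : $domain;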
    

    Or build a tree that represents the hierarchy of the domain name levels as follows:

    $tldTree = array(
        // ac : http://en.wikipedia.org/wiki/.ac
        'ac' => array(
            'com' => true,
            'edu' => true,
            'gov' => true,
            'net' => true,
            'mil' => true,
            'org' => true,
         ),
         // ad : http://en.wikipedia.org/wiki/.ad
         'ad' => array(
            'nom' => true,
         ),
         // …
    );
    

    Then you can use the following to find the match:

    $levels = explode('.', $domain);
    $r = &$tldTree;
    $length = 0;
    foreach (array_reverse($levels) as $level) {
        if (isset($r[$level])) {
            $r = &$r[$level];
            $length++;
        } else {
            break;
        }
    }
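    // array_slice($levels, -0) would return every label, so treat "no match" separately.
    $suffix = $length > 0 ? implode('.', array_slice($levels, -$length)) : '';
    $prefix = $length > 0 ? implode('.', array_slice($levels, 0, -$length)) : $domain;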
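
    For reuse, the tree lookup can be wrapped in a function. This is only a sketch: the name split_domain is made up, and the small tree holds sample entries for illustration rather than the real suffix list.

    ```php
    <?php
    // Hypothetical helper wrapping the tree lookup; returns array(prefix, suffix).
    function split_domain(string $domain, array $tldTree): array
    {
        $levels = explode('.', $domain);
        $node = $tldTree;
        $length = 0;
        // Walk the labels from right to left for as long as the tree has a branch.
        foreach (array_reverse($levels) as $level) {
            if (!is_array($node) || !isset($node[$level])) {
                break;
            }
            $node = $node[$level];
            $length++;
        }
        if ($length === 0) {
            // No known suffix: return the name unchanged with an empty suffix.
            return array($domain, '');
        }
        $suffix = implode('.', array_slice($levels, -$length));
        $prefix = implode('.', array_slice($levels, 0, -$length));
        return array($prefix, $suffix);
    }

    // Sample tree for illustration only.
    $tree = array(
        'com'  => true,
        'net'  => true,
        'info' => true,
        'uk'   => array('co' => true),
    );

    print_r(split_domain('just-a.domain.com', $tree)); // prefix 'just-a.domain', suffix 'com'
    print_r(split_domain('example.co.uk', $tree));     // prefix 'example', suffix 'co.uk'
    ```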
    
  • 2020-11-30 14:50
    preg_match('/(.*?)((?:\.co)?\.[a-z]{2,4})$/i', $domain, $matches);
    

    $matches[1] will contain the domain and $matches[2] will contain the extension.

    <?php
    
    $domains = array("google.com", "google.in", "google.co.in", "google.info", "analytics.google.com");
    
    foreach($domains as $domain){
      preg_match('/(.*?)((?:\.co)?\.[a-z]{2,4})$/i', $domain, $matches);
      print_r($matches);
    }
    ?>
    

    This produces the following output:

    Array
    (
        [0] => google.com
        [1] => google
        [2] => .com
    )
    Array
    (
        [0] => google.in
        [1] => google
        [2] => .in
    )
    Array
    (
        [0] => google.co.in
        [1] => google
        [2] => .co.in
    )
    Array
    (
        [0] => google.info
        [1] => google
        [2] => .info
    )
    Array
    (
        [0] => analytics.google.com
        [1] => analytics.google
        [2] => .com
    )
    
  • 2020-11-30 14:53
    $subject = 'just-a.domain.com';
    $result = preg_split('/(?=\.[^.]+$)/', $subject);
    

    This produces the following array:

    $result[0] == 'just-a.domain';
    $result[1] == '.com';
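
    The split works because the lookahead (?=\.[^.]+$) matches the empty position just before the last period without consuming any characters. A small sketch applying it to the domains from the question follows; the function name strip_last_extension is made up for illustration.

    ```php
    <?php
    // Hypothetical helper: return the part of the name before the last ".label".
    function strip_last_extension(string $subject): string
    {
        // The lookahead matches the empty position just before the final period,
        // so the string is cut there without losing any characters.
        $parts = preg_split('/(?=\.[^.]+$)/', $subject);
        return $parts[0];
    }

    foreach (array('just-a.domain.com', 'just-a-domain.info', 'just.a-domain.net') as $d) {
        echo strip_last_extension($d), "\n";
    }
    // Prints just-a.domain, just-a-domain and just.a-domain, one per line.
    ```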
    