Parsing domain from a URL

独厮守ぢ 2020-11-22 12:26

I need to build a function which parses the domain from a URL.

So, with

http://google.com/dhasjkdas/sadsdds/sdda/sdads.html

or

18 Answers
  • 2020-11-22 12:34

    From http://us3.php.net/manual/en/function.parse-url.php#93983

    For some odd reason, parse_url returns the host (e.g. example.com) as the path when no scheme is provided in the input URL. So I've written a quick function to get the real host:

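    function getHost($Address) {
        $parseUrl = parse_url(trim($Address));
        // Without a scheme, parse_url() puts the host in 'path'; take the text before the first '/'.
        $pathParts = explode('/', $parseUrl['path'] ?? '', 2);
        return trim(!empty($parseUrl['host']) ? $parseUrl['host'] : $pathParts[0]);
    }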
    
    getHost("example.com"); // Gives example.com 
    getHost("http://example.com"); // Gives example.com 
    getHost("www.example.com"); // Gives www.example.com 
    getHost("http://example.com/xyz"); // Gives example.com 
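
    The behaviour described above can be checked directly. One alternative, shown here only as a minimal sketch, is to prepend a placeholder scheme before parsing so that parse_url fills in the host itself. `hostFromUrl` is a hypothetical helper name, not part of PHP, and the sketch assumes PHP 7.1 or later.

    ```php
    <?php
    // Hypothetical helper: returns the host part of $address, or null if none is found.
    function hostFromUrl(string $address): ?string
    {
        $address = trim($address);
        // Without a scheme, parse_url() treats "example.com/xyz" as a path,
        // so prepend a placeholder scheme first.
        if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $address)) {
            $address = 'http://' . ltrim($address, '/');
        }
        $host = parse_url($address, PHP_URL_HOST);
        return is_string($host) && $host !== '' ? $host : null;
    }

    var_dump(parse_url('example.com/xyz', PHP_URL_HOST)); // NULL: the text ends up in 'path'
    echo hostFromUrl('example.com/xyz'), "\n";            // example.com
    echo hostFromUrl('http://example.com/xyz'), "\n";     // example.com
    ```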
    
  • 2020-11-22 12:34
    $domain = parse_url($url, PHP_URL_HOST);
    echo implode('.', array_slice(explode('.', $domain), -2, 2));
    
  • 2020-11-22 12:36

    I have edited it for you:

    function getHost($Address) {
        $parseUrl = parse_url(trim($Address));
        // Without a scheme, parse_url() puts the host in 'path'.
        $pathParts = explode('/', $parseUrl['path'] ?? '', 2);
        $host = trim(!empty($parseUrl['host']) ? $parseUrl['host'] : $pathParts[0]);

        $parts = explode('.', $host);

        // Drop a leading "www" label; all other labels are kept.
        if ($parts[0] == "www") {
            array_shift($parts);
        }
        return implode('.', $parts);
    }
    

    A leading www. is removed, so www.domain.ltd becomes domain.ltd. Other subdomains, such as sub1.subn.domain.ltd, are returned unchanged.
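
    If the host has already been extracted, the www-stripping step on its own could be written more compactly. This is only a sketch: `stripWww` is a hypothetical name, and unlike the function above it matches "www" case-insensitively.

    ```php
    <?php
    // Hypothetical helper: removes one leading "www." label, case-insensitively.
    function stripWww(string $host): string
    {
        // preg_replace() returns null only on a regex error; fall back to the input.
        return preg_replace('/^www\./i', '', $host) ?? $host;
    }

    echo stripWww('www.domain.ltd'), "\n";       // domain.ltd
    echo stripWww('sub1.subn.domain.ltd'), "\n"; // sub1.subn.domain.ltd (unchanged)
    ```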

  • 2020-11-22 12:37

    You can pass PHP_URL_HOST to the parse_url function as its second parameter:

    $url = 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html';
    $host = parse_url($url, PHP_URL_HOST);
    print $host; // prints 'google.com'
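
    For comparison, here is a short sketch of what parse_url returns with and without the component argument, using the same URL. The variable name `$parts` is just for illustration.

    ```php
    <?php
    $url = 'http://google.com/dhasjkdas/sadsdds/sdda/sdads.html';

    // Without a second argument, parse_url() returns an associative array.
    $parts = parse_url($url);
    echo $parts['scheme'], "\n"; // http
    echo $parts['host'], "\n";   // google.com
    echo $parts['path'], "\n";   // /dhasjkdas/sadsdds/sdda/sdads.html

    // With a component constant it returns just that piece, or null if absent.
    var_dump(parse_url($url, PHP_URL_QUERY)); // NULL, this URL has no query string
    ```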
    
  • 2020-11-22 12:38

    Here is the code I wrote to find only the domain name. It takes Mozilla's list of sub-TLDs into account. The only thing you have to work out is how to cache that file, so you don't query Mozilla every time.

    For some strange reason, domains like co.uk are not picked up from the list, so you have to hack around it and add them manually. It's not the cleanest solution, but I hope it helps someone.

    //=====================================================
    static function domain($url)
    {
        $slds = "";
        $url = strtolower($url);
    
        $address = 'http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1';
        if(!$subtlds = @kohana::cache('subtlds', null, 60)) 
        {
            $content = file($address);
            foreach($content as $num => $line)
            {
                $line = trim($line);
                if($line == '') continue;
                if(@substr($line[0], 0, 2) == '/') continue;
                $line = @preg_replace("/[^a-zA-Z0-9\.]/", '', $line);
                if($line == '') continue;  //$line = '.'.$line;
                if(@$line[0] == '.') $line = substr($line, 1);
                if(!strstr($line, '.')) continue;
                $subtlds[] = $line;
                //echo "{$num}: '{$line}'"; echo "<br>";
            }
            $subtlds = array_merge(Array(
                'co.uk', 'me.uk', 'net.uk', 'org.uk', 'sch.uk', 'ac.uk', 
                'gov.uk', 'nhs.uk', 'police.uk', 'mod.uk', 'asn.au', 'com.au',
                'net.au', 'id.au', 'org.au', 'edu.au', 'gov.au', 'csiro.au',
                ),$subtlds);
    
            $subtlds = array_unique($subtlds);
            //echo var_dump($subtlds);
            @kohana::cache('subtlds', $subtlds);
        }
    
    
        // Accept both http:// and https:// prefixes (or none) before the host.
        preg_match('/^(https?:[\/]{2,})?([^\/]+)/i', $url, $matches);
        //preg_match("/^(http:\/\/|https:\/\/|)[a-zA-Z-]([^\/]+)/i", $url, $matches);
        $host = @$matches[2];
        //echo var_dump($matches);
    
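        // Default to the last two labels; use three when the host ends in a known sub-TLD.
        preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
        foreach($subtlds as $sub) 
        {
            // Escape the suffix and anchor it on a label boundary.
            if (preg_match('/\.' . preg_quote($sub, '/') . '$/', $host)) {
                preg_match("/[^\.\/]+\.[^\.\/]+\.[^\.\/]+$/", $host, $matches);
            }
        }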
    
        return @$matches[0];
    }
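
    The suffix-matching idea above can also be tried without the network or a cache layer. The following is a minimal sketch, not the code above: `registrableDomain` is a hypothetical helper, the two-entry `$suffixes` array is a stand-in for the real list, and any TLD not covered by the list is assumed to be a single label.

    ```php
    <?php
    // Hypothetical helper: $suffixes is a list of multi-label public suffixes
    // such as "co.uk"; anything not listed is treated as a single-label TLD.
    function registrableDomain(string $host, array $suffixes): string
    {
        $labels = explode('.', strtolower(trim($host, '.')));
        $keep = 2; // default: last two labels, e.g. example.com

        foreach ($suffixes as $suffix) {
            $suffixLabels = explode('.', $suffix);
            $n = count($suffixLabels);
            // Compare whole labels so "co.uk" does not match "foo-co.uk".
            if (count($labels) > $n && array_slice($labels, -$n) === $suffixLabels) {
                $keep = max($keep, $n + 1);
            }
        }
        return implode('.', array_slice($labels, -$keep));
    }

    $suffixes = ['co.uk', 'com.au']; // tiny sample, not the real list
    echo registrableDomain('www.example.co.uk', $suffixes), "\n"; // example.co.uk
    echo registrableDomain('sub.example.com', $suffixes), "\n";   // example.com
    ```

    Comparing label arrays instead of building a regex per suffix avoids the escaping problems that come with interpolating list entries into a pattern.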
    
  • 2020-11-22 12:41

    To get the host name of the server handling the current request, just use the following:

    <?php
       echo $_SERVER['SERVER_NAME'];
    ?>
    