Get subdomain from URL using Python

孤城傲影 2020-12-20 13:00

For example, the address is:

Address = http://lol1.domain.com:8888/some/page

8 answers
  • 2020-12-20 13:37

    tldextract separates the TLD from the registered domain and subdomains of a URL.

    Installation

    pip install tldextract
    

    For the current question:

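    import tldextract

    address = 'http://lol1.domain.com:8888/some/page'
    # extract() splits the hostname into subdomain, domain and suffix
    extracted = tldextract.extract(address)
    print("Extracted subdomain:", extracted.subdomain)
    print("Extracted domain name:", extracted.domain)


    The output:

    Extracted subdomain: lol1
    Extracted domain name: domain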
    

    In addition, there are a number of examples closely related to the usage of tldextract.extract.

  • 2020-12-20 13:38

    For extracting the hostname, I'd use urlparse from urllib.parse (in Python 2 it lives in the urlparse module):

    >>> from urllib.parse import urlparse
    >>> a = "http://lol1.domain.com:8888/some/page"
    >>> urlparse(a).hostname
    'lol1.domain.com'


    As for extracting the subdomain, you need to cover the case where the FQDN is longer. How you do this depends on your purposes. I might suggest stripping off the two rightmost components.

    E.g.

    >>> urlparse(a).hostname.rpartition('.')[0].rpartition('.')[0]
    'lol1'
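
    The "strip the two rightmost components" idea can be wrapped in a small helper. This is only a sketch: subdomain_of and its registered_labels parameter are hypothetical names, not part of urllib, and the code assumes you already know how many labels make up the registered domain (2 for domain.com, 3 for something like domain.co.uk). It does not consult the Public Suffix List and does not special-case IP addresses.

```python
from urllib.parse import urlparse


def subdomain_of(url, registered_labels=2):
    """Return the labels left of the registered domain, or '' if there are none.

    registered_labels is a hypothetical parameter: the number of rightmost
    labels assumed to form the registered domain (2 for domain.com).
    """
    hostname = urlparse(url).hostname  # None when the URL has no network location
    if not hostname:
        return ""
    labels = hostname.split(".")
    # Everything before the last `registered_labels` labels is the subdomain.
    return ".".join(labels[:-registered_labels])


print(subdomain_of("http://lol1.domain.com:8888/some/page"))  # lol1
print(subdomain_of("http://a.b.domain.com/"))                 # a.b
```

    If the suffix can vary (com vs. co.uk), guessing a fixed label count breaks down, which is the case a suffix-aware library such as tldextract is meant to handle.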
    