How can I get the base of a URL in Python?

情书的邮戳 2021-02-12 12:34

I'm trying to determine the base of a URL, or everything besides the page and parameters. I tried using split, but is there a better way than splitting it up into pieces? Is th

8 Answers
  • 2021-02-12 13:22

    Agreed, the best way to do this is with urllib.parse.

    Specifically, you can decompose the URL with urllib.parse.urlparse and then replace every attribute other than scheme and netloc with an empty string. If you want to keep the directory part of the path (as in your question), you can do so with one extra string-parsing step that drops the last path segment. Example function below:

    import urllib.parse

    def base_url(url, with_path=False):
        parsed = urllib.parse.urlparse(url)
        # Keep the path minus its last segment (the page), or drop the path entirely.
        path = '/'.join(parsed.path.split('/')[:-1]) if with_path else ''
        # Blank out every component except scheme, netloc and the chosen path.
        parsed = parsed._replace(path=path, params='', query='', fragment='')
        return parsed.geturl()
    

    Examples:

    >>> base_url('http://127.0.0.1/asdf/login.php', with_path=True)
    'http://127.0.0.1/asdf'
    >>> base_url('http://127.0.0.1/asdf/login.php')
    'http://127.0.0.1'
    
  • 2021-02-12 13:23

    If you use Python 3, you can use urlparse and urlunparse.

    In [1]: from urllib.parse import urlparse, urlunparse

    In [2]: url = "http://127.0.0.1/asdf/login.php"

    In [3]: result = urlparse(url)

    In [4]: new = list(result)

    In [5]: new[2] = new[2].replace("login.php", "")

    In [6]: urlunparse(new)
    Out[6]: 'http://127.0.0.1/asdf/'
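    The session above removes the literal string "login.php", so it only works when the page name is known in advance. As a possible alternative (a sketch, not the method used in this answer), resolving the relative reference '.' against the URL with urllib.parse.urljoin gives the same directory URL without naming the page. The helper name directory_url is hypothetical.

```python
from urllib.parse import urljoin

def directory_url(url):
    """Resolve '.' against url, which yields the URL of the containing directory."""
    # The relative reference '.' means "the current directory" of the base URL,
    # so the last path segment of the base is discarded.
    return urljoin(url, '.')

print(directory_url('http://127.0.0.1/asdf/login.php'))  # http://127.0.0.1/asdf/
```

    Note that the result keeps the trailing slash, matching the output of the urlunparse session.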
    