How can I get the base of a URL in Python?

情书的邮戳 2021-02-12 12:34

I'm trying to determine the base of a URL, or everything besides the page and parameters. I tried using split, but is there a better way than splitting it up into pieces? Is th

8 answers
  •  一生所求
    2021-02-12 13:22

    I agree that the best way to do this is with urllib.parse.
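    For orientation, here is a minimal sketch of the six components urlparse returns. The URL is a made-up example, extended with params, a query and a fragment only so that every field is populated.

```python
from urllib.parse import urlparse

# Hypothetical URL, chosen only so that every component is non-empty.
parsed = urlparse('http://127.0.0.1/asdf/login.php;v=1?user=me#top')

print(parsed.scheme)    # http
print(parsed.netloc)    # 127.0.0.1
print(parsed.path)      # /asdf/login.php
print(parsed.params)    # v=1
print(parsed.query)     # user=me
print(parsed.fragment)  # top
```

    The "base" is then just scheme and netloc, optionally followed by the directory part of path.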

    Specifically, you can decompose the URL with urllib.parse.urlparse and then replace every attribute other than scheme and netloc with an empty string. If you want to keep the directory part of the path (as in your question), you can do so with an extra string-splitting step that drops the last path segment. Example function below:

    import urllib.parse

    def base_url(url, with_path=False):
        parsed = urllib.parse.urlparse(url)
        # Drop the last path segment (the page) when keeping the path.
        path = '/'.join(parsed.path.split('/')[:-1]) if with_path else ''
        # Blank out everything except scheme, netloc and (optionally) path.
        parsed = parsed._replace(path=path, params='', query='', fragment='')
        return parsed.geturl()
    

    Examples:

    >>> base_url('http://127.0.0.1/asdf/login.php', with_path=True)
    'http://127.0.0.1/asdf'
    >>> base_url('http://127.0.0.1/asdf/login.php')
    'http://127.0.0.1'
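    As a possible alternative, the same idea can be sketched with urlsplit and urlunsplit, rebuilding the URL from only the parts you want. The name base_url_split is made up for this illustration, and using posixpath.dirname for the path step is my own assumption, not part of the function above.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def base_url_split(url, with_path=False):
    """Hypothetical variant of base_url built on urlsplit/urlunsplit."""
    parts = urlsplit(url)
    # Keep only the directory portion of the path when requested.
    path = posixpath.dirname(parts.path) if with_path else ''
    # Rebuild from scheme, netloc and path; query and fragment are left empty.
    return urlunsplit((parts.scheme, parts.netloc, path, '', ''))

print(base_url_split('http://127.0.0.1/asdf/login.php?user=me#top', with_path=True))
print(base_url_split('http://127.0.0.1/asdf/login.php?user=me#top'))
```

    One difference to be aware of: for a page directly under the root, such as /login.php, posixpath.dirname returns '/', so this variant keeps a trailing slash where the string-splitting version returns an empty path.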
    
