How can I get the base of a URL in Python?

前端未结

关注

 8  2472

I\'m trying to determine the base of a URL, or everything besides the page and parameters. I tried using split, but is there a better way than splitting it up into pieces? Is th

相关标签:

8条回答

一生所求

2021-02-12 13:22

Agree that best way to do it is with urllib.parse

Specifically, you can decompose the url with urllib.parse.urlparse and then replace every attribute other than scheme and netloc with an empty string. If you want to keep the path attribute (as in your question), you can do so with an extra string parsing step. Example function below:

import urllib.parse
def base_url(url, with_path=False):
    parsed = urllib.parse.urlparse(url)
    path   = '/'.join(parsed.path.split('/')[:-1]) if with_path else ''
    parsed = parsed._replace(path=path)
    parsed = parsed._replace(params='')
    parsed = parsed._replace(query='')
    parsed = parsed._replace(fragment='')
    return parsed.geturl()

Examples:

>>> base_url('http://127.0.0.1/asdf/login.php', with_path=True)
'http://127.0.0.1/asdf'
>>> base_url('http://127.0.0.1/asdf/login.php')
'http://127.0.0.1'

0 讨论(0)

孤城傲影

2021-02-12 13:23

If you use python3, you can use urlparse and urlunparse.

In :from urllib.parse import urlparse, urlunparse

In :url = "http://127.0.0.1/asdf/login.php"

In :result = urlparse(url)

In :new = list(result)

In :new[2] = new[2].replace("login.php", "")

In :urlunparse(new)
Out:'http://127.0.0.1/asdf/'

0 讨论(0)

上一页 1 2