How to join absolute and relative urls?

谎友^ 2020-12-02 12:07

I have two urls:

url1 = "http://127.0.0.1/test1/test2/test3/test5.xml"
url2 = "../../test4/test6.xml"

How can I get an absolute url for

6 Answers
  • 2020-12-02 12:19
    >>> from urlparse import urljoin
    >>> url1 = "http://www.youtube.com/user/khanacademy"
    >>> url2 = "/user/khanacademy"
    >>> urljoin(url1, url2)
    'http://www.youtube.com/user/khanacademy'
    

    Simple.
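
The snippet above imports from `urlparse`, which only exists in Python 2. A minimal sketch of an import that should work on either version, using only the standard library (the variable names `base`, `relative` and `joined` are chosen here for illustration):

```python
# urljoin lives in urllib.parse on Python 3 and in urlparse on Python 2.
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2 fallback

base = "http://www.youtube.com/user/khanacademy"
relative = "/user/khanacademy"

# A reference with a leading slash replaces the whole path of the base.
joined = urljoin(base, relative)
print(joined)
```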

  • 2020-12-02 12:19

    You can use reduce to achieve Shikhar's method in a cleaner fashion.

    >>> import urllib.parse
    >>> from functools import reduce
    >>> reduce(urllib.parse.urljoin, ["http://moc.com/", "path1/", "path2/", "path3/"])
    'http://moc.com/path1/path2/path3/'
    

    Note that with this method, each fragment should have a trailing forward slash and no leading forward slash, which marks it as a path fragment to be joined. This is more explicit: it tells you that path1/ is a URI path fragment, not the full path /path1/ or an ambiguous path1, which could be either (urljoin treats it as a final segment, so the next join replaces it).

    If you need to add / to a fragment lacking it, you could do:

    uri = uri if uri.endswith("/") else f"{uri}/"
    

    To learn more about URI resolution, Wikipedia has some nice examples.
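
To make the trailing-slash rule concrete, here is a small sketch comparing the same `reduce` call with and without the slashes. `ensure_trailing_slash` is a hypothetical helper name wrapping the one-liner shown earlier in this answer; it is not part of `urllib`.

```python
from functools import reduce
from urllib.parse import urljoin

def ensure_trailing_slash(uri):
    """Hypothetical helper: mark a fragment as a directory."""
    return uri if uri.endswith("/") else f"{uri}/"

fragments = ["http://moc.com/", "path1", "path2", "path3"]

# Without trailing slashes, each join replaces the previous last segment.
missing = reduce(urljoin, fragments)

# With trailing slashes, each fragment is appended as a new directory.
fixed = reduce(urljoin, [ensure_trailing_slash(f) for f in fragments])

print(missing)  # http://moc.com/path3
print(fixed)    # http://moc.com/path1/path2/path3/
```

If the last fragment is a file name rather than a directory, it should be left without the trailing slash.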

    Update

    I just noticed that Peter Perron commented about reduce on Shikhar's answer, but I'll leave this here to demonstrate how that's done.

  • 2020-12-02 12:21

    In Python 2, you should use urlparse.urljoin:

    >>> import urlparse
    >>> urlparse.urljoin(url1, url2)
    'http://127.0.0.1/test1/test4/test6.xml'
    

    In Python 3 (where urlparse was renamed to urllib.parse), you can use it as follows:

    >>> import urllib.parse
    >>> urllib.parse.urljoin(url1, url2)
    'http://127.0.0.1/test1/test4/test6.xml'
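
To see why the result is `/test1/test4/test6.xml`, it may help to resolve the relative reference one step at a time. The sketch below uses the two URLs from the question; the intermediate variable names are only illustrative.

```python
from urllib.parse import urljoin

url1 = "http://127.0.0.1/test1/test2/test3/test5.xml"
url2 = "../../test4/test6.xml"

# "." is the directory that contains test5.xml.
current_dir = urljoin(url1, ".")
# Each ".." climbs one directory from there.
one_up = urljoin(url1, "..")
two_up = urljoin(url1, "../..")
# The remaining "test4/test6.xml" is appended to that directory.
result = urljoin(url1, url2)

print(current_dir)  # http://127.0.0.1/test1/test2/test3/
print(one_up)       # http://127.0.0.1/test1/test2/
print(two_up)       # http://127.0.0.1/test1/
print(result)       # http://127.0.0.1/test1/test4/test6.xml
```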
    
  • 2020-12-02 12:21

    If your relative path consists of multiple parts, you have to join them separately, since urljoin would replace the relative path, not join it. The easiest way to do that is to use posixpath.

    >>> import urllib.parse
    >>> import posixpath
    >>> url1 = "http://127.0.0.1"
    >>> url2 = "test1"
    >>> url3 = "test2"
    >>> url4 = "test3"
    >>> url5 = "test5.xml"
    >>> url_path = posixpath.join(url2, url3, url4, url5)
    >>> urllib.parse.urljoin(url1, url_path)
    'http://127.0.0.1/test1/test2/test3/test5.xml'
    

    See also: How to join components of a path when you are constructing a URL in Python
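
As a rough illustration of the "replace, not join" behaviour described in this answer, the sketch below joins the same parts both ways. It also shows one caveat worth knowing: `posixpath.join` discards everything before a part that starts with `/`. The names `base`, `parts`, `replaced`, `joined` and `absolute_part` are assumptions made for this example.

```python
import posixpath
from urllib.parse import urljoin

base = "http://127.0.0.1"
parts = ["test1", "test2", "test3", "test5.xml"]

# Chaining urljoin on bare parts: "test2" replaces "test1" instead of extending it.
replaced = urljoin(urljoin(base, "test1"), "test2")

# Building the path first keeps every part.
joined = urljoin(base, posixpath.join(*parts))

# Caveat: a part with a leading slash resets the path built so far.
absolute_part = posixpath.join("test1", "/test2")

print(replaced)       # http://127.0.0.1/test2
print(joined)         # http://127.0.0.1/test1/test2/test3/test5.xml
print(absolute_part)  # /test2
```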

  • 2020-12-02 12:28

    For Python 3, the usual way to join URLs is:

    from urllib.parse import urljoin
    urljoin('https://10.66.0.200/', '/api/org')
    # output : 'https://10.66.0.200/api/org'
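
Note that the leading slash in `/api/org` matters once the base URL has a path of its own. A short sketch, assuming a hypothetical `/v1/` prefix on the base that is not part of the original example:

```python
from urllib.parse import urljoin

# Hypothetical base with a path prefix, for illustration only.
base = "https://10.66.0.200/v1/"

# Leading slash: the reference is resolved from the host root, dropping "/v1/".
from_root = urljoin(base, "/api/org")

# No leading slash: the reference is resolved relative to "/v1/".
from_base = urljoin(base, "api/org")

print(from_root)  # https://10.66.0.200/api/org
print(from_base)  # https://10.66.0.200/v1/api/org
```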
    
  • 2020-12-02 12:29
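    from functools import reduce
    from urllib.parse import urljoin
    # Every part except the last needs a trailing slash, or urljoin replaces it.
    es = ['http://127.0.0.1/', 'test1/', 'test4/', 'test6.xml']
    reduce(urljoin, es)  # map() would join each part to base separately
    # 'http://127.0.0.1/test1/test4/test6.xml'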
    