How can I get the base of a URL in Python?

情书的邮戳 2021-02-12 12:34

I'm trying to determine the base of a URL, or everything besides the page and parameters. I tried using split, but is there a better way than splitting it up into pieces? Is th

8 answers
  •  执笔经年
    2021-02-12 13:16

    When you use urlsplit, it returns a SplitResult object:

    from urllib.parse import urlsplit
    split_url = urlsplit('http://127.0.0.1/asdf/login.php')
    print(split_url)
    
    # >>> SplitResult(scheme='http', netloc='127.0.0.1', path='/asdf/login.php', query='', fragment='')
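
Since SplitResult is a named tuple, each component can be read by attribute. A minimal sketch, assuming a sample URL with a query and fragment added for illustration (the `origin` variable is a hypothetical name, not part of the library):

```python
from urllib.parse import urlsplit

# Sample URL; the query string and fragment are added here for illustration.
parts = urlsplit('http://127.0.0.1/asdf/login.php?q=1#top')

# Each component is available as a named attribute.
scheme = parts.scheme      # 'http'
netloc = parts.netloc      # '127.0.0.1'
path = parts.path          # '/asdf/login.php'
query = parts.query        # 'q=1'
fragment = parts.fragment  # 'top'

# Scheme plus host, without any path ("origin" is a hypothetical name).
origin = f"{scheme}://{netloc}"
print(origin)
```

If only the scheme and host are needed, this may already be enough without rebuilding the URL.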
    

    You can build your own SplitResult object and pass it to urlunsplit. This approach works for paths of any depth, as long as you know the last path element you want to keep.

    from urllib.parse import urlsplit, urlunsplit, SplitResult
    
    # splitting url:
    split_url = urlsplit('http://127.0.0.1/asdf/login.php')
    
    # editing the variables you want to change (in this case, path):    
    last_element = 'asdf'   # this can be any element in the path.
    path_array = split_url.path.split('/')
    
    # print(path_array)
    # >>> ['', 'asdf', 'login.php']
    
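    path_array.remove('')  # drop the empty string produced by the leading '/'
    ind = path_array.index(last_element)  # raises ValueError if the element is not in the path
    new_path = '/' + '/'.join(path_array[:ind+1]) + '/'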
    
    # making SplitResult() object with edited data:
    new_url = SplitResult(scheme=split_url.scheme, netloc=split_url.netloc, path=new_path, query='', fragment='')
    
    # unsplitting:
    base_url = urlunsplit(new_url)
    
    # print(base_url)
    # >>> http://127.0.0.1/asdf/
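
If you do not know the path element in advance and only want to drop the page name and parameters, a shorter variant may be enough. This is a sketch under that assumption, not the method from the answer above: `url_base` is a hypothetical helper name, and it treats everything after the last '/' in the path as the page.

```python
from urllib.parse import urlsplit, urlunsplit


def url_base(url: str) -> str:
    """Return the URL up to and including the last '/' of its path.

    Hypothetical helper for illustration; the query and fragment are dropped.
    """
    parts = urlsplit(url)
    # Keep everything before the final '/', then restore the trailing slash.
    directory = parts.path.rsplit('/', 1)[0] + '/'
    # _replace is the standard namedtuple method for copying with changes.
    return urlunsplit(parts._replace(path=directory, query='', fragment=''))


print(url_base('http://127.0.0.1/asdf/login.php?user=1'))
```

Note that a path without a trailing slash, such as '/asdf', is treated as a page, so its base would be the root.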
    
