How to remove scheme from url in Python?

I am working with an application that returns urls, written with Flask. I want the URL displayed to the user to be as clean as possible so I want t

3 Answers

闹比i

2021-01-12 17:11

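I've seen this done in Flask libraries and extensions. Worth noting you can do it, although it relies on the underscore-prefixed method (._replace) of the ParseResult/SplitResult. Despite the leading underscore, ._replace is a documented public method of named tuples; the underscore only avoids clashes with field names.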

```
from urllib.parse import urlsplit, urlunsplit

url = 'HtTp://stackoverflow.com/questions/tagged/python?page=2'
split_url = urlsplit(url)
# >>> SplitResult(scheme='http', netloc='stackoverflow.com', path='/questions/tagged/python', query='page=2', fragment='')
split_url_without_scheme = split_url._replace(scheme="")
# >>> SplitResult(scheme='', netloc='stackoverflow.com', path='/questions/tagged/python', query='page=2', fragment='')
new_url = urlunsplit(split_url_without_scheme)
# >>> '//stackoverflow.com/questions/tagged/python?page=2'
```
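
As a minimal sketch, the steps above could be wrapped in a small helper. The name strip_scheme_split and the keep_slashes flag are hypothetical (they are not part of urllib); the flag only controls whether the leading '//' that urlunsplit emits before the netloc is kept.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_scheme_split(url, keep_slashes=False):
    """Return url without its scheme (illustrative helper, not a urllib API)."""
    # _replace returns a new SplitResult with the scheme blanked out.
    parts = urlsplit(url)._replace(scheme="")
    result = urlunsplit(parts)
    # With a netloc present, urlunsplit still emits the '//' prefix.
    if not keep_slashes and result.startswith("//"):
        result = result[2:]
    return result


url = 'HtTp://stackoverflow.com/questions/tagged/python?page=2'
print(strip_scheme_split(url))
print(strip_scheme_split(url, keep_slashes=True))
```

Dropping the '//' gives the cleanest string for display, while keeping it leaves a scheme-relative URL that can still be parsed back into a netloc and a path.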

无人共我

2021-01-12 17:18
If you are handling these URLs programmatically, then rather than using a string replace I suggest having urlparse rebuild the URL without a scheme.

The ParseResult object is a tuple, so you can create another one that leaves out the fields you don't want.
```
# py2/3 compatibility
try:
    from urllib.parse import urlparse, ParseResult
except ImportError:
    from urlparse import urlparse, ParseResult


def strip_scheme(url):
    parsed_result = urlparse(url)
    return ParseResult('', *parsed_result[1:]).geturl()
```
You can remove any component of the ParseResult by replacing its value with an empty string.

It's important to note there is a functional difference between this answer and @Lukas Graf's answer. The most likely functional difference is that the '//' component of a URL isn't technically part of the scheme, so this answer preserves it, whereas @Lukas Graf's answer removes it.
```
>>> Lukas_strip_scheme('https://yoman/hi?whatup')
'yoman/hi?whatup'
>>> strip_scheme('https://yoman/hi?whatup')
'//yoman/hi?whatup'
```
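
To illustrate why the leading '//' matters, here is a short sketch that parses both results again with urlparse. The variable names are only for illustration; the two strings are the outputs shown above.

```python
from urllib.parse import urlparse

# Result that keeps the '//' prefix: 'yoman' is still recognised as the host.
with_slashes = urlparse('//yoman/hi?whatup')
# Result without the prefix: 'yoman' becomes part of the path.
without_slashes = urlparse('yoman/hi?whatup')

print(repr(with_slashes.netloc), repr(with_slashes.path))
print(repr(without_slashes.netloc), repr(without_slashes.path))
```

So the version without '//' reads better to a user, but it no longer round-trips through urlparse as a host plus path.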

醉话见心

2021-01-12 17:22
I don't think urlparse offers a single method or function for this. This is how I'd do it:
```
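# py2/3 compatibility
try:
    from urllib.parse import urlparse
except ImportError:
    from urlparse import urlparse

url = 'HtTp://stackoverflow.com/questions/tagged/python?page=2'

def strip_scheme(url):
    parsed = urlparse(url)
    # urlparse lowercases the scheme and geturl() rebuilds the URL from
    # the parsed parts, so the prefix below matches despite odd casing.
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, '', 1)

print(strip_scheme(url))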
```
Output:
```
stackoverflow.com/questions/tagged/python?page=2
```
If you used only simple string parsing, you'd have to deal with http[s], and possibly other schemes, yourself. This approach also handles odd casing of the scheme.
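
One edge case worth guarding against: if the input has no scheme at all, the prefix built above is just '://', and the replace could then hit a '://' that appears later in the URL (for example inside a query string). A hedged sketch of a guarded variant follows; the name strip_scheme_safe is hypothetical, and it only strips a prefix at the very start of the URL.

```python
from urllib.parse import urlparse


def strip_scheme_safe(url):
    """Strip a leading '<scheme>://' if present (illustrative helper)."""
    parsed = urlparse(url)
    if not parsed.scheme:
        # Nothing to strip; return the input untouched.
        return url
    prefix = parsed.scheme + "://"
    rebuilt = parsed.geturl()
    # Only remove the prefix when it really is at the start of the URL.
    if rebuilt.startswith(prefix):
        return rebuilt[len(prefix):]
    return rebuilt


print(strip_scheme_safe('HtTp://stackoverflow.com/questions/tagged/python?page=2'))
print(strip_scheme_safe('example.com/?next=http://x'))
```

URLs whose scheme is not followed by '//' (such as mailto: addresses) are returned unchanged by this sketch, which may or may not be what an application wants.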