问题
I have an URL like this:
https://user:password@example.com/path?key=value#hash
The result should be:
https://user:???@example.com/path?key=value#hash
I could use a regex, but instead I would like to parse the URL a high level data structure, then operate on this data structure, then serializing to a string.
Is this possible with Python?
回答1:
You can use the built in urlparse
to query out the password from a url. It is available in both Python 2 and 3, but under different locations.
Python 2 import urlparse
Python 3 from urllib.parse import urlparse
Example
from urllib.parse import urlparse
parsed = urlparse("https://user:password@example.com/path?key=value#hash")
parsed.password # 'password'
replaced = parsed._replace(netloc="{}:{}@{}".format(parsed.username, "???", parsed.hostname))
replaced.geturl() # 'https://user:???@example.com/path?key=value#hash'
See also this question: Changing hostname in a url
来源:https://stackoverflow.com/questions/46905367/redact-and-remove-password-from-url