How can I create a percent-encoded URL from a complete URL?

孤人 submitted on 2019-12-12 03:37:04

Question


The input URL mixes ASCII and multi-byte characters, and I can't change the string. I would like to download this URL, but an error occurred: ordinal not in range(128).

import urllib.request

input_url = "http://sample.jp/api?v1=aaa&v2=日本語&v3=ccc"

def download(url):
    req = urllib.request.Request(url)
    resp = None
    try:
        resp = urllib.request.urlopen(req)
    except UnicodeEncodeError as e:
        print(e.reason)  # I had an error `ordinal not in range(128)`
    return resp

First, I tried urllib.parse.quote(), but the result was http%3a%2f%2fsample%2ejp%2fapi%3fv1%3daaa%26v2%3d%93%fa%96%7b%8c%ea%26v3%3dccc, and I then got another error: ValueError: unknown url type. How can I resolve this problem? Do you have any ideas?


Answer 1:


A combination of urllib and urlparse (the Python 2 modules) should do it for you. Note that this encodes the query string only, not the full URL:

>>> urllib.urlencode(urlparse.parse_qsl(urlparse.urlparse(input_url).query))
'v1=aaa&v2=%E6%97%A5%E6%9C%AC%E8%AA%9E&v3=ccc'
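In Python 3 the urlparse module was merged into urllib.parse, so the one-liner above could be ported roughly as follows. This is a minimal sketch, not the answerer's code: the helper name encode_query is hypothetical, and keep_blank_values=True is an added assumption so that empty parameters (for example v4=) are not silently dropped.

```python
from urllib.parse import urlparse, parse_qsl, urlencode


def encode_query(url):
    """Return only the percent-encoded query string of url (hypothetical helper)."""
    # parse_qsl() keeps the parameters in their original order as (key, value) pairs.
    pairs = parse_qsl(urlparse(url).query, keep_blank_values=True)
    # urlencode() encodes str values as UTF-8 by default before percent-encoding them.
    return urlencode(pairs)


input_url = "http://sample.jp/api?v1=aaa&v2=日本語&v3=ccc"
print(encode_query(input_url))
```

As in the original one-liner, the result is just the query part, so it still has to be joined back onto the scheme, host, and path before the URL can be opened.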



Answer 2:


You need to encode those parameters to UTF-8 bytes, and then the bytes to URL percent-encoding. You can do all this with the urllib.parse module:

from urllib.parse import urlparse, parse_qs, urlencode

parsed = urlparse(input_url)
query = parse_qs(parsed.query)
fixed_url = parsed._replace(query=urlencode(query, doseq=True)).geturl()

Demo:

>>> from urllib.parse import urlparse, parse_qs, urlencode
>>> input_url = "http://sample.jp/api?v1=aaa&v2=日本語&v3=ccc"
>>> parsed = urlparse(input_url)
>>> query = parse_qs(parsed.query)
>>> parsed._replace(query=urlencode(query, doseq=True)).geturl()
'http://sample.jp/api?v1=aaa&v2=%E6%97%A5%E6%9C%AC%E8%AA%9E&v3=ccc'
>>> import urllib.request
>>> urllib.request.urlopen(_)
<http.client.HTTPResponse object at 0x108f0f7b8>
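For context, the quote() attempt in the question fails because quote() treats only "/" as safe by default, so the ":" after the scheme is escaped as well and urlopen() no longer recognises a URL type. One possible alternative, sketched here under assumptions, is to call quote() on the complete URL with the URL delimiters marked as safe. The names SAFE_CHARS and percent_encode_url are hypothetical, and the choice of safe characters (the RFC 3986 reserved set plus "%") is an assumption, not something taken from the answers.

```python
from urllib.parse import quote

# Assumption: keep the RFC 3986 reserved characters literal so the URL structure
# survives, and keep "%" so existing percent-escapes are not encoded a second time.
SAFE_CHARS = ":/?#[]@!$&'()*+,;=%"


def percent_encode_url(url):
    """Percent-encode non-ASCII and other unsafe characters in a complete URL."""
    # quote() encodes str input as UTF-8 by default before percent-encoding.
    return quote(url, safe=SAFE_CHARS)


input_url = "http://sample.jp/api?v1=aaa&v2=日本語&v3=ccc"
print(percent_encode_url(input_url))
```

Unlike rebuilding only the query string, this sketch also encodes non-ASCII characters in the path. It does not handle non-ASCII host names, which would need IDNA encoding rather than percent-encoding.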


Source: https://stackoverflow.com/questions/36010484/how-can-i-create-a-percent-encoded-url-from-complete-url
