How to deal with a Unicode string in a URL in Python 3?

我寻月下人不归 2020-12-03 13:57
# -*- coding: utf-8 -*- 
# Python3
import urllib
import urllib.request as url_req
opener = url_req.build_opener()
url = 'http://zh.wikipedia.org/wiki/' + "毛泽东"
opene         


        
3 Answers
  • 2020-12-03 14:38

    You cannot use arbitrary Unicode strings as part of a URL. The URL must be properly percent-encoded. See here for details:

    http://www.w3schools.com/tags/ref_urlencode.asp

    In particular, in Python 3 you want to use the urllib.parse.quote() or urllib.parse.quote_plus() function to quote your URL properly (in Python 2 these were urllib.quote() and urllib.quote_plus()).

    http://docs.python.org/library/urllib.html
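
To make the difference between the two quoting functions concrete, here is a minimal sketch using only the standard library. The sample string "wiki/a b" is an illustrative assumption, not something from the question.

```python
from urllib.parse import quote, quote_plus

# quote() encodes the text as UTF-8 and percent-encodes each byte.
title = quote("毛泽东")

# By default quote() treats "/" as safe and encodes a space as %20,
# which suits the path part of a URL.
path = quote("wiki/a b")

# quote_plus() also encodes "/" and turns spaces into "+",
# which suits values placed in a query string.
query_value = quote_plus("wiki/a b")

print(title)
print(path)
print(query_value)
```

As a rule of thumb, quote() fits path segments and quote_plus() fits query-string values.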

  • 2020-12-03 14:40

    You could use urllib.parse.quote() to encode the path section of the URL.

    #!/usr/bin/env python3
    from urllib.parse   import quote
    from urllib.request import urlopen
    
    url = 'http://zh.wikipedia.org/wiki/' + quote("毛泽东")
    content = urlopen(url).read()
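
If the whole URL arrives as a single string, one possible approach is to split it, quote the path and query, and reassemble it. The sketch below assumes a hypothetical helper named encode_url (not part of urllib) and makes no network request. It keeps "%" as a safe character so that an already-encoded URL is not encoded twice, which also means a literal "%" in the input is left untouched. Non-ASCII host names are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode the path and query of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep "/" so the path structure survives, and "%" to avoid double-encoding.
    path = quote(parts.path, safe="/%")
    # Keep "=" and "&" so key=value pairs in the query stay intact.
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


encoded = encode_url("http://zh.wikipedia.org/wiki/毛泽东")
print(encoded)
```

The resulting string can then be passed to urlopen() in the same way as in the snippet above.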
    
  • 2020-12-03 14:51

    The fantastic requests library does this for you out of the box:

    >>> url='http://zh.wikipedia.org/wiki/'+"毛泽东"
    >>> import requests
    >>> r = requests.get(url)
    >>> len(r.content)
    818747
    