How to decode and encode a web page with Python?

慢半拍i 2021-01-07 06:19

I use BeautifulSoup and urllib2 to download web pages, but different pages use different encodings, such as utf-8, gb2312, and gbk. I use urllib2 to get sohu's home page, w

3 answers

离开以前 (OP)

2021-01-07 06:33

Another solution.

from simplified_scrapy.request import req
from simplified_scrapy.simplified_doc import SimplifiedDoc

# req.get finds the correct encoding for you and returns the decoded HTML
html = req.get('http://www.sohu.com')
doc = SimplifiedDoc(html)
print(doc.title.text)
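
If you would rather stay with the standard library, here is a minimal sketch of the same idea, assuming Python 3 (where urllib2 became urllib.request). The names decode_html, fetch, META_CHARSET and SUPERSETS are made up for this example and are not part of any library. It tries the charset from the HTTP header first, then the one declared in a <meta> tag, then utf-8. It also assumes that pages labelled gb2312 or gbk can be decoded as gb18030, which is a superset of both.

```python
import codecs
import re
from typing import Optional
from urllib.request import urlopen

# Matches declarations such as <meta charset="gbk"> or
# <meta http-equiv="Content-Type" content="text/html; charset=gb2312">.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([\w.:-]+)', re.IGNORECASE
)

# Assumption: decode the narrower Chinese charsets with their superset.
SUPERSETS = {"gb2312": "gb18030", "gbk": "gb18030"}


def decode_html(raw: bytes, header_charset: Optional[str] = None) -> str:
    """Decode page bytes, trying header charset, meta charset, then utf-8."""
    candidates = []
    if header_charset:
        candidates.append(header_charset)
    match = META_CHARSET.search(raw[:4096])  # the declaration sits near the top
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.append("utf-8")

    for name in candidates:
        name = SUPERSETS.get(name.lower(), name)
        try:
            codecs.lookup(name)  # skip charset names Python does not know
            return raw.decode(name)
        except (LookupError, UnicodeDecodeError):
            continue
    # Last resort: keep going and mark undecodable bytes instead of failing.
    return raw.decode("utf-8", errors="replace")


def fetch(url: str) -> str:
    """Download a page and return it as text."""
    with urlopen(url, timeout=10) as resp:
        return decode_html(resp.read(), resp.headers.get_content_charset())
```

Calling fetch('http://www.sohu.com') should then give you a str that you can pass straight to BeautifulSoup. To write the page back out in a particular encoding, call .encode('utf-8') (or whichever charset you need) on the result.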
