Python requests isn't giving me the same HTML as my browser is

失恋的感觉 2021-01-31 18:56

I am grabbing a Wikia page using Python requests. There's a problem, though: the requests request isn't giving me the same HTML as my browser is with the very

6 answers

孤街浪徒 (original poster)

2021-01-31 19:19

Maybe Requests and the browser handle the raw data from the web server differently, and the diff in the above example only shows up in the processed HTML.

I found that when HTML is broken, different browsers, e.g. Chrome and Safari, repair it in different ways when parsing. So maybe the same idea applies to Requests and Firefox.

I suggest diffing the raw data from both Requests and Firefox, i.e. the byte stream on the socket. With Requests you can read it through the .raw attribute of the response object, provided the request was made with stream=True. (http://docs.python-requests.org/en/master/user/quickstart/) If the raw data is the same on both sides and the HTML contains broken markup, the difference may come from the different auto-fixing policies applied to broken HTML on the Requests side and in the browser.
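A minimal sketch of that comparison, in case it helps. The function names, the 10-second timeout, and the Accept-Encoding header are my own assumptions, not something from the question; the browser side is assumed to be a copy of the response body that you saved yourself.

```python
import difflib


def fetch_body_bytes(url, headers=None):
    """Return the response body as bytes, before any content decoding."""
    # Imported here so diff_html() below also works without Requests installed.
    import requests

    # Assumption: ask for an uncompressed body so the bytes are directly
    # comparable with the browser copy (a server may ignore this header).
    request_headers = {"Accept-Encoding": "identity"}
    request_headers.update(headers or {})

    # stream=True leaves the body unread so that response.raw is still usable.
    response = requests.get(url, headers=request_headers, stream=True, timeout=10)
    response.raise_for_status()
    # decode_content=False skips gzip/deflate decoding of the body.
    return response.raw.read(decode_content=False)


def diff_html(requests_bytes, browser_bytes, encoding="utf-8"):
    """Return unified-diff lines between the two bodies (empty list if equal)."""
    left = requests_bytes.decode(encoding, errors="replace").splitlines()
    right = browser_bytes.decode(encoding, errors="replace").splitlines()
    return list(
        difflib.unified_diff(
            left, right, fromfile="requests", tofile="browser", lineterm=""
        )
    )
```

To use it, call fetch_body_bytes() with the page URL, read the browser's saved copy in binary mode, and pass both to diff_html(). The headers argument lets you send the same request headers the browser sent, so both sides ask the server for the same thing. An empty result means the server sent identical bytes, and any remaining difference is introduced after the download.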
