Python requests isn't giving me the same HTML as my browser is

失恋的感觉 2021-01-31 18:56

I am grabbing a Wikia page using Python requests. There's a problem, though: the requests request isn't giving me the same HTML as my browser is with the very

6 answers
  •  孤街浪徒
    2021-01-31 19:19

    Maybe Requests and the browser handle the raw data from the web server differently, and the differences in the example above only show up in the rendered HTML.

    I found that when HTML is broken, different browsers, e.g. Chrome and Safari, repair it in different ways when parsing. So the same thing may be happening between Requests and Firefox.

    I suggest diffing the raw data from both Requests and Firefox, i.e. the byte stream on the socket. With Requests you can read it through the .raw attribute of the response object, provided the request was made with stream=True (http://docs.python-requests.org/en/master/user/quickstart/). If the raw data is the same on both sides and the HTML contains broken markup, the difference probably comes from how broken HTML gets repaired at parse time: the browser fixes it up in its own way, and whatever parser you run on the Requests output may fix it up differently.
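
To make the comparison above concrete, here is a minimal sketch. It assumes you have saved the browser's copy of the page as bytes (for example from the network panel); the helper names fetch_raw, first_difference and diff_lines are made up for illustration, and headers is just an optional way to send the same request headers as the browser.

```python
import difflib


def fetch_raw(url, headers=None):
    """Return the response body as bytes, before any text decoding."""
    # Third-party dependency; imported here so the diff helpers below
    # can be used even where requests is not installed.
    import requests

    # stream=True defers reading the body, so resp.raw is still readable.
    with requests.get(url, headers=headers, stream=True) as resp:
        # decode_content=True only undoes gzip/deflate compression;
        # no character decoding or HTML parsing happens here.
        return resp.raw.read(decode_content=True)


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One input is a prefix of the other.
        return min(len(a), len(b))
    return None


def diff_lines(a, b, a_name="requests", b_name="browser"):
    """Return a unified diff of two byte strings, compared line by line."""
    a_lines = a.decode("utf-8", errors="replace").splitlines()
    b_lines = b.decode("utf-8", errors="replace").splitlines()
    return list(
        difflib.unified_diff(a_lines, b_lines, a_name, b_name, lineterm="")
    )
```

Usage would look something like first_difference(fetch_raw(url), open("browser.html", "rb").read()), where browser.html is a hypothetical file holding the browser's copy. If that returns None, the server sent both clients the same bytes and any difference you see comes from parsing. Otherwise diff_lines shows where the two responses diverge.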
