I am grabbing a Wikia page using Python requests. There\'s a problem, though: the requests request isn\'t giving me the same HTML as my browser is with the very
Maybe Requests and Browsers use different ways to render the raw data from WEB server, and the diff in the above example are only with the rendered html.
I found that when html is broken, different browsers, e.g. Chrome and Safari, use different ways to fix when parsing. So maybe it is the same idea with Requests and Firefox.
From both Requests and Firefox I suggest to diff the raw data, i.e. the byte stream in socket. Requests can use .raw property of response object to get the raw data in socket. (http://docs.python-requests.org/en/master/user/quickstart/) If the raw data from both sides are same and there are some broken codes in HTML, maybe it is due to the different auto-fixing policies of Request and browser when parsing broken html.