Why can't urllib2.urlopen open pages like "http://localhost/new-post#comment-29"?

Submitted by 人盡茶涼 on 2019-12-10 10:29:32

Question


I'm curious: why do I get a 404 error when I run this line?

urllib2.urlopen("http://localhost/new-post#comment-29")

Yet everything works fine when I open http://localhost/new-post#comment-29 in any browser.

Does the urlopen method not handle URLs that contain "#"?

Does anybody know?


Answer 1:


In HTTP, the fragment (everything from # onwards) is not sent to the server. The browser keeps it locally and, once the server's response has been fully received, uses it to locate the spot in the page to show as "current". If the returned page is HTML, for example, the browser parses it and looks for the first suitable anchor, such as an element whose id matches the fragment or an <a name="..."> tag. A 404 from urlopen therefore suggests that the whole URL, fragment included, was passed to the server, which has no resource under that name.

So, the procedure is: remove the fragment, e.g. via urlparse.urlparse (or urlparse.urldefrag); use the rest of the URL to fetch the resource; parse the response appropriately based on its Content-Type header; then locate the fragment you retained in the first step within the parsed resource, and take whatever action your program needs regarding that "current spot".
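The steps above could be sketched roughly as follows. This is only an illustration, and it makes several assumptions: it is written for Python 3, where urllib2 and urlparse have become urllib.request and urllib.parse; it assumes the response is HTML; and the names split_fragment, AnchorFinder, locate_fragment and fetch_and_locate are hypothetical helpers, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag
from urllib.request import urlopen


def split_fragment(url):
    """Return (url_without_fragment, fragment) for the given URL."""
    base, fragment = urldefrag(url)
    return base, fragment


class AnchorFinder(HTMLParser):
    """Remember the first tag whose id (or <a name>) equals the fragment."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.found = None  # (tag name, (line, column)) of the first match

    def handle_starttag(self, tag, attrs):
        if self.found is not None:
            return  # keep only the first match
        attrs = dict(attrs)
        matches_id = attrs.get("id") == self.fragment
        matches_name = tag == "a" and attrs.get("name") == self.fragment
        if matches_id or matches_name:
            self.found = (tag, self.getpos())


def locate_fragment(html, fragment):
    """Return (tag, position) of the fragment's target in html, or None."""
    finder = AnchorFinder(fragment)
    finder.feed(html)
    return finder.found


def fetch_and_locate(url):
    """Fetch the URL without its fragment, then look the fragment up locally."""
    base, fragment = split_fragment(url)
    with urlopen(base) as response:  # only the fragment-free URL is requested
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return locate_fragment(html, fragment)
```

For the URL in the question, split_fragment would give "http://localhost/new-post" as the part to request and "comment-29" as the part to look up afterwards. What to do with the located position (scroll, highlight, extract the comment) depends on the program.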



Source: https://stackoverflow.com/questions/3798422/why-urllib2-urlopen-can-not-open-pages-like-http-localhost-new-postcomment-2
