urlparse.urlparse returning 3 '/' instead of 2 after scheme

Posted by 岁酱吖の on 2019-12-01 16:11:55

Short answer (but it's a bit tautological):

>>> urlparse.urlparse("http://www.example.com").geturl()
'http://www.example.com'

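In your example code, the hostname is parsed as a path, not a network location. urlparse only recognizes a netloc when it is introduced by "//"; otherwise it assumes the input is a relative URL that starts with a path component: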

>>> urlparse.urlparse("www.example.com/go")
ParseResult(scheme='', netloc='', path='www.example.com/go', params='', \
    query='', fragment='')

>>> urlparse.urlparse("http://www.example.com/go")
ParseResult(scheme='http', netloc='www.example.com', path='/go', params='', \
    query='', fragment='')

If you want to use urlparse the way you intended, the closest "correct" equivalent is to pass "//www.example.com" as the urlstring. Such a urlstring is unambiguously a network location without a scheme (a scheme-relative reference), so you can then supply "http" as the default scheme. One way to do this is to check whether your URL contains the string "//" and, if it does not, prepend "//" to it.
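The "//"-prepending idea can be sketched roughly as follows. This is only an illustration, not part of the original answer: the helper name parse_with_default_scheme and its default_scheme parameter are hypothetical, and the import fallback is an assumption so that the same code runs on Python 3 (urllib.parse) as well as on the Python 2 urlparse module used above.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as used in this answer


def parse_with_default_scheme(url, default_scheme="http"):
    """Hypothetical helper: parse `url`, treating a bare host as a netloc."""
    # Heuristic from the answer: if "//" does not appear anywhere, assume
    # the string starts with a host name and mark it as a network location.
    if "//" not in url:
        url = "//" + url
    # The second argument is only used when the URL carries no scheme itself.
    return urlparse(url, default_scheme)


result = parse_with_default_scheme("www.example.com/go")
print(result.netloc)    # www.example.com
print(result.geturl())  # http://www.example.com/go
```

Note that the check is deliberately crude: a scheme-less URL that happens to contain "//" later on (for example inside its query string) would not get the prefix, so a stricter test may be needed for untrusted input.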
