urllib2.urlopen() vs urllib.urlopen() - urllib2 throws 404 while urllib works! WHY?

醉酒成梦 2020-12-29 06:41
import urllib

print urllib.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()

1 answer
  • 2020-12-29 07:29

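    That URL does indeed return a 404, but with a lot of HTML content in the response body. urllib2 (correctly) treats the 404 as an error condition and raises urllib2.HTTPError, whereas urllib.urlopen() does not raise on HTTP error statuses and simply hands back the body, which is why it appears to work. You can recover the content of that site's 404 page like so: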

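    import urllib2

    try:
        print urllib2.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()
    except urllib2.HTTPError as e:
        # The server answered with an HTTP error status (404 here).
        print e.code       # numeric status code
        print e.msg        # reason phrase
        print e.headers    # response headers
        print e.fp.read()  # body of the error page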
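
The same idea can be wrapped in a small helper that returns the status code and body whether or not the server reports an error. This is only a sketch under a few assumptions: `fetch_status_and_body` and its `opener` parameter are hypothetical names, not part of urllib2, and the import fallback assumes the code may also need to run on Python 3, where urllib2 was split into urllib.request and urllib.error.

```python
try:
    from urllib2 import urlopen, HTTPError  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent
    from urllib.error import HTTPError


def fetch_status_and_body(url, opener=urlopen):
    """Return (status_code, body) for url, including HTTP error responses.

    `opener` is a hypothetical hook so a stand-in can be supplied
    for testing; by default the real urlopen is used.
    """
    try:
        response = opener(url)
    except HTTPError as e:
        # HTTPError is itself a file-like response, so the body of the
        # error page is still available through read().
        return e.code, e.read()
    try:
        return response.getcode(), response.read()
    finally:
        response.close()
```

With this helper, a caller can branch on the returned status code instead of wrapping every request in its own try/except. For the URL in the question, the answer above suggests the result would be a 404 together with the HTML of the site's error page.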
    