Overriding urllib2.HTTPError or urllib.error.HTTPError and reading response HTML anyway

逝去的感伤 2020-12-07 16:20

I receive an 'HTTP Error 500: Internal Server Error' response, but I still want to read the data inside the error HTML.

With Python 2.6, I normally fetch a page usi

3 Answers
  • 2020-12-07 16:47
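    import urllib2

    alist = ['http://someurl.com']

    def testUrl():
        errList = []
        for URL in alist:
            try:
                urllib2.urlopen(URL)
            except urllib2.HTTPError, err:
                # The server answered with an error status; record its code.
                errList.append(URL + " " + str(err.code))
            except urllib2.URLError, err:
                # No HTTP response at all (DNS failure, refused connection, ...).
                errList.append(URL + " " + str(err.reason))
        return "\n".join(errList)

    print testUrl()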
    
  • 2020-12-07 16:48

    If you mean you want to read the body of the 500:

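    import urllib2

    # url, data and headers are assumed to be defined by the caller.
    request = urllib2.Request(url, data, headers)
    try:
        resp = urllib2.urlopen(request)
        print resp.read()
    except urllib2.HTTPError, error:
        # The exception object carries the error response body.
        print "ERROR: ", error.read()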
    

    In your case, you don't need to build up the request. Just do:

    try:
        resp = urllib2.urlopen(url)
        print resp.read()
    except urllib2.HTTPError, error:
        print "ERROR: ", error.read()

    So you don't override urllib2.HTTPError; you just handle the exception.
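
The snippets in this answer are Python 2 (urllib2). For the urllib.error.HTTPError named in the title, the same catch-and-read idea might look roughly like the sketch below on Python 3. The helper name `fetch_body` and its `urlopen` parameter are hypothetical, introduced only for illustration; they do not come from the original answer.

```python
import urllib.error
import urllib.request


def fetch_body(url, urlopen=urllib.request.urlopen):
    """Return (status_code, body_bytes) for url, even on 4xx/5xx responses.

    `urlopen` is a parameter only so the function can be tried offline
    with a stand-in; by default the real urllib.request.urlopen is used.
    """
    try:
        with urlopen(url) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as error:
        # The server did answer, just with an error status. The exception
        # wraps that response, so its body can be read like any other.
        try:
            return error.code, error.read()
        finally:
            error.close()
```

Calling `status, body = fetch_body(url)` would then give the page content for a 500 as well as for a 200. The body is returned as bytes, so it still needs decoding before being treated as HTML text. Failures that produce no HTTP response at all (for example a DNS error) still raise urllib.error.URLError and are deliberately not caught here.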

  • 2020-12-07 17:00

    HTTPError is a file-like object, so you can catch it and read its contents just as you would read a normal response.

    try:
        resp = urllib2.urlopen(url)
        contents = resp.read()
    except urllib2.HTTPError, error:
        contents = error.read()
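
To illustrate the file-like behaviour without touching the network, here is a minimal Python 3 sketch that builds a urllib.error.HTTPError by hand and reads it. The `error_html` helper, the example URL, and the HTML body are all made up for the demonstration; a real error object would come from urlopen instead.

```python
import io
import urllib.error
from email.message import Message


def error_html(error, default_charset="utf-8"):
    """Read an HTTPError like a file and decode its body to text."""
    # Use the charset from the Content-Type header when the server sent one.
    charset = error.headers.get_content_charset() or default_charset
    return error.read().decode(charset, errors="replace")


# Offline stand-in for what urlopen would raise on a 500 response.
headers = Message()
headers["Content-Type"] = "text/html; charset=utf-8"
error = urllib.error.HTTPError(
    "http://example.invalid/",            # hypothetical URL
    500,
    "Internal Server Error",
    headers,
    io.BytesIO("<h1>Oops: caf\u00e9</h1>".encode("utf-8")),  # made-up body
)

html = error_html(error)
print(error.code, html)
```

Besides `read()`, the exception exposes `code` and `headers`, which is usually enough to decide how to interpret the body. Like any stream, it can only be read once, so keep the result if you need it again.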
    