Cannot read urllib error message once it is read()

Submitted by 对着背影说爱祢 on 2019-12-13 01:28:00

Question


My problem is with error handling of the Python urllib error object. I cannot read the error message and still keep it intact in the error object so that it can be consumed later.

response = urllib.request.urlopen(request) # request that will raise an error
response.read()
response.read() # is empty now
# Also tried seek(0), that does not work either.

This is how I intend to use it, but when the exception bubbles up, the second .read() returns nothing.

try:
    response = urllib.request.urlopen(request)
except urllib.error.HTTPError as err:
    self.log.exception(err.read())
    raise err

I tried making a deepcopy of the err object,

import copy
try:
    response = urllib.request.urlopen(request)
except urllib.error.HTTPError as err:
    err_obj_copy = copy.deepcopy(err)
    self.log.exception(
        "Method:{}\n"
        "URL:{}\n"
        "Data:{}\n"
        "Details:{}\n"
        "Headers:{}".format(method, url, data, err_obj_copy.read(), headers))
    raise err

but copy.deepcopy() cannot copy it and raises an error: TypeError: __init__() missing 5 required positional arguments: 'url', 'code', 'msg', 'hdrs', and 'fp'.

How do I read the error message, while still keeping it intact in the object?

I do know how to do this with requests, but I am stuck with legacy code and need to make it work with urllib.


Answer 1:


This is what I did. Worked for me.

When reading the error for the first time, save its body to a variable, e.g. msg = err.read().decode('utf8'). You can then create a new HTTPError instance with that message and propagate it.

import io
from urllib.error import HTTPError
try:
    resp = urllib.request.urlopen(request)
except HTTPError as err:
    msg = err.read().decode('utf8')  # the body can only be read once
    self.log.exception(msg)
    raise HTTPError(err.url, err.code, err.reason, err.headers, io.BytesIO(msg.encode('utf8')))
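To try this without a network connection, here is a minimal sketch that builds an HTTPError by hand and rebuilds it around an in-memory buffer. The helper name rebuffer_http_error, the example URL, and the response body are all made up for illustration; only HTTPError and io.BytesIO come from the standard library.

```python
import io
from email.message import Message
from urllib.error import HTTPError


def rebuffer_http_error(err):
    """Read the body once and return (body, new_error).

    The new HTTPError reads from an in-memory buffer, so its body is
    still available to handlers further up the stack.
    """
    body = err.read()  # consumes the original stream
    new_err = HTTPError(err.url, err.code, err.reason, err.headers,
                        io.BytesIO(body))
    return body, new_err


# Offline stand-in for a failed request; the URL and body are invented.
original = HTTPError("http://example.invalid/api", 404, "Not Found",
                     Message(), io.BytesIO(b'{"error": "no such item"}'))

body, rebuilt = rebuffer_http_error(original)
leftover = original.read()  # the original stream is now exhausted
reread = rebuilt.read()     # the rebuilt error still holds the full body
```

In an except block you would log body and then raise the rebuilt error in place of the original one.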



Answer 2:


The error object may read from the network. A network stream is not seekable -- in the general case you can't go back.

You could replace err with a new HTTPError instance that reads from a buffer (like io.BytesIO()) instead of the network e.g., (not tested):

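import io
from urllib.error import HTTPError
# inside the `except HTTPError as err:` block:
content = err.read()  # bytes; the network stream is now exhausted
self.log.exception(content)
raise HTTPError(err.url, err.code, err.reason, err.headers, io.BytesIO(content))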

Though I'm not sure that you should. Handle the error in a single place instead, e.g., re-raise a more application-specific exception or leave the logging to an upstream handler.
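One possible shape for the application-specific exception is sketched below. ApiRequestError, fetch, and failing_opener are hypothetical names, not part of urllib. The idea is to read the stream exactly once and carry the bytes on the new exception, so upstream handlers never touch the stream.

```python
import io
import urllib.request
from email.message import Message
from urllib.error import HTTPError


class ApiRequestError(Exception):
    """Hypothetical application-level error that carries the response body."""

    def __init__(self, url, status, body):
        super().__init__(f"HTTP {status} for {url}")
        self.url = url
        self.status = status
        self.body = body  # plain bytes, safe to read any number of times


def fetch(request, opener=urllib.request.urlopen):
    """Return the response body, converting HTTPError into ApiRequestError."""
    try:
        with opener(request) as response:
            return response.read()
    except HTTPError as err:
        # Read the stream exactly once, here, and keep the bytes.
        raise ApiRequestError(err.url, err.code, err.read()) from err


def failing_opener(request):
    """Offline stand-in for urlopen that always fails with a 500."""
    raise HTTPError(request, 500, "Server Error", Message(),
                    io.BytesIO(b"backend unavailable"))


try:
    fetch("http://example.invalid/api", opener=failing_opener)
except ApiRequestError as exc:
    caught = exc
```

Because the original HTTPError is chained with "from err", it stays reachable as caught.__cause__, and a single upstream handler can log caught.body as often as it needs.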



Source: https://stackoverflow.com/questions/33660178/cannot-read-urllib-error-message-once-it-is-read
