Question
I am having issues with msgpack in Python. It seems that when serialising a dict whose keys are strings (str), the keys are not deserialised properly, which causes KeyError exceptions to be raised.
Example:
>>> import msgpack
>>> d = dict()
>>> value = 1234
>>> d['key'] = value
>>> binary = msgpack.dumps(d)
>>> new_d = msgpack.loads(binary)
>>> new_d['key']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'key'
This is because the keys are not strings after calling loads() but are deserialised to bytes objects.
>>> d.keys()
dict_keys(['key'])
>>> new_d.keys()
dict_keys([b'key'])
It seems this is related to an unimplemented feature, as mentioned on GitHub.
My question is: is there a way to fix this issue, or a workaround to ensure that the same keys can be used after deserialisation?
I would like to use msgpack, but if I cannot build a dict object with str keys and expect to be able to use the same keys after deserialisation, it becomes useless.
Answer 1:
A default encoding is set when calling dumps or packb:
:param str encoding:
| Convert unicode to bytes with this encoding. (default: 'utf-8')
but it is not set by default when calling loads or unpackb, as seen in:
Help on built-in function unpackb in module msgpack._unpacker:
unpackb(...)
unpackb(... encoding=None, ... )
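Therefore, passing an encoding when deserialising fixes the issue. (In msgpack 1.0 and later the encoding argument was removed; raw=False, which is now the default, decodes strings to str instead.) For example: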
>>> d['key'] = 1234
>>> binary = msgpack.dumps(d)
>>> msgpack.loads(binary, encoding = "utf-8")
{'key': 1234}
>>> msgpack.loads(binary, encoding = "utf-8") == d
True
Answer 2:
Try the following:
import msgpack

def c_msgpackloads(data):
    # Decode bytes keys and values of the loaded dict to UTF-8 strings.
    new_d = msgpack.loads(data)
    return {(k.decode('utf-8') if isinstance(k, bytes) else k):
            (v.decode('utf-8') if isinstance(v, bytes) else v)
            for k, v in new_d.items()}
It's a custom loading function that loads the dict and automatically decodes bytes keys and values into UTF-8 strings.
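The function above only converts the top level of the dict, so bytes inside nested dicts or lists would stay as bytes. A minimal sketch of a recursive variant follows. The name decode_bytes and the sample data are hypothetical, and the sample dict is a hand-written stand-in for what msgpack.loads() returned in the question, so the snippet runs without msgpack installed.

```python
def decode_bytes(obj, encoding="utf-8"):
    """Recursively convert bytes to str inside dicts, lists and tuples."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        # Decode both keys and values, descending into nested containers.
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type with decoded items.
        return type(obj)(decode_bytes(item, encoding) for item in obj)
    # Numbers, None, str and anything else pass through unchanged.
    return obj


# Stand-in for the bytes-keyed result of msgpack.loads(binary) (assumed shape).
loaded = {b'key': 1234, b'nested': {b'name': b'value', b'items': [b'a', 2]}}
decoded = decode_bytes(loaded)
print(decoded['key'])
print(decoded['nested']['items'])
```

Note that this decodes every bytes object it finds, so it is only suitable when the payload contains no genuine binary data; bytes that are not valid UTF-8 would raise UnicodeDecodeError.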
Source: https://stackoverflow.com/questions/48319949/msgpack-unserialising-dict-key-strings-to-bytes