Dumping unicode with YAML

六月ゝ 毕业季﹏ submitted on 2020-01-24 19:49:05

Question


I'm creating YAML files from CSVs that contain a lot of Unicode characters, but I can't get the Unicode to dump without hitting a decode error.

I'm using the ruamel.yaml library.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 11: ordinal not in range(128)

I've tried plain strings, unicode strings, and encoding with "utf-8", but nothing seems to work. I've seen a lot of examples that add a representer to solve the issue, but they all seem to use the old ruamel API, and I can't find any documentation on how to do that with the newer API.

from ruamel.yaml import YAML

class YamlObject(YAML):
    def __init__(self):
        YAML.__init__(self)
        self.default_flow_style = False
        self.block_seq_indent = 2
        self.indent = 4
        self.allow_unicode = True

textDict = {"text": u"HELLO_WORLD©"}
textFile = "D:\\testFile.yml"
yaml = YamlObject()
yaml.dump(textDict, file(textFile, "w"))

I can call unicode() on the entire dict, and that works, but it doesn't give me the format I need.

What I need is just:

text: HELLO_WORLD©

How can I do that?


Answer 1:


You're missing encoding in your derived YAML object.

Try it like this:

from ruamel.yaml import YAML

class YamlObject(YAML):
    def __init__(self):
        YAML.__init__(self)
        self.default_flow_style = False
        self.block_seq_indent = 2
        self.indent = 4
        self.allow_unicode = True
        # the fix: set the output encoding explicitly
        self.encoding = 'utf-8'

If you look at the definition of your base class, YAML, you'll notice that encoding is not set by default:

self.encoding = None

and it stays None through YAML.dump() and YAML.dump_all(). In the module-level dump() function, by contrast, encoding defaults to utf-8 (in Python 2 only).
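
The byte and position in the question's traceback are what you would expect if the UTF-8 encoding of the string were decoded as ASCII somewhere along the way. Here is a minimal sketch of that, using only the standard library (no ruamel.yaml) and written for Python 3. It assumes nothing about where inside ruamel.yaml the implicit decode happens; it only reproduces the same message:

```python
# Same value as in the question; \u00a9 is the copyright sign.
text = u"HELLO_WORLD\u00a9"

# UTF-8 encodes the copyright sign as the two bytes 0xc2 0xa9.
encoded = text.encode("utf-8")

error = None
try:
    # Decoding those bytes as ASCII fails on the first non-ASCII byte.
    encoded.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

"HELLO_WORLD" is 11 characters long, so the first byte of the copyright sign (0xc2) sits at index 11, which matches the position reported in the traceback.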

Update: this is actually a bug in ruamel.yaml for Python 2 (thanks, @Anthon).



Source: https://stackoverflow.com/questions/45281596/dumping-unicode-with-yaml
