Question
I'm trying to use the ruamel.yaml library to process a YAML document that contains duplicate keys. In this case the duplicate key happens to be a merge key, <<:.
This is the YAML file, dupe.yml:
foo: &ref1
  a: 1
bar: &ref2
  b: 2
baz:
  <<: *ref1
  <<: *ref2
  c: 3
This is my script:
import ruamel.yaml
yml = ruamel.yaml.YAML()
yml.allow_duplicate_keys = True
doc = yml.load(open('dupe.yml'))
assert doc['baz']['a'] == 1
assert doc['baz']['b'] == 2
assert doc['baz']['c'] == 3
When run, it raises this error:
Traceback (most recent call last):
  File "rua.py", line 5, in <module>
    yml.load(open('dupe.yml'))
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 331, in load
    return constructor.get_single_data()
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 111, in get_single_data
    return self.construct_document(node)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 121, in construct_document
    for _dummy in generator:
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1543, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1448, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 174, in construct_object
    for _dummy in generator:
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1543, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1399, in construct_mapping
    merge_map = self.flatten_mapping(node)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1350, in flatten_mapping
    raise DuplicateKeyError(*args)
ruamel.yaml.constructor.DuplicateKeyError: while constructing a mapping
  in "dupe.yml", line 8, column 3
found duplicate key "<<"
  in "dupe.yml", line 9, column 3
To suppress this check see:
    http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys
Duplicate keys will become an error in future releases, and are errors
by default when using the new API.
How can I make ruamel read this file without errors? The documentation says that allow_duplicate_keys = True will make the loader tolerate duplicated keys, but it doesn't seem to work.
I'm using Python 3.7 and ruamel.yaml 0.15.90.
Answer 1:
Setting yaml.allow_duplicate_keys = True only works for non-merge keys in versions before 0.15.91.
In 0.15.91+ this also works for merge keys, and the merge key assumes the value of the first instantiation of the key (as with non-merge keys). That means it works as if you had written:
baz:
  <<: *ref1
  c: 3
and not as if you had written:
baz:
  <<: [*ref1, *ref2]
  c: 3
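The difference between the two forms can be modelled with plain dictionaries. This is only a sketch of the merge-key semantics as described in the YAML merge type (explicit keys win, and earlier mappings in a merge list override later ones), not ruamel.yaml's actual implementation; the helper name resolve_merges is made up for the illustration.

```python
def resolve_merges(explicit, merges):
    """Model YAML merge-key semantics with plain dicts (illustration only)."""
    result = {}
    # Apply later mappings first so that earlier ones override them.
    for mapping in reversed(merges):
        result.update(mapping)
    # Keys written directly in the mapping always win over merged keys.
    result.update(explicit)
    return result


ref1 = {"a": 1}
ref2 = {"b": 2}

# Duplicate "<<" where only the first merge key is kept: ref2 is dropped.
first_only = resolve_merges({"c": 3}, [ref1])

# List form "<<: [*ref1, *ref2]": both mappings contribute keys.
both = resolve_merges({"c": 3}, [ref1, ref2])

print(sorted(first_only))
print(sorted(both))
```

In this model, first_only has no key "b", which is why the asserts in the question would still fail for doc['baz']['b']; both corresponds to the list form, where a, b and c are all present.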
If you need the behavior of the list form while keeping the duplicate merge keys in the file, you have to monkey-patch the flatten routine that handles the merge keys (this affects the loading of all subsequent YAML files with duplicate merge keys):
import warnings

import ruamel.yaml
from ruamel.yaml.constructor import (
    ConstructorError,
    DuplicateKeyError,
    DuplicateKeyFutureWarning,
)

yaml_str = """\
foo: &ref1
  a: 1
bar: &ref2
  b: 2
baz:
  <<: *ref1
  <<: *ref2
  c: 3
"""


def my_flatten_mapping(self, node):
    # Replacement for RoundTripConstructor.flatten_mapping that merges
    # every << key when allow_duplicate_keys is set, not only the first one.

    def constructed(value_node):
        # type: (Any) -> Any
        # If the contents of a merge are defined within the
        # merge marker, then they won't have been constructed
        # yet. But if they were already constructed, we need to use
        # the existing object.
        if value_node in self.constructed_objects:
            value = self.constructed_objects[value_node]
        else:
            value = self.construct_object(value_node, deep=False)
        return value

    merge_map_list = []
    index = 0
    while index < len(node.value):
        key_node, value_node = node.value[index]
        if key_node.tag == u'tag:yaml.org,2002:merge':
            if merge_map_list and not self.allow_duplicate_keys:  # double << key
                args = [
                    'while constructing a mapping',
                    node.start_mark,
                    'found duplicate key "{}"'.format(key_node.value),
                    key_node.start_mark,
                    """
                    To suppress this check see:
                        http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys
                    """,
                    """\
                    Duplicate keys will become an error in future releases, and are errors
                    by default when using the new API.
                    """,
                ]
                if self.allow_duplicate_keys is None:
                    warnings.warn(DuplicateKeyFutureWarning(*args))
                else:
                    raise DuplicateKeyError(*args)
            del node.value[index]
            # if key/values from later merge keys have preference you need
            # to insert value_node(s) at the beginning of merge_map_list
            # instead of appending
            if isinstance(value_node, ruamel.yaml.nodes.MappingNode):
                merge_map_list.append((index, constructed(value_node)))
            elif isinstance(value_node, ruamel.yaml.nodes.SequenceNode):
                for subnode in value_node.value:
                    if not isinstance(subnode, ruamel.yaml.nodes.MappingNode):
                        raise ConstructorError(
                            'while constructing a mapping',
                            node.start_mark,
                            'expected a mapping for merging, but found %s' % subnode.id,
                            subnode.start_mark,
                        )
                    merge_map_list.append((index, constructed(subnode)))
            else:
                raise ConstructorError(
                    'while constructing a mapping',
                    node.start_mark,
                    'expected a mapping or list of mappings for merging, '
                    'but found %s' % value_node.id,
                    value_node.start_mark,
                )
        elif key_node.tag == u'tag:yaml.org,2002:value':
            key_node.tag = u'tag:yaml.org,2002:str'
            index += 1
        else:
            index += 1
    return merge_map_list


# Install the replacement on the round-trip constructor (the default loader type).
ruamel.yaml.constructor.RoundTripConstructor.flatten_mapping = my_flatten_mapping

yaml = ruamel.yaml.YAML()
yaml.allow_duplicate_keys = True
data = yaml.load(yaml_str)
for k in data['baz']:
    print(k, '>', data['baz'][k])
The above gives:
c > 3
a > 1
b > 2
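Because the monkey-patch above stays in place for every later load, it may be preferable to apply it only around the loads that need it. The following is a minimal sketch of that idea using a context manager; patched_attr, Loader and my_flatten are hypothetical names used for the illustration, and Loader is only a stand-in for ruamel.yaml.constructor.RoundTripConstructor.

```python
from contextlib import contextmanager


@contextmanager
def patched_attr(owner, name, replacement):
    """Temporarily replace owner.<name>, restoring the original on exit."""
    original = getattr(owner, name)
    setattr(owner, name, replacement)
    try:
        yield
    finally:
        # Restore even if the body raised an exception.
        setattr(owner, name, original)


class Loader:
    """Hypothetical stand-in for the class being patched."""

    def flatten(self):
        return "original"


def my_flatten(self):
    return "patched"


with patched_attr(Loader, "flatten", my_flatten):
    inside = Loader().flatten()   # replacement is active here
outside = Loader().flatten()      # original behavior is back

print(inside, outside)
```

Under these assumptions, the real call would wrap yaml.load(yaml_str) in patched_attr(ruamel.yaml.constructor.RoundTripConstructor, 'flatten_mapping', my_flatten_mapping). The standard library's unittest.mock.patch.object offers the same kind of temporary replacement.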
Answer 2:
After reading the library source code, I found a workaround. Setting the option to None prevents the error.
yml.allow_duplicate_keys = None
A warning is still printed to the console, but it's not fatal and the program will continue.
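If the console warning is unwanted, Python's standard warnings module can silence it for the duration of the load. The sketch below uses stand-ins so that it runs without any YAML file: DemoDuplicateKeyWarning and load_with_warning are hypothetical, and with ruamel.yaml you would filter on the warning class the library actually emits (the library code quoted in the first answer names it DuplicateKeyFutureWarning; verify where it is importable from in your installed version).

```python
import warnings


class DemoDuplicateKeyWarning(Warning):
    """Hypothetical stand-in for the warning class the loader emits."""


def load_with_warning():
    # Stand-in for yml.load(...) with allow_duplicate_keys = None:
    # it warns about the duplicate key and then returns the data.
    warnings.warn(DemoDuplicateKeyWarning('found duplicate key "<<"'))
    return {"a": 1, "b": 2, "c": 3}


with warnings.catch_warnings():
    # Ignore only this category; other warnings are still reported.
    warnings.simplefilter("ignore", category=DemoDuplicateKeyWarning)
    doc = load_with_warning()

print(doc["a"], doc["b"], doc["c"])
```

The filter is scoped to the with block, so duplicate-key warnings raised elsewhere in the program are reported as usual.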
Source: https://stackoverflow.com/questions/55540686/configuring-ruamel-yaml-to-allow-duplicate-keys