Question
I need to get the line numbers of certain keys of a YAML file.
Please note, this answer does not solve the issue: I do use ruamel.yaml, and the answers do not work with ordered maps.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from ruamel import yaml

data = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")
print(data)
As a result I get this:
CommentedMap([('key1', CommentedOrderedMap([('key2', 'item2'), ('key3', 'item3'), ('key4', CommentedOrderedMap([('key5', 'item5'), ('key6', 'item6')]))]))])
which does not give access to the line numbers, except for the !!omap keys:
print(data['key1'].lc.line) # output: 1
print(data['key1']['key4'].lc.line) # output: 4
but:
print(data['key1']['key2'].lc.line) # output: AttributeError: 'str' object has no attribute 'lc'
Indeed, data['key1']['key2'] is a str.
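The AttributeError can be reproduced without any YAML library. The following is a minimal sketch: LineCol and MarkedStr are hypothetical names invented for this illustration (they are not ruamel.yaml classes), and the line and column values are made up. It shows that a built-in str rejects new attributes, while a str subclass that declares a slot accepts one.

```python
class LineCol:
    """Hypothetical stand-in for a line/column record (not the ruamel.yaml class)."""
    def __init__(self, line, col):
        self.line = line
        self.col = col


class MarkedStr(str):
    """A str subclass with one extra slot for position information."""
    __slots__ = ('lc',)


plain = 'item2'
try:
    # Built-in str instances have no __dict__ and no 'lc' slot.
    plain.lc = LineCol(2, 10)
    plain_accepts_lc = True
except AttributeError as exc:
    plain_accepts_lc = False
    print('plain str:', exc)

marked = MarkedStr('item2')
marked.lc = LineCol(2, 10)  # illustrative position values
print(marked, marked.lc.line, marked.lc.col)
print(marked == 'item2')    # still compares equal to an ordinary str
```

The subclass instance behaves like a normal string everywhere else, which is why a string subclass is a natural place to carry position data.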
I've found a workaround:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from ruamel import yaml

DATA = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")


def get_line_nb(data):
    if isinstance(data, dict):
        offset = data.lc.line
        for i, key in enumerate(data):
            if isinstance(data[key], dict):
                get_line_nb(data[key])
            else:
                print('{}|{} found in line {}\n'
                      .format(key, data[key], offset + i + 1))


get_line_nb(DATA)
output:
key2|item2 found in line 2
key3|item3 found in line 3
key5|item5 found in line 5
key6|item6 found in line 6
but this looks a little bit "dirty". Is there a more proper way of doing it?
EDIT: this workaround is not only dirty, but only works for simple cases like the one above, and will give wrong results as soon as there are nested lists in the way
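To make the failure concrete, here is a small sketch that does not use a YAML parser. It assumes the ordered map starts on zero-based line 1 (the value lc.line reported for key1 above) and compares the workaround's guess, offset + i + 1, with the line where each key actually appears in the text. The document and the key list are made up for the illustration.

```python
yaml_text = """
key1: !!omap
  - key2:
      - a
      - b
  - key3: item3
"""

lines = yaml_text.splitlines()
omap_line = 1             # assumed: zero-based line of key1
keys = ['key2', 'key3']   # keys of the ordered map, in document order

results = []
for i, key in enumerate(keys):
    # The workaround assumes every key occupies exactly one line.
    guessed = omap_line + i + 1
    # Find the real line by scanning the text for "<key>:".
    actual = next(n for n, line in enumerate(lines)
                  if line.lstrip(' -').startswith(key + ':'))
    results.append((key, guessed, actual))
    print('{}: guessed line {}, actual line {}'.format(key, guessed, actual))
```

The guess is right for key2 but off by two for key3, because the nested list under key2 takes up two extra lines.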
Answer 1:
The issue is not that you are using !!omap and that it therefore doesn't give you the line numbers as "normal" mappings do. That should be clear from the fact that you get 4 from print(data['key1']['key4'].lc.line) (where key4 is a key in the outer !!omap).
As this answer indicates, "you can access the property lc on collection items". The value for data['key1']['key4'] is a collection item (another !!omap), but the value for data['key1']['key2'] is not a collection item but a built-in Python string, which has no slot to store the lc attribute.
To get an .lc attribute on a non-collection like a string, you have to subclass the RoundTripConstructor so that it uses something like the classes in scalarstring.py (with __slots__ adjusted to accept the lc attribute), and then transfer the line and column information available in the nodes to that attribute:
import sys
import ruamel.yaml

yaml_str = """
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: 'item5'
    - key6: |
        item6
"""


class Str(ruamel.yaml.scalarstring.ScalarString):
    __slots__ = ('lc')

    style = ""

    def __new__(cls, value):
        return ruamel.yaml.scalarstring.ScalarString.__new__(cls, value)


class MyPreservedScalarString(ruamel.yaml.scalarstring.PreservedScalarString):
    __slots__ = ('lc')


class MyDoubleQuotedScalarString(ruamel.yaml.scalarstring.DoubleQuotedScalarString):
    __slots__ = ('lc')


class MySingleQuotedScalarString(ruamel.yaml.scalarstring.SingleQuotedScalarString):
    __slots__ = ('lc')


class MyConstructor(ruamel.yaml.constructor.RoundTripConstructor):
    def construct_scalar(self, node):
        # type: (Any) -> Any
        if not isinstance(node, ruamel.yaml.nodes.ScalarNode):
            raise ruamel.yaml.constructor.ConstructorError(
                None, None,
                "expected a scalar node, but found %s" % node.id,
                node.start_mark)

        if node.style == '|' and isinstance(node.value, ruamel.yaml.compat.text_type):
            ret_val = MyPreservedScalarString(node.value)
        elif bool(self._preserve_quotes) and isinstance(node.value, ruamel.yaml.compat.text_type):
            if node.style == "'":
                ret_val = MySingleQuotedScalarString(node.value)
            elif node.style == '"':
                ret_val = MyDoubleQuotedScalarString(node.value)
            else:
                ret_val = Str(node.value)
        else:
            ret_val = Str(node.value)
        # Copy the position of the scalar node onto the constructed string.
        ret_val.lc = ruamel.yaml.comments.LineCol()
        ret_val.lc.line = node.start_mark.line
        ret_val.lc.col = node.start_mark.column
        return ret_val


yaml = ruamel.yaml.YAML()
yaml.Constructor = MyConstructor

data = yaml.load(yaml_str)
print(data['key1']['key4'].lc.line)
print(data['key1']['key2'].lc.line)
print(data['key1']['key4']['key6'].lc.line)
Please note that the output of the last call to print is 6, as the literal scalar string starts with the |.
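The reported numbers are zero-based, and the leading newline of the triple-quoted string counts as line 0. As a quick check that needs no YAML library, this sketch simply numbers the lines of the same document, assuming the indentation shown in the answer's example. The | indicator sits on line 6 and the text item6 on line 7.

```python
yaml_str = """
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: 'item5'
    - key6: |
        item6
"""

# Pair every line with its zero-based index, the same numbering lc.line uses.
numbered = list(enumerate(yaml_str.splitlines()))
for number, text in numbered:
    print(number, repr(text))
```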
If you also want to dump data, you'll need to make a Representer aware of those My... types.
Source: https://stackoverflow.com/questions/45716281/parsing-yaml-get-line-numbers-even-in-ordered-maps