Question
I have some YAML with application-specific tags (from an AWS CloudFormation template, to be exact) that looks like this:
example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"
I want to parse it so that I can do this:
>>> print(result)
{'Name': 'EMR {Environment} {Purpose}'}
>>> name = result['Name'].format(
...     Environment='Development',
...     Purpose='ETL'
... )
>>> print(name)
EMR Development ETL
Currently my code looks like this:
import yaml
from pprint import pprint

def aws_join(loader, node):
    join_args = loader.construct_yaml_seq(node)
    delimiter = list(join_args)[0]
    joinables = list(join_args)[1]
    join_result = delimiter.join(joinables)
    return join_result

def aws_ref(loader, node):
    value = loader.construct_scalar(node)
    placeholder = '{'+value+'}'
    return placeholder

yaml.add_constructor('!Join', aws_join)
yaml.add_constructor('!Ref', aws_ref)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"
pprint(yaml.load(example_yaml))
Unfortunately, this results in an error:
...
joinables = list(join_args)[1]
IndexError: list index out of range
Adding print('What I am: '+str(join_args)) to aws_join shows that I'm getting a generator:
What I am: <generator object SafeConstructor.construct_yaml_seq at 0x1082ece08>
That's why I tried to cast the generator to a list. The generator does eventually get populated correctly, just not in time for me to use it. If I change my aws_join function to this:
def aws_join(loader, node):
    join_args = loader.construct_yaml_seq(node)
    return join_args
Then the final result looks like this:
{'Name': [' ', ['EMR', '{Environment}', '{Purpose}']]}
So the pieces my function needs are all there, just not at the point where the function needs them.
Answer 1:
You are close, but the problem is that you are using the method construct_yaml_seq(). That method is actually the registered constructor for the normal YAML sequence (the one that eventually becomes a Python list), and it calls the construct_sequence() method to handle the node that gets passed in. That is what you should do as well.
Because you are returning a string, which cannot be part of a recursive data structure, you don't need the two-step creation process (first yield-ing an empty container, then filling it) that construct_yaml_seq() follows. That two-step process is why you encountered a generator.
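To make the two-step idea concrete, here is a minimal pure-Python sketch. It only imitates the pattern; construct_seq_two_step is a made-up name for illustration and is not PyYAML's actual code. It shows that the list is empty when it is first handed out, and that an already exhausted generator turns into an empty list, which is consistent with the IndexError in the question.

```python
def construct_seq_two_step(items):
    """Imitate a two-step constructor: hand out the container, then fill it."""
    data = []
    yield data          # step 1: the (still empty) list is handed out
    data.extend(items)  # step 2: runs only when the generator is resumed


gen = construct_seq_two_step([' ', ['EMR', '{Environment}', '{Purpose}']])

first = next(gen)         # the caller receives the empty list
seen_early = list(first)  # copy of what is visible right now

remaining = list(gen)     # resuming runs step 2; nothing more is yielded
seen_late = list(first)   # the same list object has now been filled

print(seen_early)  # []
print(remaining)   # []
print(seen_late)   # [' ', ['EMR', '{Environment}', '{Purpose}']]
```

Calling list() on such a generator a second time therefore gives [], so indexing it with [1] has nothing to return.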
construct_sequence() returns a plain list, but since you want the nodes underneath the !Join to be available when you start processing, make sure to specify the deep=True parameter; otherwise the second list element will be an empty list. And because construct_yaml_seq() doesn't specify deep=True, you did not get the pieces in time in your function (otherwise you could have actually used that method).
import yaml
from pprint import pprint

def aws_join(loader, node):
    join_args = loader.construct_sequence(node, deep=True)
    # you can comment out next line
    assert join_args == [' ', ['EMR', '{Environment}', '{Purpose}']]
    delimiter = join_args[0]
    joinables = join_args[1]
    return delimiter.join(joinables)

def aws_ref(loader, node):
    value = loader.construct_scalar(node)
    placeholder = '{'+value+'}'
    return placeholder

yaml.add_constructor('!Join', aws_join, Loader=yaml.SafeLoader)
yaml.add_constructor('!Ref', aws_ref, Loader=yaml.SafeLoader)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"
pprint(yaml.safe_load(example_yaml))
which gives:
{'Name': 'EMR {Environment} {Purpose}'}
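If you want to see the effect of deep=True for yourself, the following sketch records what a constructor sees at the moment it runs, once with the default and once with deep=True. The tags !JoinShallow and !JoinDeep, the DemoLoader class and the seen dict are invented for this demonstration only; the subclass is there so that the global SafeLoader is left untouched.

```python
import yaml

seen = {}  # what each constructor saw while it was running


class DemoLoader(yaml.SafeLoader):
    """Hypothetical loader subclass used only for this comparison."""


def snapshot(args):
    # Copy inner lists right away, because PyYAML may fill them in later.
    return [list(a) if isinstance(a, list) else a for a in args]


def ref(loader, node):
    return '{' + loader.construct_scalar(node) + '}'


def join_shallow(loader, node):
    args = loader.construct_sequence(node)  # default: deep=False
    seen['shallow'] = snapshot(args)
    return args


def join_deep(loader, node):
    args = loader.construct_sequence(node, deep=True)
    seen['deep'] = snapshot(args)
    return args


DemoLoader.add_constructor('!Ref', ref)
DemoLoader.add_constructor('!JoinShallow', join_shallow)
DemoLoader.add_constructor('!JoinDeep', join_deep)

doc = """
a: !JoinShallow [' ', ['EMR', !Ref 'Environment']]
b: !JoinDeep [' ', ['EMR', !Ref 'Environment']]
"""
yaml.load(doc, Loader=DemoLoader)

print(seen['shallow'])  # [' ', []]
print(seen['deep'])     # [' ', ['EMR', '{Environment}']]
```

Without deep=True the nested list exists but has not been filled yet when the constructor runs, which is why joining it there cannot work.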
You should not use load(): it is documented to be potentially unsafe and, above all, it is not necessary here. Register with the SafeLoader and call safe_load().
Answer 2:
You need to change aws_join to:
def aws_join(loader, node):
    delimiter = loader.construct_scalar(node.value[0])
    value = loader.construct_sequence(node.value[1])
    return delimiter.join(value)
Then you will get output:
{'Name': 'EMR {Environment} {Purpose}'}
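For completeness, here is a sketch of how that function could be wired up end to end, including the format() step from the question. CfnLoader is a hypothetical SafeLoader subclass introduced here so the constructors are not registered globally; the rest follows the code above. Note that this node-level approach relies on the joined items being scalars; items that are themselves lists or mappings would likely need deep=True on the construct_sequence() call.

```python
import yaml


class CfnLoader(yaml.SafeLoader):
    """Hypothetical loader subclass that only adds !Join and !Ref."""


def aws_ref(loader, node):
    # Turn !Ref 'Environment' into the placeholder '{Environment}'.
    return '{' + loader.construct_scalar(node) + '}'


def aws_join(loader, node):
    # node.value holds the two child nodes of the !Join sequence:
    # the delimiter scalar and the sequence of things to join.
    delimiter_node, items_node = node.value
    delimiter = loader.construct_scalar(delimiter_node)
    items = loader.construct_sequence(items_node)
    return delimiter.join(items)


CfnLoader.add_constructor('!Ref', aws_ref)
CfnLoader.add_constructor('!Join', aws_join)

example_yaml = "Name: !Join [' ', ['EMR', !Ref 'Environment', !Ref 'Purpose']]"
result = yaml.load(example_yaml, Loader=CfnLoader)
name = result['Name'].format(Environment='Development', Purpose='ETL')

print(result)  # {'Name': 'EMR {Environment} {Purpose}'}
print(name)    # EMR Development ETL
```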
Source: https://stackoverflow.com/questions/51883307/parse-nested-custom-yaml-tags