Is there a way to substitute strings in YAML? For example, I would like to define sub
once and use it throughout the YAML file.
sub: ['a', 'b
You cannot really substitute string values in YAML, in the sense of replacing a substring of some string with another substring¹. YAML does, however, let you mark a node (in your case the list ['a', 'b', 'c']) with an anchor and reuse it as an alias node.
Anchors take the form &some_id and are inserted before a node; alias nodes are written as *some_id (in place of a node).
This is not the same as substitution at the string level, because the reference can be preserved when the YAML file is parsed. That is what happens when loading YAML in Python for anchors on collection types (but not for anchors on scalars):
import sys
import ruamel.yaml as yaml
yaml_str = """\
sub: &sub0 [a, b, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum : *sub0
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: *sub0
"""
data1 = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
# the loaded elements point to the same list
assert data1['sub'] is data1['command']['params']['cmd1']['enum']
# change the second element ('b', index 1) via cmd2
data1['command']['params']['cmd2']['enum'][1] = 'X'
yaml.dump(data1, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)
This will output:
sub: &sub0 [a, X, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum: *sub0
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: *sub0
Please note that the original anchor name is preserved² in ruamel.yaml.
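The reason a change made through cmd2 shows up everywhere is ordinary Python aliasing: every alias is loaded as another reference to the one list object. A minimal sketch of that, using plain dicts and lists as a stand-in for the loaded data (no YAML library is involved, and the name shared is only for illustration):

```python
# Plain Python stand-in for what the loader builds: each alias becomes
# another reference to the same list object, not a copy of it.
shared = ['a', 'b', 'c']  # plays the role of the node anchored with &sub0
data = {
    'sub': shared,
    'command': {'params': {
        'cmd1': {'type': 'string', 'enum': shared},  # plays the role of *sub0
        'cmd2': {'type': 'string', 'enum': shared},  # plays the role of *sub0
    }},
}

# Mutating the list through one path is visible through every other path.
data['command']['params']['cmd2']['enum'][1] = 'X'

print(data['sub'])                                # ['a', 'X', 'c']
print(data['command']['params']['cmd1']['enum'])  # ['a', 'X', 'c']
```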
If you don't want the anchors and aliases in your output, you can override the ignore_aliases method that RoundTripDumper inherits from RoundTripRepresenter (that method takes two arguments, but by using lambda *args: ... you don't have to know about that):
dumper = yaml.RoundTripDumper
dumper.ignore_aliases = lambda *args : True
yaml.dump(data1, sys.stdout, Dumper=dumper, indent=4)
Which gives:
sub: [a, X, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum: [a, X, c]
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: [a, X, c]
And this trick can be used to read in the YAML file as if you had done string substitution, by re-reading the material you dumped while ignoring the aliases:
data2 = yaml.load(yaml.dump(yaml.load(yaml_str, Loader=yaml.RoundTripLoader),
Dumper=dumper, indent=4), Loader=yaml.RoundTripLoader)
# these are lists with the same value
assert data2['sub'] == data2['command']['params']['cmd1']['enum']
# but the loaded elements do not point to the same list
assert data2['sub'] is not data2['command']['params']['cmd1']['enum']
data2['command']['params']['cmd2']['enum'][1] = 'X'
yaml.dump(data2, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)
And now only one 'b' is changed into 'X':
sub: [a, b, c]
command:
    params:
        cmd1:
            type: string
            # Get the list defined in 'sub'
            enum: [a, b, c]
            description: Exclude commands from the test list.
        cmd2:
            type: string
            # Get the list defined in 'sub'
            enum: [a, X, c]
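The same effect can be seen without any YAML library. The sketch below is only an analogy, not the ruamel.yaml mechanism: it uses the standard json module, whose format has no aliases at all, so a dump followed by a load necessarily expands every shared list into its own copy. The structure is a simplified, made-up version of the one above.

```python
import copy
import json

shared = ['a', 'b', 'c']
data = {'sub': shared, 'cmd1': {'enum': shared}, 'cmd2': {'enum': shared}}

# JSON cannot express aliases: dumping writes the list out three times,
# and loading builds three separate list objects.
expanded = json.loads(json.dumps(data))
assert expanded['sub'] == expanded['cmd1']['enum']
assert expanded['sub'] is not expanded['cmd1']['enum']

# A change through cmd2 now stays local to cmd2.
expanded['cmd2']['enum'][1] = 'X'
print(expanded['sub'])           # ['a', 'b', 'c']
print(expanded['cmd2']['enum'])  # ['a', 'X', 'c']

# By contrast, copy.deepcopy keeps sharing *within* the copied structure,
# so it would not separate the three lists.
deep = copy.deepcopy(data)
assert deep['sub'] is deep['cmd1']['enum']
```

This is why the round trip through a dump without aliases is needed: a copy made in memory would carry the shared references along.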
As indicated above, this is only necessary when using anchors/aliases on collection types, not when you use them on a scalar.
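One way to see why scalars are not a concern, again with plain Python values standing in for loaded data (the keys first and second are made up for illustration): strings are immutable, so "changing" one of them means rebinding a single key, which cannot affect any other key.

```python
name = 'a'  # plays the role of a scalar anchored with, say, &name
data = {'first': name, 'second': name}  # 'second' plays the role of *name

# There is no in-place mutation of a str; assignment rebinds this key only.
data['second'] = 'X'

print(data)  # {'first': 'a', 'second': 'X'}
```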
¹ Since YAML can create objects, it is however possible to influence the parser if those objects are created. This answer describes how to do that.
² Preserving the names was originally not available, but was implemented in an update of ruamel.yaml.