I have a configuration file in YAML that is currently loaded as a dictionary using yaml.safe_load. For convenience in writing my code, I'd prefer to load it as a set of nes
If you annotate the root node of the YAML file with a tag, you can define Python classes deriving from YAMLObject to deal with this, as described in the PyYAML documentation.
However, if you prefer your YAML to stay free of tags, you can construct the nested classes yourself (taken from my answer to a similar question):
import yaml

class BItem:
    def __init__(self, q, r, s):
        self.q, self.r, self.s = q, r, s

class CItem:
    def __init__(self, raw):
        self.d, self.e, self.f = raw['d'], raw['e'], raw['f']

class Root:
    def __init__(self, raw):
        self.a = raw['a']
        self.b = [BItem(i['q'], i['r'], i['s']) for i in raw['b']]
        self.c = CItem(raw['c'])

mydict = Root(yaml.safe_load("""
a: 1
b:
- q: "foo"
  r: 99
  s: 98
- q: "bar"
  r: 97
  s: 96
c:
  d: 7
  e: 8
  f: [9,10,11]
"""))
However, this approach only works if your YAML is structured homogeneously. You gave a heterogeneous structure by having differently named fields in the list of b (q, r, s in the first item; x, y, z in the second item). I changed the YAML input to have the same field names because this approach does not work with different fields. I am unsure whether your YAML is actually heterogeneous or you just accidentally made it so for the example. If your YAML actually is heterogeneous, accessing the items via dict access is the only viable way, since the keys in the YAML file then do not correspond to class fields; they are dynamic mapping entries.
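To make the limitation concrete, here is a minimal sketch of what happens when the original, heterogeneous list is fed to a fixed-field class. The `raw_b` literal stands in for what `yaml.safe_load` would return for the `b` key in the question, and `build_items` is a hypothetical helper, not part of PyYAML; it falls back to the plain dict whenever the keys do not match.

```python
class BItem:
    def __init__(self, q, r, s):
        self.q, self.r, self.s = q, r, s

# Stand-in for yaml.safe_load(...)['b'] with the question's original,
# heterogeneous items (assumption: no YAML parsing is done here).
raw_b = [
    {"q": "foo", "r": 99, "s": 98},
    {"x": "bar", "y": 97, "z": 96},
]

def build_items(raw_items):
    """Hypothetical helper: wrap items that fit BItem, keep the rest as dicts."""
    items = []
    for raw in raw_items:
        try:
            items.append(BItem(**raw))
        except TypeError:
            # Keys do not match BItem's parameters: a dynamic mapping entry.
            items.append(raw)
    return items

items = build_items(raw_b)
print(items[0].r)      # attribute access on the homogeneous item
print(items[1]["y"])   # dict access on the heterogeneous item
```

The second item can only be reached by key, which is why a fixed class hierarchy does not fit heterogeneous input.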
This can be done, relatively easily, and without changing the input file.
Since the dict PyYAML uses is hard-coded and cannot be patched, you not only have to provide a dict-like class that behaves as you want, you also have to jump through the hoops to make PyYAML use that class. That is, change the SafeConstructor that would normally construct a dict to use that new class, incorporate that in a new Loader, and call PyYAML's load with that Loader:
import yaml
from yaml.loader import Reader, Scanner, Parser, Composer, SafeConstructor, Resolver

class MyDict(dict):
    def __getattr__(self, name):
        return self[name]

class MySafeConstructor(SafeConstructor):
    def construct_yaml_map(self, node):
        data = MyDict()
        yield data
        value = self.construct_mapping(node)
        data.update(value)

MySafeConstructor.add_constructor(
    u'tag:yaml.org,2002:map', MySafeConstructor.construct_yaml_map)

class MySafeLoader(Reader, Scanner, Parser, Composer, MySafeConstructor, Resolver):
    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        MySafeConstructor.__init__(self)
        Resolver.__init__(self)

yaml_str = """\
a: 1
b:
- q: "foo"
  r: 99
  s: 98
- x: "bar"
  y: 97
  z: 96
c:
  d: 7
  e: 8
  f: [9,10,11]
"""

mydict = yaml.load(yaml_str, Loader=MySafeLoader)
print(mydict.b[0].r)
which gives:
99
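One caveat with `MyDict` as written: a missing key surfaces as `KeyError`, not `AttributeError`, so `hasattr()` and `getattr()` with a default propagate the exception instead of handling it. A minimal sketch of a variant that translates the error follows; the class name `AttrDict` is hypothetical, and it could be used in the constructor above in place of `MyDict`.

```python
class AttrDict(dict):
    """dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so real dict methods such as .items() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

d = AttrDict(r=99)
print(d.r)                       # 99
print(hasattr(d, "missing"))     # False
print(getattr(d, "missing", 0))  # 0
```

Keys that collide with dict method names (for example `items`) are still only reachable with `d["items"]`.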
If you need to be able to handle YAML 1.2, you should use ruamel.yaml (disclaimer: I am the author of that package), which makes the above slightly simpler:
import ruamel.yaml

# same definitions for yaml_str, MyDict

class MySafeConstructor(ruamel.yaml.constructor.SafeConstructor):
    def construct_yaml_map(self, node):
        data = MyDict()
        yield data
        value = self.construct_mapping(node)
        data.update(value)

MySafeConstructor.add_constructor(
    u'tag:yaml.org,2002:map', MySafeConstructor.construct_yaml_map)

yaml = ruamel.yaml.YAML(typ='safe')
yaml.Constructor = MySafeConstructor
mydict = yaml.load(yaml_str)
print(mydict.b[0].r)
which also gives:
99
(and, if your real input is large, should load your data noticeably faster)
Found a handy library to do exactly what I need: https://github.com/Infinidat/munch
import yaml
from munch import Munch

mydict = yaml.safe_load("""
a: 1
b:
- q: "foo"
  r: 99
  s: 98
- x: "bar"
  y: 97
  z: 96
c:
  d: 7
  e: 8
  f: [9,10,11]
""")
mymunch = Munch(mydict)
I had to write a simple method to recursively convert all subdicts into munches, but now I can navigate my data with e.g.:
>>> mymunch.b[0].q
'foo'
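The recursive conversion mentioned above could look roughly like the following sketch. To keep it free of third-party dependencies, a small `AttrDict` class stands in for `Munch`, and `to_attr` is a hypothetical helper name; the `data` literal mirrors what `yaml.safe_load` returns for the YAML above. Note that munch also provides a `munchify` function that performs this kind of recursive conversion.

```python
class AttrDict(dict):
    """Stand-in for munch.Munch: a dict with attribute-style read access."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

def to_attr(value):
    """Recursively wrap dicts (including dicts nested in lists) in AttrDict."""
    if isinstance(value, dict):
        return AttrDict((k, to_attr(v)) for k, v in value.items())
    if isinstance(value, list):
        return [to_attr(v) for v in value]
    return value

# Same structure that yaml.safe_load produces for the example document.
data = {
    "a": 1,
    "b": [{"q": "foo", "r": 99, "s": 98}, {"x": "bar", "y": 97, "z": 96}],
    "c": {"d": 7, "e": 8, "f": [9, 10, 11]},
}

converted = to_attr(data)
print(converted.b[0].q)   # foo
print(converted.c.f)      # [9, 10, 11]
```

Because `b` is a list, its items are reached by index first (`converted.b[0]`) and by attribute afterwards.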