Load YAML as nested objects instead of dictionary in Python

没有蜡笔的小新 2020-12-07 04:09

I have a configuration file in YAML that is currently loaded as a dictionary using yaml.safe_load. For convenience in writing my code, I'd prefer to load it as a set of nes

3 Answers
  • 2020-12-07 04:22

    If you annotate the root node of the YAML file with a tag, you can define Python classes deriving from YAMLObject to deal with this as described in the PyYAML documentation.
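
    A rough sketch of the tag-based route, assuming PyYAML is installed. The `!Config` tag and the `Config` class name are made up for illustration; setting `yaml_loader = yaml.SafeLoader` is what lets `safe_load` accept the tag. Only the tagged node becomes an object, so untagged nested mappings remain plain dicts:

```python
import yaml

class Config(yaml.YAMLObject):
    # Hypothetical tag name; it must match the tag written in the YAML file.
    yaml_tag = "!Config"
    # Register the constructor on SafeLoader so yaml.safe_load accepts the tag.
    yaml_loader = yaml.SafeLoader

doc = """\
--- !Config
a: 1
c:
  d: 7
  e: 8
"""

cfg = yaml.safe_load(doc)
print(type(cfg).__name__)  # the tagged root is a Config instance
print(cfg.a)               # top-level keys become attributes
print(cfg.c["d"])          # the untagged nested mapping is still a dict
```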

    However, if you prefer your YAML to stay clean from tags, you can construct the nested classes yourself (taken from my answer to a similar question):

    import yaml
    
    class BItem:
        def __init__(self, q, r, s):
            self.q, self.r, self.s = q, r, s
    
    class CItem:
        def __init__(self, raw):
            self.d, self.e, self.f = raw['d'], raw['e'], raw['f']
    
    class Root:
        def __init__(self, raw):
            self.a = raw['a']
            self.b = [BItem(i['q'], i['r'], i['s']) for i in raw['b']]
            self.c = CItem(raw['c'])
    
    mydict = Root(yaml.safe_load("""
    a: 1
    b:
    - q: "foo"
      r: 99
      s: 98
    - q: "bar"
      r: 97
      s: 96
    c:
      d: 7
      e: 8
      f: [9,10,11]
    """))
    

    However, this approach only works if your YAML is structured homogeneously. Your example is heterogeneous: the items in the list b have differently named fields (q, r, s in the first item; x, y, z in the second). I changed the YAML input so that both items use the same field names, because the approach does not work otherwise. I am unsure whether your YAML is actually heterogeneous or whether that was just an accident of the example. If it really is heterogeneous, dict access is the only viable way to reach the items, because the keys in the YAML file then do not correspond to class fields; they are dynamic mapping entries.
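
    To make the distinction concrete, here is a small sketch that builds a `BItem` only when an item has exactly the expected keys and otherwise keeps the plain dict. The helper `to_bitem_or_dict` is a hypothetical name, and the parsed data is written out as a literal so the example has no dependencies:

```python
class BItem:
    def __init__(self, q, r, s):
        self.q, self.r, self.s = q, r, s

# What yaml.safe_load would return for a heterogeneous list `b`
# (written as a literal here; no YAML parsing happens in this sketch).
raw_b = [
    {"q": "foo", "r": 99, "s": 98},
    {"x": "bar", "y": 97, "z": 96},
]

def to_bitem_or_dict(item):
    """Build a BItem when the expected keys are present, else keep the dict."""
    if item.keys() == {"q", "r", "s"}:
        return BItem(**item)
    return item

items = [to_bitem_or_dict(i) for i in raw_b]
print(items[0].r)     # attribute access on the fixed-shape item
print(items[1]["y"])  # dict access on the dynamically keyed item
```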

  • 2020-12-07 04:32

    This can be done, relatively easily, and without changing the input file.

    Since the dict PyYAML uses is hard-coded and cannot be patched, you not only have to provide a dict-like class that behaves as you want, you also have to jump through some hoops to make PyYAML use that class: change the SafeConstructor method that would normally construct a dict so that it uses the new class, incorporate that constructor in a new Loader, and pass that Loader to PyYAML's load:

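    import yaml
    
    from yaml.loader import Reader, Scanner, Parser, Composer, SafeConstructor, Resolver
    
    class MyDict(dict):
        # Expose mapping keys as attributes: d.key returns d['key'].
        def __getattr__(self, name):
            try:
                return self[name]
            except KeyError:
                # Raise AttributeError so hasattr() and getattr() defaults work.
                raise AttributeError(name) from None
    
    class MySafeConstructor(SafeConstructor):
        # Same two-step construction as the stock method, but builds a MyDict.
        def construct_yaml_map(self, node):
            data = MyDict()
            yield data
            value = self.construct_mapping(node)
            data.update(value)
    
    MySafeConstructor.add_constructor(
        'tag:yaml.org,2002:map', MySafeConstructor.construct_yaml_map)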
    
    
    class MySafeLoader(Reader, Scanner, Parser, Composer, MySafeConstructor, Resolver):
        def __init__(self, stream):
            Reader.__init__(self, stream)
            Scanner.__init__(self)
            Parser.__init__(self)
            Composer.__init__(self)
            MySafeConstructor.__init__(self)
            Resolver.__init__(self)
    
    
    yaml_str = """\
    a: 1
    b:
    - q: "foo"
      r: 99
      s: 98
    - x: "bar"
      y: 97
      z: 96
    c:
      d: 7
      e: 8
      f: [9,10,11]
    """
    
    mydict = yaml.load(yaml_str, Loader=MySafeLoader)
    
    print(mydict.b[0].r)
    

    which gives:

    99
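
    A shorter variant that should give the same result is to subclass `yaml.SafeLoader` directly and register the mapping constructor on the subclass, rather than reassembling the loader from its parts. `AttrDict`, `AttrLoader`, and `construct_attr_map` are names invented for this sketch:

```python
import yaml

class AttrDict(dict):
    """dict whose keys can also be read as attributes."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

class AttrLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the extra constructor stays local to it."""

def construct_attr_map(loader, node):
    # Yield the empty mapping first and fill it afterwards, mirroring
    # SafeConstructor.construct_yaml_map (keeps aliases to it resolvable).
    data = AttrDict()
    yield data
    data.update(loader.construct_mapping(node))

AttrLoader.add_constructor("tag:yaml.org,2002:map", construct_attr_map)

config = yaml.load("""\
a: 1
b:
- q: "foo"
  r: 99
c:
  f: [9, 10, 11]
""", Loader=AttrLoader)

print(config.b[0].r)
print(config.c.f)
```

    One caveat with any attribute-style dict: keys that collide with dict method names (such as `items` or `keys`) still have to be read with `config["items"]`.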
    

    If you need to be able to handle YAML 1.2, you should use ruamel.yaml (disclaimer: I am the author of that package), which makes the above slightly simpler:

    import ruamel.yaml
    
    # same definitions for yaml_str, MyDict
    
    class MySafeConstructor(ruamel.yaml.constructor.SafeConstructor):
        def construct_yaml_map(self, node):
            data = MyDict()
            yield data
            value = self.construct_mapping(node)
            data.update(value)
    
    MySafeConstructor.add_constructor(
        'tag:yaml.org,2002:map', MySafeConstructor.construct_yaml_map)
    
    
    yaml = ruamel.yaml.YAML(typ='safe')
    yaml.Constructor = MySafeConstructor
    mydict = yaml.load(yaml_str)
    
    print(mydict.b[0].r)
    

    which also gives:

    99
    

    (and if your real input is large, it should load your data noticeably faster)

  • 2020-12-07 04:45

    Found a handy library to do exactly what I need: https://github.com/Infinidat/munch

    import yaml
    from munch import Munch
    mydict = yaml.safe_load("""
    a: 1
    b:
    - q: "foo"
      r: 99
      s: 98
    - x: "bar"
      y: 97
      z: 96
    c:
      d: 7
      e: 8
      f: [9,10,11]
    """)
    mymunch = Munch(mydict)
    

    (I had to write a simple method to recursively convert all subdicts into munches.) Now I can navigate my data with e.g.

    >>> mymunch.b[0].q
    'foo'
    
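
    The recursive conversion mentioned above is not shown in the answer. A dependency-free sketch of the same idea, using `types.SimpleNamespace` from the standard library in place of Munch, could look like the following; `to_namespace` is a hypothetical helper name. munch also ships a `munchify` function that appears to cover the same ground.

```python
from types import SimpleNamespace

def to_namespace(value):
    """Recursively turn dicts into SimpleNamespace objects; lists are walked too."""
    if isinstance(value, dict):
        return SimpleNamespace(**{str(k): to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_namespace(v) for v in value]
    return value

# The structure yaml.safe_load returns for the YAML above, as a literal.
mydict = {
    "a": 1,
    "b": [
        {"q": "foo", "r": 99, "s": 98},
        {"x": "bar", "y": 97, "z": 96},
    ],
    "c": {"d": 7, "e": 8, "f": [9, 10, 11]},
}

ns = to_namespace(mydict)
print(ns.b[0].q)
print(ns.b[1].z)
print(ns.c.f)
```

    Unlike Munch, a SimpleNamespace does not support `ns["a"]` style access, so this only suits code that uses attribute access throughout.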