Serializing namedtuples via PyYAML

Submitted by 柔情痞子 on 2019-12-12 09:58:31

Question


I'm looking for some reasonable way to serialize namedtuples in YAML using PyYAML.

A few things I don't want to do:

  • Rely on a dynamic call to add a constructor/representer/resolver upon instantiation of the namedtuple. These YAML files may be stored and re-loaded later, so I cannot rely on the same runtime environment existing when they are restored.

  • Register the namedtuples in the global namespace.

  • Rely on the namedtuples having unique names.

I was thinking of something along these lines:

class namedtuple(object):
    def __new__(cls, *args, **kwargs):
        x = collections.namedtuple(*args, **kwargs)

        class New(x):
            def __getstate__(self):
                return {
                    "name": self.__class__.__name__,
                    "_fields": self._fields,
                    "values": self._asdict().values()
                }
        return New

def namedtuple_constructor(loader, node):
    import IPython; IPython.embed()
    value = loader.construct_scalar(node)

import re
pattern = re.compile(r'!!python/object/new:myapp.util\.')
yaml.add_implicit_resolver(u'!!myapp.util.namedtuple', pattern)
yaml.add_constructor(u'!!myapp.util.namedtuple', namedtuple_constructor)

This assumes the code lives in an application module at the path myapp/util.py.

However, the constructor is never called when I try to load:

from myapp.util import namedtuple

x = namedtuple('test', ['a', 'b'])
t = x(1,2)
dump = yaml.dump(t)
load = yaml.load(dump)

It will fail to find New in myapp.util.
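The failure can be reproduced without PyYAML. PyYAML's generic Python-object tags record a class as a module path plus class name and look that name up in the module again on load. The sketch below (Python 3; make_type is a hypothetical stand-in for the namedtuple wrapper above) shows that a class created inside a function is not reachable that way:

```python
import collections
import sys


def make_type(*args, **kwargs):
    """Hypothetical stand-in for the namedtuple wrapper in the question."""
    base = collections.namedtuple(*args, **kwargs)

    class New(base):
        pass

    return New


x = make_type("test", ["a", "b"])

# The class reports the generic name "New", whatever typename was passed in.
class_name = x.__name__

# Looking that name up in the defining module finds nothing, because New
# only ever existed as a local variable inside make_type().
found = getattr(sys.modules.get(x.__module__), class_name, None)
print(class_name, found)
```

Any tag that stores only module and class name therefore cannot be resolved later, which is why a custom representer has to store the field names explicitly.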

I tried a variety of other approaches as well; this was just the one I thought might work best.

Disclaimer: Even once I get into the proper constructor, I'm aware my spec will need further work regarding which arguments get saved and how they are passed into the resulting object. The first step for me is to get the YAML representation into my constructor function; the rest should then be easy.


Answer 1:


I was able to solve my problem, though in a slightly less than ideal way.

My application now uses its own namedtuple implementation: I copied the collections.namedtuple source, created a base class for all new namedtuple types to inherit from, and modified the template (excerpts below for brevity, highlighting only what changed from the namedtuple source).

class namedtupleBase(tuple):
    pass

_class_template = '''\
class {typename}(namedtupleBase):
    '{typename}({arg_list})'

A small change to the namedtuple function itself adds the new base class to the namespace:

namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename,
                 OrderedDict=OrderedDict, _property=property, _tuple=tuple,
                 namedtupleBase=namedtupleBase)

Now registering a multi_representer solves the problem:

def repr_namedtuples(dumper, data):
    return dumper.represent_mapping(u"!namedtupleBase", {
        "__name__": data.__class__.__name__,
        "__dict__": collections.OrderedDict(
            [(k, v) for k, v in data._asdict().items()])
    })

def construct_namedtuples(loader, node):
    # deep=True so the nested "__dict__" value is fully built before it is read
    value = loader.construct_mapping(node, deep=True)
    cls_ = namedtuple(value['__name__'], value['__dict__'].keys())
    return cls_(*value['__dict__'].values())

yaml.add_multi_representer(namedtupleBase, repr_namedtuples)
yaml.add_constructor("!namedtupleBase", construct_namedtuples)

Hat tip to the question "Represent instance of different classes with the same base class in pyyaml" for the inspiration behind the solution.

Would love an idea that doesn't require re-creating the namedtuple function, but this accomplished my goals.




Answer 2:


"Would love an idea that doesn't require re-creating the namedtuple function, but this accomplished my goals."

Here you go.

TL;DR

Proof of concept using PyYAML 3.12.

import yaml

def named_tuple(self, data):
    if hasattr(data, '_asdict'):
        return self.represent_dict(data._asdict())
    return self.represent_list(data)

yaml.SafeDumper.yaml_multi_representers[tuple] = named_tuple

Note: To be clean, you should use one of the add_multi_representer() methods at your disposal and a custom representer/loader, like you did.

This gives you:

>>> import collections
>>> Foo = collections.namedtuple('Foo', 'x y z')
>>> yaml.safe_dump({'foo': Foo(1,2,3), 'bar':(4,5,6)})
'bar: [4, 5, 6]\nfoo: {x: 1, y: 2, z: 3}\n'
>>> print yaml.safe_dump({'foo': Foo(1,2,3), 'bar':(4,5,6)})
bar: [4, 5, 6]
foo: {x: 1, y: 2, z: 3}
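Note that this proof of concept only covers dumping. The output carries no tag, so loading it back yields a plain dict rather than a Foo. A quick check, feeding the dumped text shown above into safe_load:

```python
import yaml

# The exact text produced by the safe_dump() call above.
dumped = "bar: [4, 5, 6]\nfoo: {x: 1, y: 2, z: 3}\n"

loaded = yaml.safe_load(dumped)
print(loaded)               # {'bar': [4, 5, 6], 'foo': {'x': 1, 'y': 2, 'z': 3}}
print(type(loaded["foo"]))  # <class 'dict'>
```

If the type name has to survive a round trip, the representer needs to emit a custom tag and a matching constructor has to be registered on the loader.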

How does this work?

As you discovered yourself, a namedtuple does not have a special base class; inspecting its method resolution order gives:

>>> collections.namedtuple('Bar', '').mro()
[<class '__main__.Bar'>, <type 'tuple'>, <type 'object'>]

So the instances of the Python named tuples are tuple instances with an additional _asdict() method.
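Building on that observation, here is a minimal round-trip sketch for Python 3 that follows the note about add_multi_representer(). The tag name !namedtuple, the Dumper/Loader subclasses, and the name/fields/values layout are assumptions made for illustration, not something PyYAML defines. The representer is registered for tuple, so any namedtuple is caught through its MRO, and the constructor rebuilds a fresh class from the stored name and fields, so nothing has to be registered per type.

```python
import collections
import yaml

TAG = "!namedtuple"  # hypothetical tag name chosen for this sketch


class NamedTupleDumper(yaml.SafeDumper):
    """SafeDumper subclass, so PyYAML's shared dumpers stay untouched."""


class NamedTupleLoader(yaml.SafeLoader):
    """SafeLoader subclass that understands the tag above."""


def represent_tuple(dumper, data):
    # Namedtuples are tuple subclasses that also provide _asdict()/_fields.
    if hasattr(data, "_asdict"):
        return dumper.represent_mapping(TAG, {
            "name": type(data).__name__,
            "fields": list(data._fields),
            "values": list(data),
        })
    # Any other tuple subclass falls back to a plain YAML sequence.
    return dumper.represent_list(data)


def construct_namedtuple(loader, node):
    # deep=True so the nested lists are fully built before they are used.
    state = loader.construct_mapping(node, deep=True)
    cls = collections.namedtuple(state["name"], state["fields"])
    return cls(*state["values"])


NamedTupleDumper.add_multi_representer(tuple, represent_tuple)
NamedTupleLoader.add_constructor(TAG, construct_namedtuple)

Foo = collections.namedtuple("Foo", "x y z")
text = yaml.dump({"foo": Foo(1, 2, 3), "bar": (4, 5, 6)},
                 Dumper=NamedTupleDumper)
data = yaml.load(text, Loader=NamedTupleLoader)
print(data["foo"], data["bar"])
```

The loaded object is an instance of a newly created class with the same name and fields, not of the original Foo, so it compares equal as a tuple but isinstance(data["foo"], Foo) is False. Plain tuples still come back as lists, as they do with the stock safe dumper.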



Source: https://stackoverflow.com/questions/24717753/serializing-namedtuples-via-pyyaml
