Validating a YAML document in Python

前端未结


One of the benefits of XML is being able to validate a document against an XSD. YAML doesn't have this feature, so how can I validate that the YAML document I open is in th


10 answers

天涯浪人

2020-12-24 05:03

I find Cerberus to be very reliable with great documentation and straightforward to use.

Here is a basic implementation example:

my_yaml.yaml:

name: 'my_name'
date: 2017-10-01
metrics:
    percentage:
        value: 87
        trend: stable

Defining the validation schema in schema.py:

{
    'name': {
        'required': True,
        'type': 'string'
    },
    'date': {
        'required': True,
        'type': 'date'
    },
    'metrics': {
        'required': True,
        'type': 'dict',
        'schema': {
            'percentage': {
                'required': True,
                'type': 'dict',
                'schema': {
                    'value': {
                        'required': True,
                        'type': 'number',
                        'min': 0,
                        'max': 100
                    },
                    'trend': {
                        'type': 'string',
                        'nullable': True,
                        'regex': '(?i)^(down|equal|up)$'
                    }
                }
            }
        }
    }
}

Using PyYAML to load the YAML document:

import yaml

def load_doc():
    # safe_load builds only plain Python types and never runs arbitrary constructors
    with open('./my_yaml.yaml', 'r') as stream:
        return yaml.safe_load(stream)  # raises yaml.YAMLError on a syntax error

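# Now, validating the YAML file is straightforward:
import ast
from cerberus import Validator

# schema.py holds a single dict literal, so literal_eval can parse it without executing code
with open('./schema.py', 'r') as f:
    schema = ast.literal_eval(f.read())

v = Validator(schema)
doc = load_doc()
print(v.validate(doc))  # True when the document conforms
print(v.errors)         # dict of field -> error messages; empty on success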

Keep in mind that Cerberus is a format-agnostic data validation tool: it validates Python data structures, so it works just as well with data loaded from formats other than YAML, such as JSON, XML and so on.
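
To see what the `trend` rule accepts, the pattern can be tried on its own with Python's `re` module. This is only a sketch of the rule's effect, not Cerberus itself, and the helper name `trend_is_allowed` is made up for illustration. The inline flag `(?i)` is written first because recent Python versions reject an inline flag that is not at the start of the pattern.

```python
import re

# Pattern from the `trend` rule, with the inline flag at the start.
TREND_PATTERN = re.compile(r"(?i)^(down|equal|up)$")


def trend_is_allowed(value):
    """Return True when value is None (the rule is nullable) or matches the pattern."""
    if value is None:
        return True
    return TREND_PATTERN.match(value) is not None


# Try a few candidate values, including the one from my_yaml.yaml.
for candidate in ("up", "Down", "EQUAL", "stable", None):
    print(repr(candidate), trend_is_allowed(candidate))
```

Note that `stable`, the value used in my_yaml.yaml, is not one of the accepted words, so the sample document would be flagged on `trend` until either the value or the pattern is changed.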


不要未来只要你来

2020-12-24 05:05
You can use Python's yaml library (PyYAML) to report the message, line, column and file name of a syntax error in the file you load.
```
#!/usr/bin/env python

import yaml

with open("example.yaml", 'r') as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)
```
The error message can be accessed via exc.problem.

Access exc.problem_mark to get a <yaml.error.Mark> object.

This object exposes the attributes
- name
- column (zero-based)
- line (zero-based)

Hence you can create your own pointer to the issue:
```
pm = exc.problem_mark
# line and column are zero-based, so add 1 for a human-readable position
print("Your file {} has an issue on line {} at position {}".format(pm.name, pm.line + 1, pm.column + 1))
```
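
The pieces above can be combined into one small helper. This is a minimal sketch that relies on duck typing rather than importing PyYAML: it assumes the exception may or may not carry `problem` and `problem_mark`, and falls back to the plain message when it does not. The name `describe_yaml_error` and the stand-in error object are hypothetical, used only so the sketch runs without a broken file.

```python
from types import SimpleNamespace


def describe_yaml_error(exc, fallback_name="<unknown>"):
    """Build a one-line, human-readable location for a YAML parse error."""
    mark = getattr(exc, "problem_mark", None)
    if mark is None:
        # No position information available: fall back to the plain message.
        return "{}: {}".format(fallback_name, exc)
    problem = getattr(exc, "problem", None) or "invalid YAML"
    # Mark.line and Mark.column are zero-based, so add 1 for display.
    return "{}: line {}, column {}: {}".format(
        mark.name or fallback_name, mark.line + 1, mark.column + 1, problem
    )


# Stand-in object shaped like a PyYAML marked error (hypothetical values).
fake_error = SimpleNamespace(
    problem="mapping values are not allowed here",
    problem_mark=SimpleNamespace(name="example.yaml", line=2, column=4),
)
print(describe_yaml_error(fake_error))
# → example.yaml: line 3, column 5: mapping values are not allowed here
```

Inside an `except yaml.YAMLError as exc:` branch it would be called as `print(describe_yaml_error(exc, "example.yaml"))`.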

死守一世寂寞

2020-12-24 05:06

Given that JSON and YAML are pretty similar beasts, you could make use of JSON-Schema to validate a sizable subset of YAML. Here's a code snippet (you'll need PyYAML and jsonschema installed):

from jsonschema import validate
import yaml

schema = """
type: object
properties:
  testing:
    type: array
    items:
      enum:
        - this
        - is
        - a
        - test
"""

good_instance = """
testing: ['this', 'is', 'a', 'test']
"""

validate(yaml.safe_load(good_instance), yaml.safe_load(schema)) # passes

# Now let's try a bad instance...

bad_instance = """
testing: ['this', 'is', 'a', 'bad', 'test']
"""

validate(yaml.safe_load(bad_instance), yaml.safe_load(schema))

# Fails with:
# ValidationError: 'bad' is not one of ['this', 'is', 'a', 'test']
#
# Failed validating 'enum' in schema['properties']['testing']['items']:
#     {'enum': ['this', 'is', 'a', 'test']}
#
# On instance['testing'][3]:
#     'bad'

One problem with this is that if your schema spans multiple files and you use "$ref" to reference the other files then those other files will need to be JSON, I think. But there are probably ways around that. In my own project, I'm playing with specifying the schema using JSON files whilst the instances are YAML.
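
The "sizable subset" caveat matters because a YAML loader can return values that JSON has no equivalent for, such as non-string mapping keys or dates, and a JSON Schema validator may trip over them. The sketch below is plain Python, not part of jsonschema or PyYAML; the function name `non_json_paths` and the sample data are assumptions for illustration. It walks an already-loaded document and lists the places that fall outside JSON.

```python
import datetime


def non_json_paths(node, path="$"):
    """Yield the path of every key or value that has no JSON equivalent."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not isinstance(key, str):
                yield "{} (key {!r} is not a string)".format(path, key)
            yield from non_json_paths(value, "{}.{}".format(path, key))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from non_json_paths(item, "{}[{}]".format(path, index))
    elif node is not None and not isinstance(node, (str, int, float, bool)):
        yield "{} ({})".format(path, type(node).__name__)


# Roughly what a YAML loader returns for a document with an integer key and a date.
loaded = {1: "one", "when": datetime.date(2017, 10, 1), "tags": ["a", "b"]}
problems = list(non_json_paths(loaded))
print(problems)
# → ['$ (key 1 is not a string)', '$.when (date)']
```

Running a check like this before `validate` makes it clear whether a failure comes from the schema or from the document stepping outside what JSON can express.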


轮回少年

2020-12-24 05:10

I'm not aware of a Python solution, but there is a Ruby schema validator for YAML called kwalify. If you don't come across a Python library, you should be able to call it using subprocess.
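
Calling an external validator through `subprocess` could look roughly like the sketch below. The wrapper name `run_external_validator` is made up, the kwalify command line in the comment is an assumption rather than a checked invocation, and so is the idea that the tool reports failure through its exit status; it may only do so in its output, so check the tool's own help first. The demonstration uses the Python interpreter as a stand-in so that it runs without Ruby installed.

```python
import subprocess
import sys


def run_external_validator(command):
    """Run a validator command and return (ok, combined output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Assumption: a zero exit status means the document is valid.
    return completed.returncode == 0, completed.stdout + completed.stderr


# Hypothetical invocation; verify the real flags before relying on it:
#   run_external_validator(["kwalify", "-f", "schema.yaml", "document.yaml"])

# Demonstration with the Python interpreter standing in for the external tool.
ok, output = run_external_validator(
    [sys.executable, "-c", "import sys; print('1 error'); sys.exit(1)"]
)
print(ok, output.strip())
```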

天涯浪人

2020-12-24 05:11
These look good. The YAML parser can handle the syntax errors, and one of these libraries can validate the data structures.
- http://pypi.python.org/pypi/voluptuous/ (I've tried this one, it is decent, if a bit sparse.)
- http://discorporate.us/projects/flatland/ (not clear how to validate files at first glance)

佛祖请我去吃肉

2020-12-24 05:12

You can load the YAML document as a dict and use the schema library to check it:

import datetime

import yaml
from schema import Schema, And, Optional, SchemaError

schema = Schema(
    {
        'created': And(datetime.datetime),
        'author': And(str),
        'email': And(str),
        'description': And(str),
        Optional('tags'): And(str, lambda s: len(s) >= 0),
        'setup': And(list),
        # every step must contain "=>" between the action and the expectation
        'steps': And(list, lambda steps: all('=>' in s for s in steps),
                     error='Steps should be array of string '
                           'and contain "=>" to separate '
                           'actions and expectations'),
        'teardown': And(list)
    }
)

filepath = 'example.yaml'  # placeholder: path of the document to check

with open(filepath) as f:
    data = yaml.safe_load(f)

try:
    schema.validate(data)
except SchemaError as e:
    print(e)
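
The lambda in the `steps` rule only says whether all steps pass, not which one is at fault. A plain-Python helper can report the offending entries. This is a sketch that is not part of the schema library; the name `find_bad_steps` and the sample steps are assumptions for illustration.

```python
def find_bad_steps(steps, separator="=>"):
    """Return (index, step) pairs for steps that are not 'action => expectation' strings."""
    return [
        (index, step)
        for index, step in enumerate(steps)
        if not isinstance(step, str) or separator not in step
    ]


# One valid step, one without the separator, one that is not a string at all.
steps = ["open page => page loads", "click login", 42]
print(find_bad_steps(steps))
# → [(1, 'click login'), (2, 42)]
```

Printing this list next to the `SchemaError` message points straight at the steps that need fixing.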
