How to set up local file references in python-jsonschema document?

失恋的感觉 2021-01-02 08:28

I have a set of jsonschema compliant documents. Some documents contain references to other documents (via the $ref attribute). I do not wish to host these docum

5 Answers
  • 2021-01-02 09:04

    Following up on the answer @chris-w provided: I wanted to do the same thing with jsonschema 3.2.0, but his answer didn't quite cover it. I hope this answer helps those who still come to this question but use a more recent version of the package.

    To extend a JSON schema using the library, do the following:

    1. Create the base schema:
    base.schema.json
    {
      "$id": "base.schema.json",
      "type": "object",
      "properties": {
        "prop": {
          "type": "string"
        }
      },
      "required": ["prop"]
    }
    
    2. Create the extension schema:
    extend.schema.json
    {
      "allOf": [
        {"$ref": "base.schema.json"},
        {
          "properties": {
            "extra": {
              "type": "boolean"
            }
          },
          "required": ["extra"]
        }
      ]
    }
    
    3. Create the JSON file you want to test against the schema:
    data.json
    {
      "prop": "This is the property",
      "extra": true
    }
    
    4. Create your RefResolver and Validator for the base schema and use them to check the data:
    # Set up schema, resolver, and validator on the base schema
    baseSchema = json.loads(baseSchemaJSON) # Create a schema dictionary from the base JSON file
    relativeSchema = json.loads(relativeJSON) # Create a schema dictionary from the relative JSON file
    resolver = RefResolver.from_schema(baseSchema) # Creates your resolver, uses the "$id" element
    validator = Draft7Validator(relativeSchema, resolver=resolver) # Create a validator against the extended schema (but resolving to the base schema!)
    
    # Check validation!
    data = json.loads(dataJSON) # Create a dictionary from the data JSON file
    validator.validate(data)
    

    You may need to make a few adjustments to the above, such as using a validator class other than Draft7Validator. This should work for single-level references (children extending a base); beyond that, you will need to be careful with your schemas and with how you set up the RefResolver and Validator objects.

    P.S. Here is a snippet that exercises the above. Try modifying the data string to remove one of the required attributes:

    import json
    
    from jsonschema import RefResolver, Draft7Validator
    
    base = """
    {
      "$id": "base.schema.json",
      "type": "object",
      "properties": {
        "prop": {
          "type": "string"
        }
      },
      "required": ["prop"]
    }
    """
    
    extend = """
    {
      "allOf": [
        {"$ref": "base.schema.json"},
        {
          "properties": {
            "extra": {
              "type": "boolean"
            }
          },
          "required": ["extra"]
        }
      ]
    }
    """
    
    data = """
    {
    "prop": "This is the property string",
    "extra": true
    }
    """
    
    schema = json.loads(base)
    extendedSchema = json.loads(extend)
    resolver = RefResolver.from_schema(schema)
    validator = Draft7Validator(extendedSchema, resolver=resolver)
    
    jsonData = json.loads(data)
    validator.validate(jsonData)
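
    When experimenting with missing or wrong attributes as suggested above, it can help to list every failure rather than stop at the first exception. The following is a minimal sketch, assuming the same two schemas as above and a jsonschema release that still ships RefResolver (newer releases mark it as deprecated); `bad_data` is a made-up input for illustration.

```python
import json

from jsonschema import Draft7Validator, RefResolver

base_schema = json.loads("""
{
  "$id": "base.schema.json",
  "type": "object",
  "properties": {"prop": {"type": "string"}},
  "required": ["prop"]
}
""")

extended_schema = json.loads("""
{
  "allOf": [
    {"$ref": "base.schema.json"},
    {
      "properties": {"extra": {"type": "boolean"}},
      "required": ["extra"]
    }
  ]
}
""")

# Same wiring as in the snippet above: the resolver is built from the base
# schema, so the reference "base.schema.json" resolves to it.
resolver = RefResolver.from_schema(base_schema)
validator = Draft7Validator(extended_schema, resolver=resolver)

# Hypothetical bad input: "prop" is missing and "extra" has the wrong type.
bad_data = {"extra": "yes"}

# iter_errors() yields every ValidationError instead of raising on the first.
messages = sorted(error.message for error in validator.iter_errors(bad_data))
for message in messages:
    print(message)
```

    One message comes from the referenced base schema and one from the extension, which is a quick way to confirm that the $ref was actually followed.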
    
  • 2021-01-02 09:18

    You must build a custom jsonschema.RefResolver for each schema which uses a relative reference and ensure that your resolver knows where on the filesystem the given schema lives.

    For example:

    import os
    import json
    from jsonschema import Draft4Validator, RefResolver # We prefer Draft7, but jsonschema 3.0 is still in alpha as of this writing 
    
    
    abs_path_to_schema = '/path/to/schema-doc-foobar.json'
    with open(abs_path_to_schema, 'r') as fp:
      schema = json.load(fp)
    
    resolver = RefResolver(
      # The key part is here where we build a custom RefResolver
      # and tell it where *this* schema lives in the filesystem.
      # RefResolver's first two parameters are named base_uri and referrer.
      # Note that `file:` is for unix systems
      base_uri='file://{}'.format(abs_path_to_schema),
      referrer=schema
    )
    Draft4Validator.check_schema(schema) # Unnecessary but a good idea
    validator = Draft4Validator(schema, resolver=resolver, format_checker=None)
    
    # Then you can...
    data_to_validate = {...}  # placeholder: put the instance to validate here
    validator.validate(data_to_validate)
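
    The comment about `file:` being for unix systems suggests building the base URI in a platform-independent way. This is a minimal sketch using only the standard library; `schema_base_uri` and `other.json` are hypothetical names, and the path is the placeholder from the code above.

```python
from pathlib import Path
from urllib.parse import urljoin


def schema_base_uri(schema_path):
    """Return a file:// URI for a schema file, usable as a resolver base URI."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(schema_path).resolve().as_uri()


# Hypothetical location, reusing the placeholder path from the answer above.
base_uri = schema_base_uri('/path/to/schema-doc-foobar.json')

# A relative "$ref" such as "other.json" is joined against the base URI, so it
# points at a sibling file in the same directory.
sibling_uri = urljoin(base_uri, 'other.json')
print(base_uri)
print(sibling_uri)
```

    The resulting string could then be passed as the base URI of the RefResolver in place of the hand-formatted `file:` string.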
    
  • 2021-01-02 09:18

    EDIT

    Fixed a wrong reference ($ref) to base schema. Updated the example to use the one from the docs: https://json-schema.org/understanding-json-schema/structuring.html

    This is just another version of @Daniel's answer, which was the one that worked for me. Basically, I decided to define the $schema in a base schema, which frees the other schemas from declaring it and makes for a clear call when instantiating the resolver.

    • RefResolver.from_schema() takes (1) a schema and (2) a schema store, and it was not clear to me whether the order of the entries, or which schema is passed as the first argument, matters. Hence the structure you see below.

    I have the following:

    base.schema.json:

    {
      "$schema": "http://json-schema.org/draft-07/schema#"
    }
    

    definitions.schema.json:

    {
      "type": "object",
      "properties": {
        "street_address": { "type": "string" },
        "city":           { "type": "string" },
        "state":          { "type": "string" }
      },
      "required": ["street_address", "city", "state"]
    }
    

    address.schema.json:

    {
      "type": "object",
    
      "properties": {
        "billing_address": { "$ref": "definitions.schema.json#" },
        "shipping_address": { "$ref": "definitions.schema.json#" }
      }
    }
    

    I like this setup for two reasons:

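    1. It makes for a cleaner call on RefResolver.from_schema():

      import json
      from jsonschema import RefResolver
      from jsonschema.validators import validator_for
      
      # Load each schema file, closing the file handle afterwards
      with open('base.schema.json') as f:
          base = json.load(f)
      with open('definitions.schema.json') as f:
          definitions = json.load(f)
      with open('address.schema.json') as f:
          schema = json.load(f)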
      
      schema_store = {
        base.get('$id','base.schema.json') : base,
        definitions.get('$id','definitions.schema.json') : definitions,
        schema.get('$id','address.schema.json') : schema,
      }
      
      resolver = RefResolver.from_schema(base, store=schema_store)
      
    2. Then I use the handy validator_for helper the library provides, which picks the best validator class for your schema (according to its $schema key):

      Validator = validator_for(base)
      
    3. And then just put them together to instantiate the validator:

      validator = Validator(schema, resolver=resolver)
      

    Finally, you validate your data:

    data = {
      "shipping_address": {
        "street_address": "1600 Pennsylvania Avenue NW",
        "city": "Washington",
        "state": "DC"   
      },
      "billing_address": {
        "street_address": "1st Street SE",
        "city": "Washington",
        "state": 32
      }
    }
    
    • This one will raise a ValidationError, since "state" is 32:
    >>> validator.validate(data)
    
    ValidationError: 32 is not of type 'string'
    
    Failed validating 'type' in schema['properties']['billing_address']['properties']['state']:
        {'type': 'string'}
    
    On instance['billing_address']['state']:
        32
    

    Change that to "DC" and the data will validate.
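
    To make the validator_for step concrete: the class is chosen from the "$schema" key of whatever schema is passed in, which is why the base schema (the only one carrying "$schema" here) is used for the lookup. A minimal sketch, assuming jsonschema is installed; the `address` dict is a shortened stand-in for address.schema.json.

```python
from jsonschema import Draft4Validator, Draft7Validator
from jsonschema.validators import validator_for

base = {"$schema": "http://json-schema.org/draft-07/schema#"}
address = {"type": "object"}  # no "$schema" key, like address.schema.json

# The "$schema" URI selects the validator class.
chosen_for_base = validator_for(base)

# Without "$schema", validator_for() falls back to its `default` argument.
chosen_for_address = validator_for(address, default=Draft4Validator)

print(chosen_for_base.__name__)
print(chosen_for_address.__name__)
```

    Calling validator_for on a schema without "$schema" and without an explicit `default` returns the library's default validator class, which may not be the draft you intended.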

  • 2021-01-02 09:24

    I had the hardest time figuring out how to resolve against a set of schemas that $ref each other (I am new to JSON Schema). It turns out the key is to create the RefResolver with a store, which is a dict that maps from URL to schema. Building on @devin-p's answer:

    import json
    
    from jsonschema import RefResolver, Draft7Validator
    
    base = """
    {
      "$id": "base.schema.json",
      "type": "object",
      "properties": {
        "prop": {
          "type": "string"
        }
      },
      "required": ["prop"]
    }
    """
    
    extend = """
    {  
      "$id": "extend.schema.json",
      "allOf": [
        {"$ref": "base.schema.json#"},
        {
          "properties": {
            "extra": {
              "type": "boolean"
            }
          },
        "required": ["extra"]
        }
      ]
    }
    """
    
    extend_extend = """
    {
      "$id": "extend_extend.schema.json",
      "allOf": [
        {"$ref": "extend.schema.json#"},
        {
          "properties": {
            "extra2": {
              "type": "boolean"
            }
          },
        "required": ["extra2"]
        }
      ]
    }
    """
    
    data = """
    {
    "prop": "This is the property string",
    "extra": true,
    "extra2": false
    }
    """
    
    schema = json.loads(base)
    extendedSchema = json.loads(extend)
    extendedExtendSchema = json.loads(extend_extend)
    schema_store = {
        schema['$id'] : schema,
        extendedSchema['$id'] : extendedSchema,
        extendedExtendSchema['$id'] : extendedExtendSchema,
    }
    
    
    resolver = RefResolver.from_schema(schema, store=schema_store)
    validator = Draft7Validator(extendedExtendSchema, resolver=resolver)
    
    jsonData = json.loads(data)
    validator.validate(jsonData)
    

    The above was built with jsonschema==3.2.0.
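
    Because everything hinges on the store keys matching the $ref targets, a small check can catch typos before validation. This is a rough sketch using only the standard library, not part of jsonschema; `iter_refs`, `missing_refs`, and the `typo.schema.json` reference are hypothetical names for illustration, and it assumes the store is keyed by each schema's "$id" as above.

```python
from urllib.parse import urldefrag, urljoin


def iter_refs(node):
    """Yield every "$ref" string found anywhere inside a schema."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            yield ref
        for value in node.values():
            yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def missing_refs(schema_store):
    """Return sorted (schema key, ref) pairs whose target is not in the store."""
    missing = []
    for key, schema in schema_store.items():
        for ref in iter_refs(schema):
            # Resolve relative to the schema's own key and drop the
            # "#fragment" part, leaving the URL used as the store key.
            target, _fragment = urldefrag(urljoin(key, ref))
            if target not in schema_store:
                missing.append((key, ref))
    return sorted(missing)


store = {
    "base.schema.json": {"$id": "base.schema.json", "type": "object"},
    "extend.schema.json": {
        "$id": "extend.schema.json",
        "allOf": [{"$ref": "base.schema.json#"}, {"$ref": "typo.schema.json#"}],
    },
}
print(missing_refs(store))
```

    An empty list means every reference points at a schema that is present in the store.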

  • 2021-01-02 09:27

    My approach is to preload all schema fragments into the RefResolver cache. I created a gist that illustrates this: https://gist.github.com/mrtj/d59812a981da17fbaa67b7de98ac3d4b
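
    The gist is not reproduced here. As a rough sketch of the same preloading idea, assuming all schemas are JSON objects in a single directory, the store can be built with the standard library alone; `load_schema_store` and the demo file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_schema_store(schema_dir, pattern="*.json"):
    """Map each schema's "$id" (or file name, if it has none) to its content."""
    store = {}
    for path in sorted(Path(schema_dir).glob(pattern)):
        with path.open(encoding="utf-8") as fp:
            schema = json.load(fp)
        # Assumes object schemas; boolean schemas have no "$id" to read.
        store[schema.get("$id", path.name)] = schema
    return store


# Demo with two throwaway files: one declares "$id", the other does not.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "base.schema.json").write_text(
        '{"$id": "base.schema.json", "type": "object"}', encoding="utf-8"
    )
    Path(tmp, "noid.json").write_text('{"type": "string"}', encoding="utf-8")
    demo_store = load_schema_store(tmp)

print(sorted(demo_store))
```

    The resulting dict has the same shape as the schema store in the earlier answers, so it could be passed as the `store` argument when creating the RefResolver.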
