Groovy: validate JSON string

天命终不由人 2021-02-20 05:36

I need to check that a string is valid JSON in Groovy. My first thought was just to send it through new JsonSlurper().parseText(myString) and, if there was no exce

3 answers
  •  南方客
    南方客 (original poster)
    2021-02-20 06:22

    JsonSlurper delegates to implementations of the JsonParser interface (JsonParserCharArray is the default one). These parsers walk the input character by character and decide what kind of token the current character represents. If you look at the JsonParserCharArray.decodeJsonObject() method at line 139, you will see that when the parser reaches the } character it breaks out of the loop, finishes decoding the JSON object, and ignores anything that follows the }.

    That's why JsonSlurper throws an exception if you put any unrecognizable characters in front of your JSON object. But if your JSON string ends with invalid characters after the }, it passes, because the parser never looks at those characters.
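
    The asymmetry described above can be checked with a small script. This is only a sketch, assuming a Groovy version whose default JsonSlurper parser behaves as described here; the parses closure and the sample strings are made up for illustration, and the printed values may differ on other Groovy versions.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: true if JsonSlurper parses the text without throwing.
def parses = { String text ->
    try {
        new JsonSlurper().parseText(text)
        return true
    } catch (Exception ignored) {
        return false
    }
}

def valid = '{"name": "John"}'

// Compare garbage before the object with garbage after the closing brace.
println "leading garbage parses:  ${parses('...' + valid)}"
println "trailing garbage parses: ${parses(valid + '...')}"
println "clean document parses:   ${parses(valid)}"
```

    If the parser works as explained, only the first line should report false.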

    Solution

    Consider using the JsonOutput.prettyPrint(String json) method, which is stricter about the JSON it tries to print (it uses JsonLexer to read JSON tokens in a streaming fashion). If you do:

    def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}...'
    
    JsonOutput.prettyPrint(jsonString)
    

    it will throw an exception like:

    Exception in thread "main" groovy.json.JsonException: Lexing failed on line: 1, column: 48, while reading '.', no possible valid JSON value or punctuation could be recognized.
        at groovy.json.JsonLexer.nextToken(JsonLexer.java:83)
        at groovy.json.JsonLexer.hasNext(JsonLexer.java:233)
        at groovy.json.JsonOutput.prettyPrint(JsonOutput.java:501)
        at groovy.json.JsonOutput$prettyPrint.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at app.JsonTest.main(JsonTest.groovy:13)
    

    But if we pass a valid JSON document like:

    def jsonString = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'
    
    JsonOutput.prettyPrint(jsonString)
    

    it will pass successfully.

    The good thing is that you don't need any additional dependency to validate your JSON.
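
    Wrapped in a helper, the check could look like the sketch below. The method name lexesAsJson is hypothetical, and the code assumes that lexing failures surface as groovy.json.JsonException, as in the stack trace above.

```groovy
import groovy.json.JsonException
import groovy.json.JsonOutput

// Hypothetical helper: true if JsonOutput.prettyPrint can lex the whole input.
boolean lexesAsJson(String text) {
    try {
        JsonOutput.prettyPrint(text)
        return true
    } catch (JsonException ignored) {
        // Thrown when the lexer meets a character that cannot start a JSON token.
        return false
    }
}

println lexesAsJson('{"name": "John", "data": [{"id": 1},{"id": 2}]}')
println lexesAsJson('{"name": "John", "data": [{"id": 1},{"id": 2}]}...')
```

    Keep in mind that this only verifies that the input is a sequence of recognizable JSON tokens, not that the tokens form a well-structured document.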

    UPDATE: solution for multiple different cases

    I did some more investigation and ran tests with 3 different solutions:

    • JsonOutput.prettyPrint(String json)
    • JsonSlurper.parseText(String json)
    • ObjectMapper.readValue(String json, Class<T> type) (it requires adding the jackson-databind:2.9.3 dependency)

    I used the following JSON documents as input:

    def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
    def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
    def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
    def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
    def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'
    

    The expected result is that the first 4 documents fail validation and only the 5th one passes. To test this, I created the following Groovy script:

    @Grab(group='com.fasterxml.jackson.core', module='jackson-databind', version='2.9.3')
    
    import groovy.json.JsonOutput
    import groovy.json.JsonSlurper
    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.databind.DeserializationFeature
    
    def json1 = '{"name": "John", "data": [{"id": 1},{"id": 2},]}'
    def json2 = '{"name": "John", "data": [{"id": 1},{"id": 2}],}'
    def json3 = '{"name": "John", "data": [{"id": 1},{"id": 2}]},'
    def json4 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}... abc'
    def json5 = '{"name": "John", "data": [{"id": 1},{"id": 2}]}'
    
    def test1 = { String json ->
        try {
            JsonOutput.prettyPrint(json)
            return "VALID"
        } catch (ignored) {
            return "INVALID"
        }
    }
    
    def test2 = { String json ->
        try {
            new JsonSlurper().parseText(json)
            return "VALID"
        } catch (ignored) {
            return "INVALID"
        }
    }
    
    ObjectMapper mapper = new ObjectMapper()
    mapper.configure(DeserializationFeature.FAIL_ON_TRAILING_TOKENS, true)
    
    def test3 = { String json ->
        try {
            mapper.readValue(json, Map)
            return "VALID"
        } catch (ignored) {
            return "INVALID"
        }
    }
    
    def jsons = [json1, json2, json3, json4, json5]
    def tests = ['JsonOutput': test1, 'JsonSlurper': test2, 'ObjectMapper': test3]
    
    def result = tests.collectEntries { name, test ->
        [(name): jsons.collect { json ->
            [json: json, status: test(json)]
        }]
    }
    
    result.each {
        println "${it.key}:"
        it.value.each {
            println " ${it.status}: ${it.json}"
        }
        println ""
    }
    

    And here is the result:

    JsonOutput:
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
    
    JsonSlurper:
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
    
    ObjectMapper:
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
    

    As you can see, the winner is Jackson's ObjectMapper.readValue() method. Importantly, it requires jackson-databind >= 2.9.0. That version introduced DeserializationFeature.FAIL_ON_TRAILING_TOKENS, which makes the JSON parser behave as expected. If we don't set this feature to true as in the script above, ObjectMapper produces an incorrect result:

    ObjectMapper:
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2},]}
     INVALID: {"name": "John", "data": [{"id": 1},{"id": 2}],}
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]},
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}... abc
     VALID: {"name": "John", "data": [{"id": 1},{"id": 2}]}
    

    I was surprised that Groovy's standard library fails this test. Luckily, it can be done with the jackson-databind:2.9.x dependency. Hope it helps.
