Python can't parse JSON with extra trailing comma

梦如初夏 2020-12-21 03:59

This code:

import json
s = '{ "key1": "value1", "key2": "value2", }'
json.loads(s)

produces this error in Python 2:

6 Answers
  • 2020-12-21 04:12

    Another option is to parse it as YAML; YAML accepts valid JSON but also accepts all sorts of variations.

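    import yaml  # third-party package: PyYAML

    s = '{ "key1": "value1", "key2": "value2", }'
    # safe_load builds only plain Python types; bare yaml.load(s) needs a Loader argument in PyYAML 6+
    yaml.safe_load(s)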
    
  • 2020-12-21 04:13

    That's because a trailing , is invalid according to the JSON standard:

    An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma).

    If you really need this, you could wrap Python's json parser with jsoncomment. But I would try to fix the JSON at its origin.
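
To see the rule quoted above in action, here is a minimal sketch using only the standard library. The helper name `is_strict_json` is made up for illustration. It relies on `json.loads` raising a `ValueError` (in Python 3, the `json.JSONDecodeError` subclass) on input it cannot parse; the exact error message varies between Python versions, so it is not shown.

```python
import json

s = '{ "key1": "value1", "key2": "value2", }'


def is_strict_json(text):
    """Return True if the standard-library parser accepts the text.

    Hypothetical helper, not part of the json module.
    """
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError subclasses ValueError in Python 3
        return False
    return True


print(is_strict_json(s))                       # trailing comma before }
print(is_strict_json('{ "key1": "value1" }'))  # same shape, no trailing comma
```

The first call reports the string as invalid and the second as valid, which matches the grammar: commas separate pairs, they do not terminate them.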

  • 2020-12-21 04:15

    The JSON specification doesn't allow a trailing comma. The parser throws because it encounters an invalid syntax token.

    You might be interested in using a different parser for those files, e.g. a parser built for the JSON5 spec, which allows such syntax.

  • 2020-12-21 04:18

    How about using the following regex?

    import re
    s = re.sub(r",\s*}", "}", s)
    
  • 2020-12-21 04:25

    It could be that this data stream is JSON5, in which case there's a parser for that: https://pypi.org/project/json5/

    The situation can be worked around with a regex substitution that looks for ", } and replaces it with " }, allowing any amount of whitespace between the quote, the comma, and the closing brace.

    >>> import re
    >>> s = '{ "key1": "value1", "key2": "value2", }'
    >>> re.sub(r"\"\s*,\s*\}", "\" }", s)
    '{ "key1": "value1", "key2": "value2" }'
    

    Giving:

    >>> import json
    >>> s2 = re.sub(r"\"\s*,\s*\}", "\" }", s)
    >>> json.loads(s2)
    {'key1': 'value1', 'key2': 'value2'}
    

    EDIT: as commented, this is not good practice unless you are confident that your JSON data contains only simple words and that this change does not corrupt the data stream further. As I commented on the OP, the best course of action is to repair the upstream data source, but sometimes that's not possible.
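
The risk named in the edit is that a regex cannot tell a comma inside a string value from a structural one. One way around that, sketched here under the assumption that the input is otherwise valid JSON, is a small scanner that tracks whether it is inside a double-quoted string and only drops commas outside of strings. The function name `strip_trailing_commas` is hypothetical, and this is an illustration rather than a hardened parser.

```python
import json


def strip_trailing_commas(text):
    """Drop commas followed only by whitespace and then } or ],
    leaving double-quoted string literals untouched."""
    out = []
    in_string = False
    escaped = False
    n = len(text)
    for i, ch in enumerate(text):
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this character was escaped by a backslash
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
            out.append(ch)
        elif ch == ",":
            # Look ahead past whitespace to see what follows the comma.
            j = i + 1
            while j < n and text[j] in " \t\r\n":
                j += 1
            if not (j < n and text[j] in "}]"):
                out.append(ch)       # ordinary separator: keep it
        else:
            out.append(ch)
    return "".join(out)


s = '{ "key1": "value1, }", "key2": [1, 2, ], }'
print(json.loads(strip_trailing_commas(s)))
```

Here the `, }` inside the value of `key1` survives, while the trailing commas before `]` and the final `}` are removed.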

  • 2020-12-21 04:27

    I suspect it doesn't parse because "it's not JSON", but you could pre-process the strings, using a regular expression to replace , } with } and , ] with ].
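
The pre-processing described above could look roughly like this. It is a sketch, and the names `TRAILING_COMMA` and `drop_trailing_commas_regex` are made up for illustration. Note the assumption it depends on: no string value in the data contains a comma followed by `}` or `]`, because the regex would rewrite that too.

```python
import json
import re

# A comma, optional whitespace, then a closing brace or bracket.
# The closing character is captured so it can be put back.
TRAILING_COMMA = re.compile(r",\s*([}\]])")


def drop_trailing_commas_regex(text):
    """Remove a comma that directly precedes } or ] (whitespace allowed)."""
    return TRAILING_COMMA.sub(r"\1", text)


s = '{ "key1": "value1", "key2": [1, 2, ], }'
cleaned = drop_trailing_commas_regex(s)
print(cleaned)
print(json.loads(cleaned))

# The caveat: text inside a string value is rewritten as well.
print(drop_trailing_commas_regex('{"note": "a, ]"}'))
```

The last call changes the value of `note`, so this approach is only safe when you control or know the shape of the data.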
