json.loads() ValueError: Extra data in Python

痞子三分冷 submitted on 2020-01-01 19:41:16

Question


I'm trying to read individual values from a JSON feed. Here is an example of the feed data:

{
    "sendtoken": "token1",
    "bytes_transferred": 0,
    "num_retries": 0,
    "timestamp": 1414395374,
    "queue_time": 975,
    "message": "internalerror",
    "id": "mailerX",
    "m0": {
        "binding_group": "domain.com",
        "recipient_domain": "hotmail.com",
        "recipient_local": "destination",
        "sender_domain": "domain.com",
        "binding": "mail.domain.com",
        "message_id": "C1/34-54876-D36FA645",
        "api_credential": "creds",
        "sender_local": "localstring"
    },
    "rejecting_ip": "145.5.5.5",
    "type": "alpha",
    "message_stage": 3
}
{
    "sendtoken": "token2",
    "bytes_transferred": 0,
    "num_retries": 0,
    "timestamp": 1414397568,
    "queue_time": 538,
    "message": "internal error,
    "id": "mailerX",
    "m0": {
        "binding_group": "domain.com",
        "recipient_domain": "hotmail.com",
        "recipient_local": "destination",
        "sender_domain": "domain.com",
        "binding": "mail.domain.com",
        "message_id": "C1/34-54876-D36FA645",
        "api_credential": "creds",
        "sender_local": "localstring"
    },
    "rejecting_ip": "145.5.5.5",
    "type": "alpha",
    "message_stage": 3
}

I can't share the actual URL, but the above are the first 2 of roughly 150 results that are displayed if I run

print results

before the

json.loads()

line.

My code:

import urllib2
import json

results = urllib2.urlopen(url).read()
jsondata = json.loads(results)

for row in jsondata:
     print row['sendtoken']
     print row['recipient_domain']

I'd like output like

token1
hotmail.com

for each entry.

I'm getting this error:

ValueError: Extra data: line 2 column 1 - line 133 column 1 (char 583 - 77680)

I'm far from a Python expert, and this is my first time working with JSON. I've spent quite a bit of time searching Google and Stack Overflow, but I can't find a solution that works for my specific data format.


Answer 1:


The problem is that your data don't form a single JSON value, so you can't decode them with one call to json.loads.


First, this appears to be a sequence of JSON objects separated by whitespace. Since you haven't told us anything about where the data come from, this is really just an educated guess; hopefully whatever documentation or coworker told you about this URL also told you what the format actually is. But let's assume that my educated guess is correct.

The easiest way to parse a stream of JSON objects in Python is to use the raw_decode method. Something like this:*

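import json

def parse_json_stream(stream):
    # Yield each JSON value from a string of whitespace-separated values.
    decoder = json.JSONDecoder()
    # raw_decode can't handle leading whitespace, so strip it up front too.
    stream = stream.lstrip()
    while stream:
        # obj is the decoded value; idx is where it ended within stream.
        obj, idx = decoder.raw_decode(stream)
        yield obj
        stream = stream[idx:].lstrip()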

However, there's also an error in the second JSON object in the stream. Look at this part:

…
"message": "internal error,
"id": "mailerX",
…

There's a missing " after "internal error. If you fix that, then the function above will iterate two JSON objects.
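As a rough usage sketch, assuming the missing quote has been fixed and the feed really is a run of whitespace-separated objects, a generator like the one above can print the two fields the question asks for. The sample string is a trimmed, made-up stand-in for the real feed, and the function is repeated only so the snippet runs on its own. Note that recipient_domain sits inside the nested m0 object, so it has to be read as row["m0"]["recipient_domain"] rather than row["recipient_domain"]. The snippet is written for Python 3.

```python
import json

def parse_json_stream(stream):
    """Yield each JSON value from a string of whitespace-separated values."""
    decoder = json.JSONDecoder()
    stream = stream.lstrip()
    while stream:
        obj, idx = decoder.raw_decode(stream)
        yield obj
        stream = stream[idx:].lstrip()

# Trimmed, hypothetical sample in the same shape as the feed in the question.
sample = '''
{"sendtoken": "token1", "m0": {"recipient_domain": "hotmail.com"}}
{"sendtoken": "token2", "m0": {"recipient_domain": "hotmail.com"}}
'''

rows = list(parse_json_stream(sample))
for row in rows:
    print(row["sendtoken"])
    # recipient_domain is nested under "m0", not at the top level.
    print(row["m0"]["recipient_domain"])
```

With the real feed, the string returned by urlopen(url).read() would take the place of sample.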

Hopefully that error was caused by you trying to manually "copy and paste" data by rewriting it. If it's in your original source data, you've got a much bigger problem; you probably need to write a "broken JSON" parser from scratch that can heuristically guess at what the data were intended to be. Or, of course, get whoever's generating the source to generate it properly.


* In general, it's more efficient to use the second argument to raw_decode to pass a start index, instead of slicing off a copy of the remainder each time. But raw_decode can't handle leading whitespace. It's a little easier to just slice and strip than to write code that skips over whitespace from the given index, but if the memory and performance costs of those copies matter, you should write the more complicated code.
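The index-based variant this footnote describes might look roughly like the sketch below. The name iter_json_values and the whitespace regex are my own choices for illustration and are not part of the json module; the only library call relied on is JSONDecoder.raw_decode(s, idx), which returns the decoded value and the index where it ended.

```python
import json
import re

# JSON whitespace is space, tab, newline and carriage return.
# The pattern may match an empty run, so match() never returns None.
_WHITESPACE = re.compile(r"[ \t\n\r]*")

def iter_json_values(text):
    """Yield JSON values from text without copying the remainder each time."""
    decoder = json.JSONDecoder()
    # Skip any leading whitespace, since raw_decode will not do it for us.
    pos = _WHITESPACE.match(text, 0).end()
    while pos < len(text):
        # Decode one value starting at pos; pos becomes the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
        # Skip the whitespace separating this value from the next one.
        pos = _WHITESPACE.match(text, pos).end()

values = list(iter_json_values(' {"a": 1}\n{"b": [2, 3]}  \n'))
print(values)
```

The trade-off is the one already mentioned: a little more bookkeeping in exchange for not slicing a new copy of the string after every object.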




Answer 2:


That's because json.loads (and json.load) cannot decode multiple JSON objects in one call. For example, the decoder expects the input to look like {"a": 1, "b": 2}, but the data you actually have is structured like {"a": 1, "b": 2}{"a": 1, "b": 2}.
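A small sketch like the following reproduces the behaviour with two objects placed back to back. The sample strings are made up for illustration, and the snippet assumes Python 3, where the exception raised is json.JSONDecodeError, a subclass of ValueError.

```python
import json

single = '{"a": 1, "b": 2}'
doubled = single + single  # two objects back to back, like the feed

parsed = json.loads(single)  # fine: the input is exactly one JSON value

try:
    json.loads(doubled)
    error_message = None
except ValueError as exc:
    # The decoder read the first object and then found more input after it.
    error_message = str(exc)

print(parsed)
print(error_message)
```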



Source: https://stackoverflow.com/questions/26620714/json-loads-valueerror-extra-data-in-python
