Importing wrongly concatenated JSONs in python

后端 未结 3 1034
予麋鹿
予麋鹿 2021-01-07 07:33

I\'ve a text document that has several thousand jsons strings in the form of: \"{...}{...}{...}\". This is not a valid json it self but each {...}

3条回答
  •  醉梦人生
    2021-01-07 07:50

    You can use the jq command line utility to transfer your input to json. Let's say you have the following input:

    input.txt:

    {"name":"Bob Dylan", "tags":"{Artist}{Singer}"}{"name": "Michael Jackson"}
    

    You can use jq -s, which consumes multiple json documents from input and transfers them into a single output array:

    jq -s . input.txt
    

    Gives you:

    [
      {
        "name": "Bob Dylan",
        "tags": "{Artist}{Singer}"
      },
      {
        "name": "Michael Jackson"
      }
    ]
    

    I've just realized that there are python bindings for libjq. Meaning you don't need to use the command line, you can use jq directly in python.

    https://github.com/mwilliamson/jq.py

    However, I've not tried it so far. Let me give it a try :) ...

    Update: The above library is nice, but it does not support the slurp mode so far.

提交回复
热议问题