Question
I have the following "stacked JSON" object within R, example1.json:
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
"Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
"Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
"Code":[{"event1":"B","result":"0"},…]}
The objects are not comma-separated. The goal is to parse certain fields (or all fields) into an R data.frame or data.table:
Timestamp Usefulness
0 20140101 Yes
1 20140102 No
2 20140103 No
Normally, I would read a JSON file into R as follows:
library(jsonlite)
jsonfile = "example1.json"
foobar = fromJSON(jsonfile)
This, however, throws a parsing error:
Error: lexical error: invalid char in json text.
[{"event1":"A","result":"1"},…]} {"ID":"1A35B","Timestamp"
(right here) ------^
This is a similar question to the following, but in R: multiple Json objects in one file extract by python
EDIT: This file format is called "newline-delimited JSON" (NDJSON).
Answer 1:
The three dots ... invalidate your JSON, hence your lexical error. You can use jsonlite::stream_in() to 'stream in' lines of JSON.
library(jsonlite)
jsonlite::stream_in(file("~/Desktop/examples1.json"))
# opening file input connection.
# Imported 3 records. Simplifying...
# closing file input connection.
# ID Timestamp Usefulness Code
# 1 12345 20140101 Yes A, 1
# 2 1A35B 20140102 No B, 1
# 3 AA356 20140103 No B, 0
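To keep only the Timestamp and Usefulness columns asked for in the question, the result of stream_in() can be subset like any other data.frame. The following is a minimal sketch, assuming the three cleaned records from the Data section. The names ndjson, con, records and wanted are illustrative, and textConnection() stands in for file() so that the example runs without a file on disk:

```r
library(jsonlite)

# The three cleaned records, one JSON object per line (NDJSON).
ndjson <- c(
  '{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}',
  '{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}',
  '{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}'
)

# stream_in() reads from a connection; here an in-memory text connection
# is used instead of file("~/Desktop/examples1.json").
con <- textConnection(ndjson)
records <- stream_in(con, verbose = FALSE)
close(con)

# Keep only the fields of interest.
wanted <- records[, c("Timestamp", "Usefulness")]
print(wanted)
```

With a real file, the same subsetting should apply to the data.frame returned by stream_in(file(...)).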
Data
I've cleaned your example data to make it valid JSON and saved it to my desktop as ~/Desktop/examples1.json:
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}
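The effect of the three dots can be checked with jsonlite::validate(), which reports whether a string is valid JSON. This is a small sketch rather than part of the original answer: the string truncated is a shortened, hypothetical copy of the first record with the dots left in, and cleaned is the same record without them:

```r
library(jsonlite)

# Shortened copy of the first record, still containing the dots (hypothetical).
truncated <- '{"ID":"12345","Code":[{"event1":"A","result":"1"},...]}'

# The same record with the dots removed.
cleaned <- '{"ID":"12345","Code":[{"event1":"A","result":"1"}]}'

# validate() returns a logical; as.logical() drops any error attributes.
truncated_ok <- as.logical(validate(truncated))
cleaned_ok <- as.logical(validate(cleaned))

print(c(truncated = truncated_ok, cleaned = cleaned_ok))
```

Only the cleaned record is expected to pass, which is why the data above had to be cleaned before stream_in() could import it.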
Source: https://stackoverflow.com/questions/50430510/how-to-parse-a-file-with-stacked-multiple-jsons-in-r