Question
I am using pandas to normalize some JSON data. I get stuck when more than one field can be either a single value or an array.
If I use record_path on Car, it breaks on the second record, where Car is a plain string.
Any pointers on how to get one line in the CSV per Car and per Location from input like this?
[
  {
    "Name": "John Doe",
    "Car": [
      "Car1",
      "Car2"
    ],
    "Location": "Texas"
  },
  {
    "Name": "Jane Roe",
    "Car": "Car1",
    "Location": [
      "Illinois",
      "Kansas"
    ]
  }
]
Here is the output I currently get:
Name,Car,Location
John Doe,"['Car1', 'Car2']",Texas
Jane Roe,Car1,"['Illinois', 'Kansas']"
Here is the code:
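import json
import pandas as pd

with open('file.json') as data_file:
    data = json.load(data_file)
# In pandas 1.0 and later the same function is exposed as pd.json_normalize.
df = pd.io.json.json_normalize(data, errors='ignore')
I would like it to end up like this: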
Name,Car,Location
John Doe,Car1,Texas
John Doe,Car2,Texas
Jane Roe,Car1,Illinois
Jane Roe,Car1,Kansas
The answers worked great until I ran into a JSON file with extra data. This is what a file looks like with the extra values:
{
  "Customers": [
    {
      "Name": "John Doe",
      "Car": [
        "Car1",
        "Car2"
      ],
      "Location": "Texas",
      "Repairs": {
        "RepairLocations": {
          "RepairsCompleted": [
            "Fix1",
            "Fix2"
          ]
        }
      }
    },
    {
      "Name": "Jane Roe",
      "Car": "Car1",
      "Location": [
        "Illinois",
        "Kansas"
      ]
    }
  ]
}
Here is what I am going for. I think it's most readable in this format, but anything that at least shows all the keys would do:
Name,Car,Location,Repairs:RepairLocation
John Doe,Car1,Texas,RepairsCompleted:Fix1
John Doe,Car1,Texas,RepairsCompleted:Fix2
John Doe,Car2,Texas,RepairsCompleted:Fix1
John Doe,Car2,Texas,RepairsCompleted:Fix2
Jane Roe,Car1,Illinois,
Jane Roe,Car1,Kansas,
Any suggestions on getting this second part?
Answer 1:
You're looking for something like this:
def expand($keys):
  . as $in
  | reduce $keys[] as $k ( [{}];
      map(. + {
        ($k): ($in[$k] | if type == "array" then .[] else . end)
      })
    ) | .[];

(.[0] | keys_unsorted) as $h
| $h, (.[] | expand($h) | [.[$h[]]]) | @csv
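For anyone who would rather stay in Python, the same idea (treat every scalar as a one-element list, then take the Cartesian product of the fields) can be sketched with the standard library alone. This is only a rough equivalent, not a translation of the jq program: the names `expand` and `to_csv` are made up for illustration, and it assumes every record has the same flat keys as the first one, as in the first sample file.

```python
import csv
import io
import itertools
import json


def expand(record, keys):
    """Yield one row per combination of the values stored under `keys`."""
    columns = []
    for key in keys:
        value = record.get(key)
        # Wrap scalars in a one-element list so every field is iterable.
        columns.append(value if isinstance(value, list) else [value])
    for combo in itertools.product(*columns):
        yield list(combo)


def to_csv(records):
    """Return CSV text with one line per expanded row (hypothetical helper)."""
    keys = list(records[0])  # header from the first record, like keys_unsorted
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(keys)
    for record in records:
        writer.writerows(expand(record, keys))
    return out.getvalue()


# The first sample from the question, inlined so the sketch is self-contained.
data = json.loads("""
[
  {"Name": "John Doe", "Car": ["Car1", "Car2"], "Location": "Texas"},
  {"Name": "Jane Roe", "Car": "Car1", "Location": ["Illinois", "Kansas"]}
]
""")

print(to_csv(data), end="")
```

Note that an empty list in any field makes the product empty, so that record would produce no rows at all; replace empty lists with `[""]` first if such records should be kept.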
Answer 2:
A simple jq solution which is also a bit more generic than needed here:
["Name", "Car", "Location"],
(.[]
| [.Name] + (.Car|..|scalars|[.]) + (.Location|..|scalars|[.]))
| @csv
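Neither jq program covers the second file, where the records sit under "Customers" and "Repairs" is a nested object. One possible approach, sketched in Python under a few assumptions: nested objects are flattened into dotted column names first, and missing keys become empty cells. The helpers `flatten` and `expand_rows` are hypothetical names, and the dotted header (`Repairs.RepairLocations.RepairsCompleted`) is a choice made here, not the `Repairs:RepairLocation` layout asked for in the question.

```python
import csv
import itertools
import sys


def flatten(obj, prefix=""):
    """Map a nested dict to {dotted.path: value}; lists are kept as-is."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def expand_rows(records):
    """Return (header, rows) with one row per combination of list values."""
    flat_records = [flatten(r) for r in records]
    # Union of all flattened keys, in order of first appearance.
    header = list(dict.fromkeys(k for r in flat_records for k in r))
    rows = []
    for record in flat_records:
        columns = []
        for key in header:
            value = record.get(key, "")  # missing key -> empty cell
            columns.append(value if isinstance(value, list) else [value])
        rows.extend(list(combo) for combo in itertools.product(*columns))
    return header, rows


# The second sample from the question, written as a Python literal.
doc = {
    "Customers": [
        {
            "Name": "John Doe",
            "Car": ["Car1", "Car2"],
            "Location": "Texas",
            "Repairs": {
                "RepairLocations": {"RepairsCompleted": ["Fix1", "Fix2"]}
            },
        },
        {
            "Name": "Jane Roe",
            "Car": "Car1",
            "Location": ["Illinois", "Kansas"],
        },
    ]
}

header, rows = expand_rows(doc["Customers"])
writer = csv.writer(sys.stdout, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
```

With this sample, John Doe expands to four rows (two cars times two repairs) and Jane Roe to two rows with an empty repairs cell.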
Source: https://stackoverflow.com/questions/60674867/json-to-csv-issues