I have a JSON file I want to convert to a CSV file. How can I do this with Python?
I tried:
import json
import c
As mentioned in the previous answers the difficulty in converting json to csv is because a json file can contain nested dictionaries and therefore be a multidimensional data structure verses a csv which is a 2D data structure. However, a good way to turn a multidimensional structure to a csv is to have multiple csvs that tie together with primary keys.
In your example, the first csv output has the columns "pk","model","fields" as your columns. Values for "pk", and "model" are easy to get but because the "fields" column contains a dictionary, it should be its own csv and because "codename" appears to the be the primary key, you can use as the input for "fields" to complete the first csv. The second csv contains the dictionary from the "fields" column with codename as the the primary key that can be used to tie the 2 csvs together.
Here is a solution for your json file which converts a nested dictionaries to 2 csvs.
import csv
import json
def readAndWrite(inputFileName, primaryKey=""):
input = open(inputFileName+".json")
data = json.load(input)
input.close()
header = set()
if primaryKey != "":
outputFileName = inputFileName+"-"+primaryKey
if inputFileName == "data":
for i in data:
for j in i["fields"].keys():
if j not in header:
header.add(j)
else:
outputFileName = inputFileName
for i in data:
for j in i.keys():
if j not in header:
header.add(j)
with open(outputFileName+".csv", 'wb') as output_file:
fieldnames = list(header)
writer = csv.DictWriter(output_file, fieldnames, delimiter=',', quotechar='"')
writer.writeheader()
for x in data:
row_value = {}
if primaryKey == "":
for y in x.keys():
yValue = x.get(y)
if type(yValue) == int or type(yValue) == bool or type(yValue) == float or type(yValue) == list:
row_value[y] = str(yValue).encode('utf8')
elif type(yValue) != dict:
row_value[y] = yValue.encode('utf8')
else:
if inputFileName == "data":
row_value[y] = yValue["codename"].encode('utf8')
readAndWrite(inputFileName, primaryKey="codename")
writer.writerow(row_value)
elif primaryKey == "codename":
for y in x["fields"].keys():
yValue = x["fields"].get(y)
if type(yValue) == int or type(yValue) == bool or type(yValue) == float or type(yValue) == list:
row_value[y] = str(yValue).encode('utf8')
elif type(yValue) != dict:
row_value[y] = yValue.encode('utf8')
writer.writerow(row_value)
readAndWrite("data")