I want to write some data from python to xlsx. I currently have it stored as JSON, but it doesn't matter what it is going out of Python. Here's what the JSON for a single article would look like:
{ 'Word Count': 50 'Key Words': { ['Blah blah blah', 'Foo', ... ] } 'Frequency': { [9, 12, ... ] } 'Proper Nouns': { ['UN', 'USA', ... ] } 'Location': 'Mordor' }
I checked out the XlsxWriter module but can't figure out how to translate hierarchical data that is not necessarily the same size (note the number of proper nouns between the two data "objects").
What I want the data to look like:
Any pointers?
As your structures can be arbitrarily nested, I would suggest using recursion to achieve this:
from collections import OrderedDict import xlsxwriter import json def json_to_excel(ws, data, row=0, col=0): if isinstance(data, list): row -= 1 for value in data: row = json_to_excel(ws, value, row+1, col) elif isinstance(data, dict): max_row = row start_row = row for key, value in data.iteritems(): row = start_row ws.write(row, col, key) row = json_to_excel(ws, value, row+1, col) max_row = max(max_row, row) col += 1 row = max_row else: ws.write(row, col, data) return row text = """ [ { "Source ID": 123, "WordCount": 50, "Key Words": ["Blah blah blah", "Foo"], "Frequency": [9, 12, 1, 2, 3], "Proper Nouns": ["UN", "USA"], "Location": "Mordor" }, { "Source ID": 124, "WordCount": 50, "Key Words": ["Blah blah blah", "Foo"], "Frequency": [9, 12, 1, 2, 3], "Proper Nouns": ["UN", "USA"], "Location": "Mordor" } ] """ data = json.loads(text, object_pairs_hook=OrderedDict) wb = xlsxwriter.Workbook("output.xlsx") ws = wb.add_worksheet() json_to_excel(ws, data) wb.close()
This would give you an output file looking like: