Convert JSON to Excel by Python

不羁岁月 提交于 2021-02-11 15:46:37

问题


I have a JSON that need to convert to Excel. I'm using Python 3.8 with xlsxwriter library. Below is sample JSON.

{
    "companyId": "123456",
    "companyName": "Test",
    "companyStatus": "ACTIVE",
    "document": {
        "employee": {
            "employeeId": "EM1567",
            "employeeLastName": "Test Last",
            "employeeFirstName": "Test Fist"
        },
        "expenseEntry": [
            {
            "allocation": [
                {
                "allocationId": "03B249B3598",
                "journal": [
                    {
                        "journalAccountCode": "888",
                        "journalPayee": "EMPL",
                        "journalPayer": "COMP",
                        "taxGuid": [
                            "51645A638114E"
                        ]
                    },
                    {
                        "journalAccountCode": "999",
                        "journalPayee": "EMPL",
                        "journalPayer": "EMPL",
                        "taxGuid": [
                            "8114E51645A63"
                        ]
                    },
                ],
                "tax": [
                    {
                        "taxCode": "TAX123",
                        "taxSource": "SYST"
                    },
                    {
                        "taxCode": "TAX456",
                        "taxSource": "SYST"
                    }
                ]
                }
            ],
            "approvedAmount": 200.0,
            "entryDate": "2020-12-10",
            "entryId": "ENTRY9988"
            }
        ],
        "report": {
            "currencyCode": "USD",
            "reportCreationDate": "2020-12-10",
            "reportId": "ACA849BBB",
            "reportName": "Test Report",
            "totalApprovedAmount": 200.0
        }
    },
    "id": "c71b7d756f549"
}

And my current code: https://repl.it/@tonyiscoming/jsontoexcel

I tried with pandas

import pandas as pd

df = pd.json_normalize(data, max_level=5)
df.to_excel('test.xlsx', index=False)

And got the result

I tried with json_excel_converter

from json_excel_converter import Converter 
from json_excel_converter.xlsx import Writer

conv = Converter()
conv.convert(data, Writer(file='test.xlsx'))

And got the result

This is my expectation

Would anyone please help me in this case? Thank you so much.


回答1:


Here is the code what you are looking for. I did this using XlsxWriter package. First I made the template with some cell format stuff. After that, I entered values using according to your JSON.

import xlsxwriter
from itertools import zip_longest

data = [
    {
        "companyId": "123456",
        "companyName": "Test",
        "companyStatus": "ACTIVE",
        "document": {
            "employee": {
                "employeeId": "EM1567",
                "employeeLastName": "Test Last",
                "employeeFirstName": "Test Fist"
            },
            "expenseEntry": [
                {
                    "allocation": [
                        {
                            "allocationId": "03B249B3598",
                            "journal": [
                                {
                                    "journalAccountCode": "888",
                                    "journalPayee": "EMPL",
                                    "journalPayer": "COMP",
                                    "taxGuid": [
                                        "51645A638114E"
                                    ]
                                },
                                {
                                    "journalAccountCode": "999",
                                    "journalPayee": "EMPL",
                                    "journalPayer": "EMPL",
                                    "taxGuid": [
                                        "8114E51645A63"
                                    ]
                                },
                            ],
                            "tax": [
                                {
                                    "taxCode": "TAX123",
                                    "taxSource": "SYST"
                                },
                                {
                                    "taxCode": "TAX456",
                                    "taxSource": "SYST"
                                }
                            ]
                        }
                    ],
                    "approvedAmount": 200.0,
                    "entryDate": "2020-12-10",
                    "entryId": "ENTRY9988"
                }
            ],
            "report": {
                "currencyCode": "USD",
                "reportCreationDate": "2020-12-10",
                "reportId": "ACA849BBB",
                "reportName": "Test Report",
                "totalApprovedAmount": 200.0
            }
        },
        "id": "c71b7d756f549"
    }
]

xlsx_file = 'your_file_name_here.xlsx'

# define the excel file
workbook = xlsxwriter.Workbook(xlsx_file)

# create a sheet for our work, defaults to Sheet1.
worksheet = workbook.add_worksheet()

# common merge format
merge_format = workbook.add_format({'align': 'center', 'valign': 'vcenter'})

# set all column width to 20
worksheet.set_column('A:V', 20)

# column wise template creation (A-V)
worksheet.merge_range(0, 0, 4, 0, 'companyId', merge_format)  # A
worksheet.merge_range(0, 1, 4, 1, 'companyName', merge_format)  # B
worksheet.merge_range(0, 2, 4, 2, 'companyStatus', merge_format)  # C

worksheet.merge_range(0, 3, 0, 20, 'document', merge_format)  # C-U

worksheet.merge_range(1, 3, 1, 5, 'employee', merge_format)  # D-F
worksheet.merge_range(2, 3, 4, 3, 'employeeId', merge_format)  # D
worksheet.merge_range(2, 4, 4, 4, 'employeeLastName', merge_format)  # E
worksheet.merge_range(2, 5, 4, 5, 'employeeFirstName', merge_format)  # F

worksheet.merge_range(1, 6, 1, 15, 'expenseEntry', merge_format)  # G-P
worksheet.merge_range(2, 6, 2, 12, 'allocation', merge_format)  # G-M
worksheet.merge_range(3, 6, 4, 6, 'allocationId', merge_format)  # G

worksheet.merge_range(3, 7, 3, 10, 'journal', merge_format)  # H-K
worksheet.write(4, 7, 'journalAccountCode')  # H
worksheet.write(4, 8, 'journalPayee')  # I
worksheet.write(4, 9, 'journalPayer')  # J
worksheet.write(4, 10, 'taxGuid')  # K

worksheet.merge_range(3, 11, 3, 12, 'tax', merge_format)  # L-M
worksheet.write(4, 11, 'taxCode')  # L
worksheet.write(4, 12, 'taxSource')  # M

worksheet.merge_range(2, 13, 4, 13, 'approvedAmount', merge_format)  # N
worksheet.merge_range(2, 14, 4, 14, 'entryDate', merge_format)  # O
worksheet.merge_range(2, 15, 4, 15, 'entryId', merge_format)  # P

worksheet.merge_range(1, 16, 1, 20, 'report', merge_format)  # Q-U
worksheet.merge_range(2, 16, 4, 16, 'currencyCode', merge_format)  # Q
worksheet.merge_range(2, 17, 4, 17, 'reportCreationDate', merge_format)  # R
worksheet.merge_range(2, 18, 4, 18, 'reportId', merge_format)  # S
worksheet.merge_range(2, 19, 4, 19, 'reportName', merge_format)  # T
worksheet.merge_range(2, 20, 4, 20, 'totalApprovedAmount', merge_format)  # U

worksheet.merge_range(0, 21, 4, 21, 'id', merge_format)  # V

# inserting data
row = 5
for obj in data:
    worksheet.write(row, 0, obj.get('companyId'))
    worksheet.write(row, 1, obj.get('companyName'))
    worksheet.write(row, 2, obj.get('companyStatus'))

    document = obj.get('document', {})

    # employee details
    employee = document.get('employee', {})
    worksheet.write(row, 3, employee.get('employeeId'))
    worksheet.write(row, 4, employee.get('employeeLastName'))
    worksheet.write(row, 5, employee.get('employeeFirstName'))

    # report details
    report = document.get('report', {})
    worksheet.write(row, 16, report.get('currencyCode'))
    worksheet.write(row, 17, report.get('reportCreationDate'))
    worksheet.write(row, 18, report.get('reportId'))
    worksheet.write(row, 19, report.get('reportName'))
    worksheet.write(row, 20, report.get('totalApprovedAmount'))

    worksheet.write(row, 21, obj.get('id'))

    # expenseEntry details
    expense_entries = document.get('expenseEntry', [])
    for expense_entry in expense_entries:
        worksheet.write(row, 13, expense_entry.get('approvedAmount'))
        worksheet.write(row, 14, expense_entry.get('entryDate'))
        worksheet.write(row, 15, expense_entry.get('entryId'))

        # allocation details
        allocations = expense_entry.get('allocation', [])
        for allocation in allocations:
            worksheet.write(row, 6, allocation.get('allocationId'))

            # journal and tax details
            journals = allocation.get('journal', [])
            taxes = allocation.get('tax', [])
            for journal_and_tax in list(zip_longest(journals, taxes)):
                journal, tax = journal_and_tax
                worksheet.write(row, 7, journal.get('journalAccountCode'))
                worksheet.write(row, 8, journal.get('journalPayee'))
                worksheet.write(row, 9, journal.get('journalPayer'))
                worksheet.write(row, 11, tax.get('taxCode'))
                worksheet.write(row, 12, tax.get('taxSource'))

                # taxGuid details
                tax_guides = journal.get('taxGuid', [])
                if not tax_guides:
                    row = row + 1
                    continue

                for tax_guide in tax_guides:
                    worksheet.write(row, 10, tax_guide)
                    row = row + 1

# finally close the created excel file
workbook.close()

One thing, instead of creating a template in the script you can make your own one and save it somewhere else. Then get the copy of that template and just add data using the script. This will give you a chance to make your own base template, otherwise, you have to format your excel using the script, such as border formattings, merge cells, etc.

I used zip_longest python built-in function from itertools to zip journal and tax objects. Just follow Python – Itertools.zip_longest() or Python's zip_longest Function article for examples. If you didn't understand anything from my code, please comment below.




回答2:


Having empty cells in an Excel Grid is not something really "propper", which is why json_excel_converter beahaves like this. So, If you want to achieve this, I'm afraid you'll have to develop it all by yourself.



来源:https://stackoverflow.com/questions/65299781/convert-json-to-excel-by-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!