Saving multiple dataframes to multiple excel sheets multiple times?

Question

I have a function that saves multiple DataFrames as stacked tables on a single sheet of an Excel workbook:

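import pandas as pd

def multiple_dfs(df_list, sheets, file_name, spaces):
    # A new writer in the default mode='w' replaces any existing file
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        # Move past this table and leave `spaces` blank rows before the next one
        row = row + len(dataframe.index) + spaces + 1
    writer.close()  # writer.save() in pandas versions before 2.0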

If I call this function multiple times to write multiple tables to multiple sheets, I end up with just one workbook and one sheet, the one that was called last:

multiple_dfs(dfs_gfk, 'GFK', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_top, 'TOP', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_all, 'Total', 'file_of_tables.xlsx', 1)

So in the end I only have the file_of_tables workbook containing only the Total sheet. I know it's a simple problem, but somehow I just cannot think of an elegant solution. Can anyone help?

Answer 1:

From the pandas.ExcelWriter documentation:

You can also append to an existing Excel file:

>>> with ExcelWriter('path_to_file.xlsx', mode='a') as writer:
...     df.to_excel(writer, sheet_name='Sheet3')

The mode keyword matters when you're creating an instance of the ExcelWriter class.

With mode='w' (the default), the writer starts a new workbook. The file is written when you call .close() (or .save() in pandas versions before 2.0); it is created if it does not exist and overwritten if it does.
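The difference between the two modes can be seen in a small experiment. This is only a sketch: the file name, the sheet names, and the tiny DataFrame are made up for illustration, and it assumes openpyxl is installed, since that is the engine used here for both writing and reading.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2]})  # hypothetical sample data
path = os.path.join(tempfile.mkdtemp(), "demo_modes.xlsx")


def sheet_names(file_path):
    """Return the sheet names of a workbook, closing the file afterwards."""
    with pd.ExcelFile(file_path) as xls:
        return list(xls.sheet_names)


# mode='w' (the default): every new writer starts from an empty workbook.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="first")
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="second")
sheets_after_write = sheet_names(path)  # 'first' has been overwritten

# mode='a': the existing workbook is loaded and a sheet is added to it.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    df.to_excel(writer, sheet_name="third")
sheets_after_append = sheet_names(path)

print(sheets_after_write)
print(sheets_after_append)
```

Append mode is shown with openpyxl because the xlsxwriter engine cannot open an existing workbook.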

The mode='a' assumes there's an existing file and appends to it. Append mode is not supported by the xlsxwriter engine, so the writer has to use openpyxl. If you want to keep the structure of your code, check whether the file exists and choose the mode accordingly:

import os
import pandas as pd

def multiple_dfs(df_list, sheets, file_name, spaces):
    if os.path.exists(file_name):
        # Append to the existing workbook; 'overlay' (pandas >= 1.4) allows
        # several tables to be written to the same sheet
        kwargs = {'mode': 'a', 'if_sheet_exists': 'overlay'}
    else:
        kwargs = {'mode': 'w'}  # first call: create the file
    with pd.ExcelWriter(file_name, engine='openpyxl', **kwargs) as writer:
        row = 0
        for dataframe in df_list:
            dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
            row = row + len(dataframe.index) + spaces + 1

If you then run the following calls:

multiple_dfs(dfs_gfk, 'GFK', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_top, 'TOP', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_all, 'Total', 'file_of_tables.xlsx', 1)

the later calls will not overwrite the data already written. Instead, the first call creates the file, and the second and third calls each append a new sheet to it.
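One detail is worth checking when several tables go onto the same sheet in append mode. As far as I can tell, pandas by default refuses to write to a sheet that already exists in an appended workbook, and the if_sheet_exists='overlay' option (pandas 1.4 or later) is what lets a second table land below the first. The sketch below assumes openpyxl is installed; the file name and the two small DataFrames are hypothetical.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

df_a = pd.DataFrame({"x": [1, 2]})  # hypothetical sample tables
df_b = pd.DataFrame({"x": [3, 4]})
path = os.path.join(tempfile.mkdtemp(), "overlay_demo.xlsx")

# First create a workbook with a single sheet.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df_a.to_excel(writer, sheet_name="GFK")

# Then append a second sheet that holds two stacked tables.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    row = 0
    for dataframe in (df_a, df_b):
        dataframe.to_excel(writer, sheet_name="TOP", startrow=row, startcol=0)
        row = row + len(dataframe.index) + 1 + 1  # one blank row between tables

# Read the result back with openpyxl to confirm both tables are there.
workbook = load_workbook(path)
names = workbook.sheetnames
first_table_value = workbook["TOP"]["B2"].value   # first data cell of df_a
second_table_value = workbook["TOP"]["B6"].value  # first data cell of df_b
workbook.close()
```

Note that if_sheet_exists is only accepted together with mode='a', so it cannot be passed when the file is first created.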

Answer 2:

Create the writer outside the function, using a with statement, and pass it in:

import pandas as pd

def multiple_dfs(df_list, sheets, writer, spaces):
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1
    # No writer.save() here: the with block below saves the file once, on exit

with pd.ExcelWriter('file_of_tables.xlsx') as writer:
    multiple_dfs(dfs_gfk, 'GFK', writer, 1)
    multiple_dfs(dfs_top, 'TOP', writer, 1)
    multiple_dfs(dfs_all, 'Total', writer, 1)
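For anyone who wants to try the shared-writer pattern end to end, here is a self-contained sketch. The DataFrames in the tables dictionary are made up to stand in for dfs_gfk, dfs_top and dfs_all, the output goes to a temporary directory, and the openpyxl engine is an assumption chosen so that the same library can read the file back.

```python
import os
import tempfile

import pandas as pd


def multiple_dfs(df_list, sheet, writer, spaces):
    """Stack the DataFrames on one sheet, `spaces` blank rows apart."""
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheet, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1


# Hypothetical sample data: one list of tables per sheet.
tables = {
    "GFK": [pd.DataFrame({"a": [1, 2]}), pd.DataFrame({"b": [3]})],
    "TOP": [pd.DataFrame({"c": [4, 5, 6]})],
    "Total": [pd.DataFrame({"d": [7]}), pd.DataFrame({"e": [8, 9]})],
}

path = os.path.join(tempfile.mkdtemp(), "file_of_tables.xlsx")

# A single writer for the whole workbook; the file is saved once, on exit.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    for sheet, df_list in tables.items():
        multiple_dfs(df_list, sheet, writer, 1)

# Confirm that every sheet made it into the workbook.
with pd.ExcelFile(path) as xls:
    written_sheets = list(xls.sheet_names)

print(written_sheets)
```

Because the workbook is opened only once, this approach does not depend on append mode and works the same whether or not the file already exists.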

Source: https://stackoverflow.com/questions/56034923/saving-multiple-dataframes-to-multiple-excel-sheets-multiple-times

Tags

python

excel

pandas

function

dataframe