Saving multiple dataframes to multiple excel sheets multiple times?

好久不见. Submitted on 2021-01-28 12:40:07

Question


I have a function that saves multiple dataframes as multiple tables on a single sheet of an Excel workbook:

def multiple_dfs(df_list, sheets, file_name, spaces):
    writer = pd.ExcelWriter(file_name,engine='xlsxwriter')   
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer,sheet_name=sheets,startrow=row , startcol=0)   
        row = row + len(dataframe.index) + spaces + 1
    writer.save()

If I call this function multiple times to write multiple tables to multiple sheets, I end up with just one workbook and one sheet, the one that was called last:

multiple_dfs(dfs_gfk, 'GFK', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_top, 'TOP', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_all, 'Total', 'file_of_tables.xlsx', 1)

So in the end I only have the file_of_tables workbook with only the Total sheet. I know it's a simple problem, but somehow I just cannot think of an elegant solution to this. Can anyone help?


Answer 1:


From the pandas.ExcelWriter documentation:

You can also append to an existing Excel file:

>>> with ExcelWriter('path_to_file.xlsx', mode='a') as writer:
...     df.to_excel(writer, sheet_name='Sheet3')

The mode keyword matters when you're creating an instance of the ExcelWriter class.

mode='w' (the default) creates the file if there isn't one and overwrites it if there is; the file is written to disk once you close the writer.

mode='a' assumes there's an existing file and appends to it. Append mode is not supported by the xlsxwriter engine, so the writer has to use openpyxl instead. If you want to keep the structure of your code, choose the mode depending on whether the file already exists, like so:

import os
import pandas as pd

def multiple_dfs(df_list, sheets, file_name, spaces):
    # Append if the workbook already exists, otherwise create it.
    arg_mode = 'a' if os.path.exists(file_name) else 'w'
    # Append mode needs the openpyxl engine; xlsxwriter cannot append.
    with pd.ExcelWriter(file_name, engine='openpyxl', mode=arg_mode) as writer:
        row = 0
        for dataframe in df_list:
            dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
            # Move past the header row, the data rows and the blank spacer rows.
            row = row + len(dataframe.index) + spaces + 1
    # Leaving the with block saves and closes the file.

If you then run the following calls:

multiple_dfs(dfs_gfk, 'GFK', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_top, 'TOP', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_all, 'Total', 'file_of_tables.xlsx', 1)

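the second and third calls will not overwrite the data already written. Instead, the first call creates the file, and the second and third calls append their sheets to it. Keep in mind that if file_of_tables.xlsx is left over from an earlier run, the first call appends as well, and recent pandas versions raise an error by default when the sheet name already exists, so delete the old file before re-running.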
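To make the difference between the two modes concrete, here is a minimal sketch. The sample dataframes, the temporary file name and the sheet names A and B are made up for illustration, and it assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for this illustration.
df_a = pd.DataFrame({"x": [1, 2]})
df_b = pd.DataFrame({"y": [3, 4]})

path = os.path.join(tempfile.mkdtemp(), "modes_demo.xlsx")

# mode='w' twice: the second writer replaces the whole workbook.
with pd.ExcelWriter(path, engine="openpyxl", mode="w") as writer:
    df_a.to_excel(writer, sheet_name="A")
with pd.ExcelWriter(path, engine="openpyxl", mode="w") as writer:
    df_b.to_excel(writer, sheet_name="B")
with pd.ExcelFile(path, engine="openpyxl") as xls:
    sheets_after_overwrite = list(xls.sheet_names)

# mode='a': the existing workbook is loaded and a new sheet is added to it.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    df_a.to_excel(writer, sheet_name="A")
with pd.ExcelFile(path, engine="openpyxl") as xls:
    sheets_after_append = list(xls.sheet_names)

print(sheets_after_overwrite)  # ['B']
print(sheets_after_append)     # ['B', 'A']
```

After the two mode='w' writes only sheet B survives, while the mode='a' write keeps B and adds A next to it.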




Answer 2:


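Create the writer once, outside the function, in a with block and pass it in. The function should not save the writer itself, because the with block saves and closes the file when it exits:

import pandas as pd

def multiple_dfs(df_list, sheets, writer, spaces):
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        # Move past the header row, the data rows and the blank spacer rows.
        row = row + len(dataframe.index) + spaces + 1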

with pd.ExcelWriter('file_of_tables.xlsx') as writer:
    multiple_dfs(dfs_gfk, 'GFK', writer, 1)
    multiple_dfs(dfs_top, 'TOP', writer, 1)
    multiple_dfs(dfs_all, 'Total', writer, 1)
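For a version you can run from start to finish, here is a rough sketch of the same idea. The small and large dataframes are hypothetical stand-ins for dfs_gfk, dfs_top and dfs_all, the returned list of start rows is an addition made only so the layout can be inspected, and engine='openpyxl' is an assumption so the file can be written and read back with one library.

```python
import os
import tempfile

import pandas as pd


def multiple_dfs(df_list, sheet, writer, spaces):
    """Stack the dataframes vertically on one sheet; return their start rows."""
    row = 0
    start_rows = []
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheet, startrow=row, startcol=0)
        start_rows.append(row)
        # One header row + the data rows + the blank spacer rows.
        row = row + len(dataframe.index) + spaces + 1
    return start_rows


# Hypothetical stand-ins for the real lists of dataframes.
small = pd.DataFrame({"a": [1, 2]})
large = pd.DataFrame({"a": [1, 2, 3]})

path = os.path.join(tempfile.mkdtemp(), "file_of_tables.xlsx")

# One writer for all calls, so every sheet ends up in the same workbook.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    gfk_rows = multiple_dfs([small, large], "GFK", writer, 1)
    multiple_dfs([large], "TOP", writer, 1)
    multiple_dfs([small, small], "Total", writer, 1)

# Read the workbook back to confirm that all three sheets are present.
with pd.ExcelFile(path, engine="openpyxl") as xls:
    sheet_names = list(xls.sheet_names)

print(sheet_names)  # ['GFK', 'TOP', 'Total']
print(gfk_rows)     # [0, 4]
```

The second table on the GFK sheet starts at row 4 because the first one takes a header row plus two data rows, followed by one spacer row.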


Source: https://stackoverflow.com/questions/56034923/saving-multiple-dataframes-to-multiple-excel-sheets-multiple-times
