Question
I have a function that saves multiple dataframes as multiple tables on a single sheet of an Excel workbook:
def multiple_dfs(df_list, sheets, file_name, spaces):
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1
    writer.save()
If I call this function multiple times to write multiple tables to multiple sheets, I end up with just one workbook and one sheet, the one that was called last:
multiple_dfs(dfs_gfk, 'GFK', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_top, 'TOP', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_all, 'Total', 'file_of_tables.xlsx', 1)
So in the end I only have the file_of_tables workbook with only the Total sheet. I know it's a simple problem, but somehow I just cannot think of an elegant solution. Can anyone help?
Answer 1:
From the pandas.ExcelWriter documentation:
You can also append to an existing Excel file:
>>> with ExcelWriter('path_to_file.xlsx', mode='a') as writer:
... df.to_excel(writer, sheet_name='Sheet3')
The mode keyword matters when you create an instance of the ExcelWriter class.
mode='w' (the default) creates the file when you call .close() (or .save() in pandas versions before 2.0) if it does not exist yet, and overwrites it if it does.
mode='a' assumes the file already exists and appends to it. Append mode is not supported by the xlsxwriter engine, so the writer has to use openpyxl instead. If you want to keep the structure of your code, choose the mode based on whether the file already exists:
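import os
import pandas as pd

def multiple_dfs(df_list, sheets, file_name, spaces):
    # Append if the workbook already exists, otherwise create it.
    append = os.path.exists(file_name)
    # Append mode requires the openpyxl engine (xlsxwriter can only write new files).
    # if_sheet_exists='overlay' (pandas >= 1.4) lets several tables share one sheet
    # while appending; it is only accepted together with mode='a'.
    writer = pd.ExcelWriter(file_name, engine='openpyxl',
                            mode='a' if append else 'w',
                            if_sheet_exists='overlay' if append else None)
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1
    writer.close()  # writer.save() was removed in pandas 2.0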
If you then run the following calls:
multiple_dfs(dfs_gfk, 'GFK', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_top, 'TOP', 'file_of_tables.xlsx', 1)
multiple_dfs(dfs_all, 'Total', 'file_of_tables.xlsx', 1)
the second and third calls will not overwrite the data already written. Instead, the first call creates the file (assuming file_of_tables.xlsx does not exist beforehand) and the second and third calls append to it, so the workbook ends up with all three sheets.
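The difference between the two modes can be checked by reading the sheet names back after each write. The following is a minimal sketch, not part of the original answer: it assumes openpyxl is installed, and the helper name sheet_names_after and the sample DataFrame are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data; any DataFrame would do.
df = pd.DataFrame({"value": [1, 2, 3]})

def sheet_names_after(path, sheet, mode):
    """Write df to `sheet` using the given mode, then return the workbook's sheet names."""
    with pd.ExcelWriter(path, engine="openpyxl", mode=mode) as writer:
        df.to_excel(writer, sheet_name=sheet)
    with pd.ExcelFile(path) as xls:
        return xls.sheet_names

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_of_tables.xlsx")
    created = sheet_names_after(path, "GFK", "w")        # new file with one sheet
    appended = sheet_names_after(path, "TOP", "a")       # existing sheet is kept
    overwritten = sheet_names_after(path, "Total", "w")  # file is replaced

print(created)      # ['GFK']
print(appended)     # ['GFK', 'TOP']
print(overwritten)  # ['Total']
```

The last line reproduces the behaviour described in the question: reopening the file in write mode discards the sheets written earlier.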
Answer 2:
Create the writer outside the function, using a with statement:
import pandas as pd

def multiple_dfs(df_list, sheets, writer, spaces):
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1
    # No writer.save() here: the with block below saves and closes the file once.

with pd.ExcelWriter('file_of_tables.xlsx') as writer:
    multiple_dfs(dfs_gfk, 'GFK', writer, 1)
    multiple_dfs(dfs_top, 'TOP', writer, 1)
    multiple_dfs(dfs_all, 'Total', writer, 1)
Source: https://stackoverflow.com/questions/56034923/saving-multiple-dataframes-to-multiple-excel-sheets-multiple-times