Question:
I want to use Excel files to store data processed with Python. My problem is that I can't add sheets to an existing Excel file. Here is some sample code that reproduces the issue:
    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
    df1.to_excel(writer, sheet_name = 'x1')
    df2.to_excel(writer, sheet_name = 'x2')
    writer.save()
    writer.close()
This code saves two DataFrames to two sheets, named "x1" and "x2" respectively. If I create two new DataFrames and try to use the same code to add two new sheets, 'x3' and 'x4', the original data is lost.
    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x3 = np.random.randn(100, 2)
    df3 = pd.DataFrame(x3)

    x4 = np.random.randn(100, 2)
    df4 = pd.DataFrame(x4)

    writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
    df3.to_excel(writer, sheet_name = 'x3')
    df4.to_excel(writer, sheet_name = 'x4')
    writer.save()
    writer.close()
I want an Excel file with four sheets: 'x1', 'x2', 'x3', 'x4'. I know that 'xlsxwriter' is not the only engine; there is also 'openpyxl'. I have also seen that other people have already written about this issue, but I still can't understand how to do it.
Here is some code taken from this link:
    import pandas
    from openpyxl import load_workbook

    book = load_workbook('Masterfile.xlsx')
    writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

    writer.save()
They say that it works, but it is hard to figure out how. I don't understand what "ws.title", "ws", and "dict" are in this context.
What is the best way to save "x1" and "x2", close the file, then open it again and add "x3" and "x4"?
Answer 1:
Thank you. I believe a complete example could be useful for anyone else who has the same issue:
    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
    df1.to_excel(writer, sheet_name = 'x1')
    df2.to_excel(writer, sheet_name = 'x2')
    writer.save()
    writer.close()
Here I generate an Excel file. From my understanding, it does not really matter whether it is generated via the "xlsxwriter" or the "openpyxl" engine.
When I want to write without losing the original data, I then use:
    import pandas as pd
    import numpy as np
    from openpyxl import load_workbook

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine = 'openpyxl')
    writer.book = book

    x3 = np.random.randn(100, 2)
    df3 = pd.DataFrame(x3)

    x4 = np.random.randn(100, 2)
    df4 = pd.DataFrame(x4)

    df3.to_excel(writer, sheet_name = 'x3')
    df4.to_excel(writer, sheet_name = 'x4')
    writer.save()
    writer.close()
This code does the job!
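For readers on a recent pandas release, where (as far as I know) assigning to writer.book is no longer allowed, the same two-step workflow can probably be sketched with the mode="a" argument of pd.ExcelWriter. This is only a sketch under a few assumptions: the openpyxl engine is installed, the pandas version supports append mode, and the temporary path below is a hypothetical stand-in for the real file location.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical temporary path, used so the sketch runs anywhere;
# replace it with the real location of the workbook.
path = os.path.join(tempfile.mkdtemp(), "PhD_data.xlsx")

# First session: create the workbook with sheets 'x1' and 'x2'.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x1")
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x2")

# Later session: reopen the same file in append mode, so the existing
# sheets are kept and 'x3' and 'x4' are added next to them.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x3")
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x4")

# Check which sheets the file now contains.
with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names
print(sheet_names)
```

The with blocks save and close the file on exit, so no explicit save() or close() call is needed in this variant.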
Answer 2:
In the example you shared, you are loading the existing file into book and setting writer.book to book. In the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets) you are accessing each sheet in the workbook as ws. The sheet title is then ws.title, so you are creating a dictionary of {sheet_title: sheet} key-value pairs. This dictionary is then assigned to writer.sheets. Essentially, these steps just load the existing data from 'Masterfile.xlsx' and populate your writer with it.
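To make the dictionary step concrete, here is a small sketch that builds the same mapping on an in-memory openpyxl workbook. The sheet names 'x1' and 'x2' are just illustrative, and no file is read or written.

```python
from openpyxl import Workbook

# In-memory workbook; a new Workbook starts with one default sheet.
book = Workbook()
book.active.title = "x1"
book.create_sheet("x2")

# Each ws is a worksheet object and ws.title is its name, so this maps
# sheet name -> worksheet. It is equivalent to the dict comprehension
# {ws.title: ws for ws in book.worksheets}.
sheets = dict((ws.title, ws) for ws in book.worksheets)

titles = list(sheets)
print(titles)
```

Looking up sheets["x2"] then gives back the worksheet object itself, which is what the writer needs in order to know which sheets already exist.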
Now let's say you already have a file with x1 and x2 as sheets. You can use the example code to load the file and then do something like this to add x3 and x4.
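    import pandas as pd
    from openpyxl import load_workbook

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    # Load the existing workbook first, as in the example code;
    # without this step the writer starts from an empty file.
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

    # df3 and df4 are the DataFrames from the question.
    df3.to_excel(writer, 'x3', index=False)
    df4.to_excel(writer, 'x4', index=False)
    writer.save()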
That should do what you are looking for.
Answer 3:
I would strongly recommend you work directly with openpyxl since it now supports Pandas DataFrames.
This allows you to concentrate on the relevant Excel and Pandas code.
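As a rough illustration of that suggestion, the sketch below adds a DataFrame as a new sheet using openpyxl alone, via its dataframe_to_rows helper. The temporary file, the sheet names, and the small example DataFrame are all assumptions made for the demo.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Hypothetical temporary file standing in for the existing workbook.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

# Create a workbook that already has a sheet named 'x1'.
book = Workbook()
book.active.title = "x1"
book.save(path)

# Illustrative data to add as a new sheet.
df3 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Reopen the file, add a sheet, and append the DataFrame row by row.
book = load_workbook(path)
ws = book.create_sheet("x3")
for row in dataframe_to_rows(df3, index=False, header=True):
    ws.append(row)
book.save(path)

# Read the file back to check the result.
reloaded = load_workbook(path)
header = [cell.value for cell in reloaded["x3"][1]]
print(reloaded.sheetnames, header)
```

Because the workbook is loaded, modified, and saved by openpyxl itself, no pandas writer object is involved and the existing sheets are left untouched.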
Answer 4:
Here is a simple example of writing several DataFrames to Excel at once, and of appending data as a new sheet to an Excel file that has already been written (and closed).
The first time you write to the Excel file (writing "df1" and "df2" to "1st_sheet" and "2nd_sheet"):
    import pandas as pd
    from openpyxl import load_workbook

    df1 = pd.DataFrame([[1],[1]], columns=['a'])
    df2 = pd.DataFrame([[2],[2]], columns=['b'])
    df3 = pd.DataFrame([[3],[3]], columns=['c'])

    excel_dir = "my/excel/dir"

    # The with block saves and closes the file on exit,
    # so no explicit writer.save() is needed.
    with pd.ExcelWriter(excel_dir, engine='xlsxwriter') as writer:
        df1.to_excel(writer, sheet_name='1st_sheet')
        df2.to_excel(writer, sheet_name='2nd_sheet')
After you have closed the Excel file, suppose you want to "append" data to the same file but on another sheet, say "df3" to a sheet named "3rd_sheet":
    book = load_workbook(excel_dir)
    with pd.ExcelWriter(excel_dir, engine='openpyxl') as writer:
        # Hand the existing workbook and its sheets to the writer.
        writer.book = book
        writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

        # Your dataframe to append; the file is saved when the block exits.
        df3.to_excel(writer, sheet_name='3rd_sheet')
Note that the Excel file must not be in .xls format; use .xlsx instead.
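A related case is rewriting a sheet that already exists while leaving the other sheets alone. On pandas versions that provide the if_sheet_exists argument of pd.ExcelWriter (an assumption about the installed version), something like the following sketch may work. The temporary path and the df1_new DataFrame are hypothetical names used only for this demo.

```python
import os
import tempfile

import pandas as pd

# Hypothetical temporary .xlsx file so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

df1 = pd.DataFrame([[1], [1]], columns=["a"])
df2 = pd.DataFrame([[2], [2]], columns=["b"])
df1_new = pd.DataFrame([[10], [20], [30]], columns=["a"])

# First write: two sheets.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df1.to_excel(writer, sheet_name="1st_sheet")
    df2.to_excel(writer, sheet_name="2nd_sheet")

# Append mode with if_sheet_exists="replace": "1st_sheet" is rewritten,
# while "2nd_sheet" is kept as it was.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="replace"
) as writer:
    df1_new.to_excel(writer, sheet_name="1st_sheet")

# sheet_name=None reads every sheet into a dict of DataFrames.
result = pd.read_excel(path, sheet_name=None, index_col=0)
print(sorted(result))
```

Without if_sheet_exists, writing to an existing sheet name in append mode is rejected by default on those versions, which protects against accidental overwrites.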