How to save a new sheet in an existing excel file, using Pandas?

Anonymous (unverified), submitted on 2019-12-03 01:27:01

Question:

I want to use Excel files to store data processed with Python. My problem is that I can't add sheets to an existing Excel file. Here is some sample code that reproduces the issue:

    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df1.to_excel(writer, sheet_name='x1')
    df2.to_excel(writer, sheet_name='x2')
    # close() writes the file; the extra writer.save() call was redundant
    # (and save() was removed in pandas 2.0)
    writer.close()

This code saves two DataFrames to two sheets, named "x1" and "x2" respectively. If I create two new DataFrames and try to use the same code to add two new sheets, 'x3' and 'x4', the original data is lost.

    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x3 = np.random.randn(100, 2)
    df3 = pd.DataFrame(x3)

    x4 = np.random.randn(100, 2)
    df4 = pd.DataFrame(x4)

    # A new writer on the same path starts a fresh file, so x1 and x2 are lost
    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df3.to_excel(writer, sheet_name='x3')
    df4.to_excel(writer, sheet_name='x4')
    writer.close()

I want an Excel file with four sheets: 'x1', 'x2', 'x3', 'x4'. I know that 'xlsxwriter' is not the only "engine"; there is also 'openpyxl'. I have also seen that other people have already written about this issue, but I still can't understand how to do it.

Here is some code taken from this link:

    import pandas
    from openpyxl import load_workbook

    book = load_workbook('Masterfile.xlsx')
    writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')

    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

    writer.save()

They say that it works, but it is hard to figure out how. I don't understand what "ws.title", "ws", and "dict" are in this context.

What is the best way to save "x1" and "x2", close the file, then open it again and add "x3" and "x4"?

Answer 1:

Thank you. I believe a complete example could be useful for anyone else who has the same issue:

    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df1.to_excel(writer, sheet_name='x1')
    df2.to_excel(writer, sheet_name='x2')
    # close() writes the file; a separate save() call is not needed
    writer.close()

Here I generate an Excel file. From my understanding, it does not really matter whether it is generated via the "xlsxwriter" or the "openpyxl" engine.

When I want to write to the file without losing the original data, I use:

    import pandas as pd
    import numpy as np
    from openpyxl import load_workbook

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    # Load the existing workbook and hand it to the writer so x1 and x2 are kept.
    # Note: assigning writer.book relies on older pandas behaviour; recent
    # releases no longer support this assignment.
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book

    x3 = np.random.randn(100, 2)
    df3 = pd.DataFrame(x3)

    x4 = np.random.randn(100, 2)
    df4 = pd.DataFrame(x4)

    df3.to_excel(writer, sheet_name='x3')
    df4.to_excel(writer, sheet_name='x4')
    writer.close()

This code does the job!
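For readers on a recent pandas release, the same result can probably be reached without touching writer.book, by opening the writer in append mode. The following is only a minimal sketch, not the original answer's method: it assumes the openpyxl engine is installed and a pandas version whose ExcelWriter accepts mode="a", and it uses a temporary directory as a hypothetical stand-in for the real path.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical location for the demo; substitute your own workbook path.
path = os.path.join(tempfile.mkdtemp(), "PhD_data.xlsx")

# First session: create the workbook with sheets x1 and x2.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x1")
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x2")

# Later session: mode="a" reopens the file and keeps the existing sheets.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x3")
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x4")

# Read back the sheet names to confirm nothing was overwritten.
with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names
print(sheet_names)
```

The with blocks save and close the file on exit, so no explicit save() or close() call is needed.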



Answer 2:

In the example you shared, you are loading the existing file into book and setting writer.book to book. In the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets), you are iterating over each sheet in the workbook as ws. The sheet's title is ws.title, so you are creating a dictionary of {sheet_title: sheet} key-value pairs. This dictionary is then assigned to writer.sheets. Essentially, these steps just load the existing data from 'Masterfile.xlsx' and populate your writer with it.
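To make the dictionary step concrete, here is a small sketch that builds the same mapping on an in-memory openpyxl workbook. The sheet names x1 and x2 are chosen only for illustration, and no pandas writer is involved.

```python
from openpyxl import Workbook

# In-memory workbook; a new Workbook starts with one default sheet.
book = Workbook()
book.active.title = "x1"
book.create_sheet("x2")

# The expression from the question: one (title, worksheet) pair per sheet.
sheets = dict((ws.title, ws) for ws in book.worksheets)

# Equivalent dict comprehension, which many people find easier to read.
same = {ws.title: ws for ws in book.worksheets}

print(list(sheets))                 # the keys are the sheet titles
print(sheets["x2"] is book["x2"])   # the values are the worksheet objects
```

So ws is each worksheet object in turn, ws.title is its name as shown on the Excel tab, and dict(...) collects the pairs into a lookup table keyed by name.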

Now let's say you already have a file with x1 and x2 as sheets. You can use the example code to load the file and then do something like this to add x3 and x4:

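    import pandas as pd
    from openpyxl import load_workbook

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
    # Load the existing workbook first (as in the example code); otherwise
    # the new writer would start from an empty file
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book  # relies on older pandas behaviour
    df3.to_excel(writer, sheet_name='x3', index=False)
    df4.to_excel(writer, sheet_name='x4', index=False)
    writer.close()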

That should do what you are looking for.



Answer 3:

I would strongly recommend you work directly with openpyxl since it now supports Pandas DataFrames.

This allows you to concentrate on the relevant Excel and Pandas code.
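A rough sketch of what working directly with openpyxl might look like, assuming its dataframe_to_rows helper from openpyxl.utils.dataframe. The temporary path, the sheet names, and the small df3 frame are hypothetical and exist only to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Hypothetical location for the demo; substitute your own workbook path.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

# Create a workbook with one sheet so there is an existing file to extend.
wb = Workbook()
wb.active.title = "x1"
wb.save(path)

df3 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Reopen the file, add a sheet, and copy the DataFrame in row by row.
wb = load_workbook(path)
ws = wb.create_sheet("x3")
for row in dataframe_to_rows(df3, index=False, header=True):
    ws.append(row)
wb.save(path)

# Reload from disk to confirm both sheets are present.
check = load_workbook(path)
print(check.sheetnames)
```

Here the workbook is loaded, modified, and saved by openpyxl alone, so there is no ExcelWriter state to set up.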



Answer 4:

Here is a simple example of writing several DataFrames to an Excel file at once, and of adding data as a new sheet to an Excel file that has already been written and closed.

The first time you write to the Excel file (writing "df1" and "df2" to "1st_sheet" and "2nd_sheet"):

    import pandas as pd
    from openpyxl import load_workbook

    df1 = pd.DataFrame([[1], [1]], columns=['a'])
    df2 = pd.DataFrame([[2], [2]], columns=['b'])
    df3 = pd.DataFrame([[3], [3]], columns=['c'])

    excel_dir = "my/excel/dir"

    # The with block saves and closes the file on exit, so no save() call is needed
    with pd.ExcelWriter(excel_dir, engine='xlsxwriter') as writer:
        df1.to_excel(writer, sheet_name='1st_sheet')
        df2.to_excel(writer, sheet_name='2nd_sheet')

After you close the Excel file, suppose you want to "append" data to the same file but on another sheet, for example "df3" on a sheet named "3rd_sheet":

    book = load_workbook(excel_dir)
    with pd.ExcelWriter(excel_dir, engine='openpyxl') as writer:
        # Attach the existing workbook and its sheets to the writer
        # (relies on older pandas behaviour)
        writer.book = book
        writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

        # Your DataFrame to append
        df3.to_excel(writer, sheet_name='3rd_sheet')

Note that the file must not be in .xls format; use .xlsx instead.
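The two phases above (first write, later append) could also be folded into one small helper. This is a sketch under assumptions, not part of the original answer: write_sheet is a hypothetical name, the openpyxl engine is used for both phases, and the pandas version is assumed to support mode="a" in ExcelWriter.

```python
import os
import tempfile

import pandas as pd


def write_sheet(path, df, sheet_name):
    """Hypothetical helper: add df as a new sheet, creating the file if needed."""
    # Append when the workbook already exists, otherwise start a new file.
    mode = "a" if os.path.exists(path) else "w"
    with pd.ExcelWriter(path, engine="openpyxl", mode=mode) as writer:
        df.to_excel(writer, sheet_name=sheet_name)


# Hypothetical location for the demo; substitute your own .xlsx path.
excel_path = os.path.join(tempfile.mkdtemp(), "example.xlsx")

write_sheet(excel_path, pd.DataFrame([[1], [1]], columns=["a"]), "1st_sheet")
write_sheet(excel_path, pd.DataFrame([[2], [2]], columns=["b"]), "2nd_sheet")
write_sheet(excel_path, pd.DataFrame([[3], [3]], columns=["c"]), "3rd_sheet")

with pd.ExcelFile(excel_path) as xls:
    names = xls.sheet_names
print(names)
```

The helper does not handle a sheet name that already exists in the file; how pandas reacts in that case depends on the version and on the if_sheet_exists option.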


