I'm looking for the best approach for inserting a row into a spreadsheet using openpyxl.
Effectively, I have a spreadsheet (Excel 2007) which has a header row, foll
I took Dallas's solution and added support for merged cells:
import re

def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
    skip_list = []
    idx = row_idx - 1 if above else row_idx
    # Copy from the bottom up so no row is overwritten before it has been moved
    for (new, old) in zip(range(self.max_row+cnt,idx+cnt,-1),range(self.max_row,idx,-1)):
        for c_idx in range(1,self.max_column+1):
            col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
            print("Copying %s%d to %s%d."%(col,old,col,new))
            source = self["%s%d"%(col,old)]
            target = self["%s%d"%(col,new)]
            if source.coordinate in skip_list:
                continue
            if source.coordinate in self.merged_cells:
                # This is a merged cell
                for _range in self.merged_cell_ranges:
                    merged_cells_list = [x for x in cells_from_range(_range)][0]
                    if source.coordinate in merged_cells_list:
                        skip_list = merged_cells_list
                        self.unmerge_cells(_range)
                        new_range = re.sub(str(old),str(new),_range)
                        self.merge_cells(new_range)
                        break
            if source.data_type == Cell.TYPE_FORMULA:
                target.value = re.sub(
                    r"(\$?[A-Z]{1,3})%d"%(old),
                    lambda m: m.group(1) + str(new),
                    source.value
                )
            else:
                target.value = source.value
            target.number_format = source.number_format
            target.font = source.font.copy()
            target.alignment = source.alignment.copy()
            target.border = source.border.copy()
            target.fill = source.fill.copy()
    idx = idx + 1
    # Clear the freed rows and optionally copy style and formulae from the row above
    for row in range(idx,idx+cnt):
        for c_idx in range(1,self.max_column+1):
            col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
            #print("Clearing value in cell %s%d"%(col,row))
            cell = self["%s%d"%(col,row)]
            cell.value = None
            source = self["%s%d"%(col,row-1)]
            if copy_style:
                cell.number_format = source.number_format
                cell.font = source.font.copy()
                cell.alignment = source.alignment.copy()
                cell.border = source.border.copy()
                cell.fill = source.fill.copy()
            if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                #print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                cell.value = re.sub(
                    r"(\$?[A-Z]{1,3})%d"%(row - 1),
                    lambda m: m.group(1) + str(row),
                    source.value
                )
To insert a row into an Excel spreadsheet using openpyxl in Python, the code below can help:
import openpyxl
file = "xyz.xlsx"
# Load the workbook from the file name provided by the user
book = openpyxl.load_workbook(file)
# Open the sheet whose index is 0
sheet = book.worksheets[0]
# insert_rows(idx, amount=1) inserts a row or rows before row idx;
# amount is the number of rows to add and is optional
sheet.insert_rows(13)
# Save the workbook so the inserted row is written back to the file
book.save(file)
openpyxl has a similar function for inserting columns: insert_cols(idx, amount=1).
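As a minimal sketch of both calls on an in-memory workbook (the sample data is made up for illustration, and this assumes an openpyxl release recent enough to ship insert_rows and insert_cols):

```python
from openpyxl import Workbook

# Build a small in-memory sheet; the values are illustrative only
wb = Workbook()
ws = wb.active
ws.append(["id", "name"])
ws.append([1, "alpha"])
ws.append([2, "beta"])

# Insert one empty row before row 2, so the data rows move down by one
ws.insert_rows(2)

# Insert two empty columns before column A, so existing columns move right by two
ws.insert_cols(1, amount=2)

# Read back the header row and the first data row after both inserts
header = [cell.value for cell in ws[1]]
first_data_row = [cell.value for cell in ws[3]]
```

As far as I know, these calls move cell values but do not rewrite formulae or other references that point at the shifted cells, so those would still need separate handling.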
I've written a function which will insert either an entire row anywhere you want in a spreadsheet or an entire 2D table, using openpyxl.
Every line of the function is explained with a comment. If you want to insert just a single row, wrap it in a list: if row = [1,2,3,4,5], then set your input to [[1,2,3,4,5]]. If you want this row to go into the top row of your spreadsheet (starting at A1), then Start = [1,1].
You can indeed overwrite the file by saving under the same file name, as you can see in my example at the bottom.
def InputList(Start, List):
    # Writes a list into the sheet from a start point; len(Start) must equal 2, where Start = [1,1] is cell A1.
    # List must be a two-dimensional list; to input a single row, use len(List) == 1, e.g. List = [[1,2,3,4]]
    x = 0 #Sets up a variable to go through List columns
    y = 0 #Sets up a variable to go through List rows
    l = 0 #Sets up a variable to count additional columns against Start[1] to allow for column reset on each new row
    for row in List: #For every row in List
        l = 0 #Set additional columns to zero
        for cell in row: #For every cell in row
            ws.cell(row=Start[0], column=Start[1]).value = List[y][x] #Set value for current cell
            x = x + 1 #Move to next data input (List) column
            Start[1] = Start[1] + 1 #Move to next Excel column
            l = l + 1 #Count additional row length
        y = y + 1 #Move to next data input (List) row
        Start[0] = Start[0] + 1 #Move to next Excel row
        x = 0 #Move back to first column of input data (ready for next row)
        Start[1] = Start[1] - l #Reset Excel column back to original start column, ready to write next row
Example with single row being inserted at start of row 7:
from openpyxl import load_workbook
wb = load_workbook('New3.xlsx')
ws = wb.active
def InputList(Start, List):
    # Writes a list into the sheet from a start point; len(Start) must equal 2, where Start = [1,1] is cell A1.
    # List must be a two-dimensional list; to input a single row, use len(List) == 1, e.g. List = [[1,2,3,4]]
    x = 0 #Sets up a variable to go through List columns
    y = 0 #Sets up a variable to go through List rows
    l = 0 #Sets up a variable to count additional columns against Start[1] to allow for column reset on each new row
    for row in List: #For every row in List
        l = 0 #Set additional columns to zero
        for cell in row: #For every cell in row
            ws.cell(row=Start[0], column=Start[1]).value = List[y][x] #Set value for current cell
            x = x + 1 #Move to next data input (List) column
            Start[1] = Start[1] + 1 #Move to next Excel column
            l = l + 1 #Count additional row length
        y = y + 1 #Move to next data input (List) row
        Start[0] = Start[0] + 1 #Move to next Excel row
        x = 0 #Move back to first column of input data (ready for next row)
        Start[1] = Start[1] - l #Reset Excel column back to original start column, ready to write next row
test = [[1,2,3,4]]
InputList([7,1], test)
wb.save('New3.xlsx')
== Updated to a fully functional version, based on feedback here: groups.google.com/forum/#!topic/openpyxl-users/wHGecdQg3Iw. ==
As the others have pointed out, openpyxl does not provide this functionality, but I have extended the Worksheet class as follows to implement inserting rows. Hope this proves useful to others.
import copy
import re

def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
    """Inserts new (empty) rows into worksheet at specified row index.
    :param row_idx: Row index specifying where to insert new rows.
    :param cnt: Number of rows to insert.
    :param above: Set True to insert rows above specified row index.
    :param copy_style: Set True if new rows should copy style of immediately above row.
    :param fill_formulae: Set True if new rows should take on formula from immediately above row, with references updated to the new rows.
    Usage:
    * insert_rows(2, 10, above=True, copy_style=False)
    """
    CELL_RE = re.compile(r"(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")
    row_idx = row_idx - 1 if above else row_idx
    def replace(m):
        row = m.group('row')
        prefix = "$" if row.find("$") != -1 else ""
        row = int(row.replace("$",""))
        row += cnt if row > row_idx else 0
        return m.group('col') + prefix + str(row)
    # First, we shift all cells down cnt rows...
    old_cells = set()
    old_fas = set()
    new_cells = dict()
    new_fas = dict()
    for c in self._cells.values():
        old_coor = c.coordinate
        # Shift all references to anything below row_idx
        if c.data_type == Cell.TYPE_FORMULA:
            c.value = CELL_RE.sub(
                replace,
                c.value
            )
            # Here, we need to properly update the formula references to reflect new row indices
            if old_coor in self.formula_attributes and 'ref' in self.formula_attributes[old_coor]:
                self.formula_attributes[old_coor]['ref'] = CELL_RE.sub(
                    replace,
                    self.formula_attributes[old_coor]['ref']
                )
        # Do the magic to set up our actual shift
        if c.row > row_idx:
            old_coor = c.coordinate
            old_cells.add((c.row,c.col_idx))
            c.row += cnt
            new_cells[(c.row,c.col_idx)] = c
            if old_coor in self.formula_attributes:
                old_fas.add(old_coor)
                fa = self.formula_attributes[old_coor].copy()
                new_fas[c.coordinate] = fa
    for coor in old_cells:
        del self._cells[coor]
    self._cells.update(new_cells)
    for fa in old_fas:
        del self.formula_attributes[fa]
    self.formula_attributes.update(new_fas)
    # Next, we need to shift all the Row Dimensions below our new rows down by cnt...
    for row in range(len(self.row_dimensions)-1+cnt,row_idx+cnt,-1):
        new_rd = copy.copy(self.row_dimensions[row-cnt])
        new_rd.index = row
        self.row_dimensions[row] = new_rd
        del self.row_dimensions[row-cnt]
    # Now, create our new rows, with all the pretty cells
    row_idx += 1
    for row in range(row_idx,row_idx+cnt):
        # Create a Row Dimension for our new row
        new_rd = copy.copy(self.row_dimensions[row-1])
        new_rd.index = row
        self.row_dimensions[row] = new_rd
        for col in range(1,self.max_column+1):
            col = get_column_letter(col)
            cell = self.cell('%s%d'%(col,row))
            cell.value = None
            source = self.cell('%s%d'%(col,row-1))
            if copy_style:
                cell.number_format = source.number_format
                cell.font = source.font.copy()
                cell.alignment = source.alignment.copy()
                cell.border = source.border.copy()
                cell.fill = source.fill.copy()
            if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                s_coor = source.coordinate
                if s_coor in self.formula_attributes and 'ref' not in self.formula_attributes[s_coor]:
                    fa = self.formula_attributes[s_coor].copy()
                    self.formula_attributes[cell.coordinate] = fa
                # print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                cell.value = re.sub(
                    r"(\$?[A-Z]{1,3}\$?)%d"%(row - 1),
                    lambda m: m.group(1) + str(row),
                    source.value
                )
                cell.data_type = Cell.TYPE_FORMULA
    # Check for Merged Cell Ranges that need to be expanded to contain new cells
    for cr_idx, cr in enumerate(self.merged_cell_ranges):
        self.merged_cell_ranges[cr_idx] = CELL_RE.sub(
            replace,
            cr
        )

Worksheet.insert_rows = insert_rows
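The reference-shifting step is the easiest part to try in isolation. Below is a small standalone sketch of the same idea using only the standard library; shift_rows is a hypothetical helper name, not part of openpyxl, and the sample formula is made up:

```python
import re

# Same pattern as CELL_RE above: an optional "$", column letters,
# then an optional "$" and the row number
CELL_RE = re.compile(r"(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")

def shift_rows(text, row_idx, cnt):
    """Move every row reference below row_idx down by cnt rows."""
    def replace(m):
        row = m.group("row")
        prefix = "$" if row.startswith("$") else ""
        number = int(row.lstrip("$"))
        if number > row_idx:
            number += cnt
        return m.group("col") + prefix + str(number)
    return CELL_RE.sub(replace, text)

# Insert 3 rows below row 4: references to rows 5 and later move down by 3
formula = shift_rows("=SUM(A2:A10)+$B$5", row_idx=4, cnt=3)
merged = shift_rows("A5:C6", row_idx=4, cnt=3)
print(formula)  # =SUM(A2:A13)+$B$8
print(merged)   # A8:C9
```

A pattern this simple will also rewrite anything that merely looks like a cell reference, such as the function name LOG10 or text inside a quoted string, so treat it as a starting point rather than a full formula parser.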
Unfortunately, there isn't really a better way to do this than to read in the file and use a library like xlwt to write out a new Excel file (with your new row inserted at the top). Excel doesn't work like a database that you can read and append to. You unfortunately just have to read in the information, manipulate it in memory, and write it out to what is essentially a new file.
Openpyxl Worksheets have limited functionality when it comes to row- or column-level operations. The only properties a Worksheet has that relate to rows/columns are row_dimensions and column_dimensions, which store "RowDimensions" and "ColumnDimensions" objects for each row and column, respectively. These dictionaries are also used in functions like get_highest_row() and get_highest_column().
Everything else operates on a cell level, with Cell objects being tracked in the dictionary _cells (and their style tracked in the dictionary _styles). Most functions that look like they're doing anything on a row or column level are actually operating on a range of cells (such as the aforementioned append()).
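To make the cell-level picture concrete, here is a rough sketch of what inserting rows means when cells live in a dictionary. A plain dict stands in for the worksheet's internal _cells; the (row, column) key shape and the name shift_cells_down are assumptions for illustration, not openpyxl API:

```python
def shift_cells_down(cells, row_idx, cnt=1):
    """Return a new dict in which every cell at or below row_idx sits cnt rows lower."""
    shifted = {}
    for (row, col), value in cells.items():
        new_row = row + cnt if row >= row_idx else row
        shifted[(new_row, col)] = value
    return shifted

# Keys are (row, column); the values stand in for Cell objects
cells = {
    (1, 1): "header",
    (2, 1): "a",
    (3, 1): "b",
}

# Make room for one new row at row 2
result = shift_cells_down(cells, row_idx=2)
print(result)  # {(1, 1): 'header', (3, 1): 'a', (4, 1): 'b'}
```

Building a fresh dictionary avoids overwriting an entry that has not been moved yet, which is the same reason the in-place versions in this thread collect the old and new keys before updating.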
The simplest thing to do would be what you suggested: create a new sheet, append your header row, append your new data rows, append your old data rows, delete the old sheet, then rename your new sheet to the old one. One problem with this method is the loss of row/column dimension attributes and cell styles, unless you specifically copy them, too.
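A minimal sketch of that rebuild approach, assuming a recent openpyxl release (iter_rows with values_only is not available in old versions). The helper name rebuild_with_rows, the "_tmp" suffix, and the sample data are made up for illustration, and only values are copied, so styles and dimensions are lost as described:

```python
from openpyxl import Workbook

def rebuild_with_rows(wb, sheet_name, new_rows):
    """Recreate sheet_name with new_rows placed directly under the header row."""
    old = wb[sheet_name]
    rows = list(old.iter_rows(values_only=True))
    header, data = rows[0], rows[1:]

    # Build the replacement sheet: header, then new rows, then old rows
    new = wb.create_sheet(title=sheet_name + "_tmp")
    new.append(header)
    for row in new_rows:
        new.append(row)
    for row in data:
        new.append(row)

    # Drop the old sheet and give the new one its name
    wb.remove(old)
    new.title = sheet_name
    return new

# Illustrative in-memory workbook
wb = Workbook()
ws = wb.active
ws.title = "Data"
ws.append(["id", "name"])
ws.append([1, "alpha"])

ws = rebuild_with_rows(wb, "Data", [[0, "zero"]])
values = [list(row) for row in ws.iter_rows(values_only=True)]
```

The rebuilt sheet ends up last in the workbook's sheet order, which may matter if other code looks sheets up by index.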
Alternatively, you could create your own functions that insert rows or columns.
I had a large number of very simple worksheets that I needed to delete columns from. Since you asked for explicit examples, I'll provide the function I quickly threw together to do this:
from openpyxl.cell import get_column_letter

def ws_delete_column(sheet, del_column):
    for row_num in range(1, sheet.get_highest_row()+1):
        for col_num in range(del_column, sheet.get_highest_column()+1):
            coordinate = '%s%s' % (get_column_letter(col_num),
                                   row_num)
            adj_coordinate = '%s%s' % (get_column_letter(col_num + 1),
                                       row_num)
            # Handle Styles.
            # This is important to do if you have any differing
            # 'types' of data being stored, as you may otherwise get
            # an output Worksheet that's got improperly formatted cells.
            # Or worse, an error gets thrown because you tried to copy
            # a string value into a cell that's styled as a date.
            if adj_coordinate in sheet._styles:
                sheet._styles[coordinate] = sheet._styles[adj_coordinate]
                sheet._styles.pop(adj_coordinate, None)
            else:
                sheet._styles.pop(coordinate, None)
            if adj_coordinate in sheet._cells:
                sheet._cells[coordinate] = sheet._cells[adj_coordinate]
                sheet._cells[coordinate].column = get_column_letter(col_num)
                sheet._cells[coordinate].row = row_num
                sheet._cells[coordinate].coordinate = coordinate
                sheet._cells.pop(adj_coordinate, None)
            else:
                sheet._cells.pop(coordinate, None)
        # sheet.garbage_collect()
I pass it the worksheet that I'm working with, and the column number I want deleted, and away it goes. I know it isn't exactly what you wanted, but I hope this information helped!
EDIT: Noticed someone gave this another vote, and figured I should update it. The coordinate system in Openpyxl changed sometime in the past couple of years, introducing a coordinate attribute for items in _cells. This needs to be updated, too, or the rows will be left blank (instead of deleted), and Excel will throw an error about problems with the file. This works for Openpyxl 2.2.3 (untested with later versions).