I would like to format a spreadsheet (xls or xlsx) so that any cells containing a word or ending with a certain string are formatted by filling the background with a specifi
Preserve the original input.xls
formatting when you open it:
from xlrd import open_workbook
input_wb = open_workbook('input.xls', formatting_info=True)
Create a new workbook based on this template:
from xlutils.copy import copy as copy_workbook
output_wb = copy_workbook(input_wb)
Define some new cell styles:
from xlwt import easyxf
red_background = easyxf("pattern: pattern solid, fore_color red;")
black_with_white_font = easyxf('pattern: pattern solid, fore_color black; font: color-index white, bold on;")
Evaluate and modify your cells:
input_ws = input_wb.sheet_by_name('StackOverflow')
output_ws = output_wb.get_sheet(0)
for rindex in range(0, input_ws.nrows):
for cindex in range(0, input_ws.ncols):
input_cell = input_ws.cell(rindex, cindex)
if input_cell.value[ input_cell.value.rfind('.'): ] == 'pf':
output_ws.write(rindex, cindex, input_cell.value, red_background)
elif input_cell.value.find('deleted') >= 0:
output_ws.write(rindex, cindex, input_cell.value, black_with_white_font)
else:
pass # we don't need to modify it
Save your new workbook
output_wb.save('output.xls')
Using the above example, unmodified cells should have their original formatting intact.
Should you need to alter the cell content AND would like to preserve the original formatting (i.e. NOT use your custom easyxf
instance), you may use this snippet:
def changeCell(worksheet, row, col, text):
""" Changes a worksheet cell text while preserving formatting """
# Adapted from https://stackoverflow.com/a/7686555/1545769
previousCell = worksheet._Worksheet__rows.get(row)._Row__cells.get(col)
worksheet.write(row, col, text)
newCell = worksheet._Worksheet__rows.get(row)._Row__cells.get(col)
newCell.xf_idx = previousCell.xf_idx
# ...
changeCell(worksheet_instance, 155, 2, "New Value")
For the comparisons, you can use the string methods find
and rfind
(which searches from the right). They return the index of the position of the substring within the string. They return -1
if the substring is not found. Ergo, you see above input_cell.value.find('deleted') >= 0
to evaluate whether or not the substring 'deleted' exists. For the .pf
comparison, I used rfind
as well as something in Python called slicing.