Insert row into Excel spreadsheet using openpyxl in Python

既然无缘 2020-11-28 15:09

I'm looking for the best approach for inserting a row into a spreadsheet using openpyxl.

Effectively, I have a spreadsheet (Excel 2007) which has a header row, foll

11 answers
  • 2020-11-28 15:27

    I took Dallas's solution and added support for merged cells:

        import re

        # Targets the older openpyxl 2.x API (merged_cell_ranges, Cell.TYPE_FORMULA,
        # style.copy()). Cell and cells_from_range must be imported from that version.
        def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
            skip_list = []
            idx = row_idx - 1 if above else row_idx
            # Shift existing rows down by cnt, working upward from the bottom of the sheet.
            for (new, old) in zip(range(self.max_row + cnt, idx + cnt, -1),
                                  range(self.max_row, idx, -1)):
                for c_idx in range(1, self.max_column + 1):  # +1 so the last column is included
                    col = self.cell(row=1, column=c_idx).column  # get_column_letter(c_idx)
                    print("Copying %s%d to %s%d." % (col, old, col, new))
                    source = self["%s%d" % (col, old)]
                    target = self["%s%d" % (col, new)]
                    if source.coordinate in skip_list:
                        continue

                    if source.coordinate in self.merged_cells:
                        # This is a merged cell
                        for _range in self.merged_cell_ranges:
                            merged_cells_list = [x for x in cells_from_range(_range)][0]
                            if source.coordinate in merged_cells_list:
                                skip_list = merged_cells_list
                                self.unmerge_cells(_range)
                                # Replace only whole row numbers that follow a column letter
                                new_range = re.sub(r"(?<=[A-Z])%d\b" % old, str(new), _range)
                                self.merge_cells(new_range)
                                break

                    if source.data_type == Cell.TYPE_FORMULA:
                        # (?!\d) keeps a reference such as A1 from matching inside A10
                        target.value = re.sub(
                            r"(\$?[A-Z]{1,3})%d(?!\d)" % old,
                            lambda m: m.group(1) + str(new),
                            source.value
                        )
                    else:
                        target.value = source.value
                    target.number_format = source.number_format
                    target.font = source.font.copy()
                    target.alignment = source.alignment.copy()
                    target.border = source.border.copy()
                    target.fill = source.fill.copy()

            # Clear the inserted rows, optionally copying style and formulae from the row above.
            idx = idx + 1
            for row in range(idx, idx + cnt):
                for c_idx in range(1, self.max_column + 1):
                    col = self.cell(row=1, column=c_idx).column  # get_column_letter(c_idx)
                    cell = self["%s%d" % (col, row)]
                    cell.value = None
                    source = self["%s%d" % (col, row - 1)]
                    if copy_style:
                        cell.number_format = source.number_format
                        cell.font = source.font.copy()
                        cell.alignment = source.alignment.copy()
                        cell.border = source.border.copy()
                        cell.fill = source.fill.copy()
                    if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                        cell.value = re.sub(
                            r"(\$?[A-Z]{1,3})%d(?!\d)" % (row - 1),
                            lambda m: m.group(1) + str(row),
                            source.value
                        )
    
  • 2020-11-28 15:29

    To insert a row into an Excel spreadsheet with openpyxl, you can use the built-in insert_rows() method:

    import openpyxl
    
    file = "xyz.xlsx"
    # Load the workbook from the given file name
    book = openpyxl.load_workbook(file)
    # Open the first sheet (index 0)
    sheet = book.worksheets[0]
    
    # insert_rows(idx, amount=1) inserts rows before row idx;
    # amount is the number of rows to add and is optional
    sheet.insert_rows(13)
    
    # Save the workbook, otherwise the change is not written to disk
    book.save(file)
    

    openpyxl has a similar function for inserting columns: insert_cols(idx, amount=1).

  • 2020-11-28 15:31

    I've written a function that writes either a single row or an entire 2D table anywhere in a spreadsheet using openpyxl. Note that it overwrites the target cells rather than shifting existing rows down.

    Every line of the function is explained with a comment. To write a single row, wrap it in a list: if row = [1,2,3,4,5], pass [[1,2,3,4,5]]. To start at the top-left cell of the spreadsheet (A1), use Start = [1,1].

    You can save over the original file name, as the example at the bottom does.

    def InputList(Start, List):
        """Write a 2D list into the module-level worksheet ws, starting at Start.

        Start is [row, column], so Start = [1, 1] is cell A1. List must be
        two-dimensional; to write a single row, wrap it: List = [[1, 2, 3, 4]].
        """
        x = 0  # Index into the columns of List
        y = 0  # Index into the rows of List
        l = 0  # Columns written in the current row, used to reset Start[1]
        for row in List:  # For every row in List
            l = 0  # Reset the column count
            for cell in row:  # For every cell in row
                ws.cell(row=Start[0], column=Start[1]).value = List[y][x]  # Set value for current cell
                x = x + 1  # Move to next input (List) column
                Start[1] = Start[1] + 1  # Move to next Excel column
                l = l + 1  # Count the columns written
            y = y + 1  # Move to next input (List) row
            Start[0] = Start[0] + 1  # Move to next Excel row
            x = 0  # Back to the first column of the input data
            Start[1] = Start[1] - l  # Reset Excel column to the original start column
    

    Example with single row being inserted at start of row 7:

    from openpyxl import load_workbook
    wb = load_workbook('New3.xlsx')
    ws = wb.active
    
    def InputList(Start, List):
        """Write a 2D list into the module-level worksheet ws, starting at Start.

        Start is [row, column], so Start = [1, 1] is cell A1. List must be
        two-dimensional; to write a single row, wrap it: List = [[1, 2, 3, 4]].
        """
        x = 0  # Index into the columns of List
        y = 0  # Index into the rows of List
        l = 0  # Columns written in the current row, used to reset Start[1]
        for row in List:  # For every row in List
            l = 0  # Reset the column count
            for cell in row:  # For every cell in row
                ws.cell(row=Start[0], column=Start[1]).value = List[y][x]  # Set value for current cell
                x = x + 1  # Move to next input (List) column
                Start[1] = Start[1] + 1  # Move to next Excel column
                l = l + 1  # Count the columns written
            y = y + 1  # Move to next input (List) row
            Start[0] = Start[0] + 1  # Move to next Excel row
            x = 0  # Back to the first column of the input data
            Start[1] = Start[1] - l  # Reset Excel column to the original start column
    
    test = [[1,2,3,4]]
    InputList([7,1], test)
    
    wb.save('New3.xlsx')
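
The manual counters in InputList can also be expressed with enumerate. The following is a minimal sketch rather than the answer's own code: write_table is a hypothetical name, and a plain dict keyed by (row, column) stands in for the worksheet so the example runs without a workbook. With openpyxl, the dict assignment would presumably become ws.cell(row=..., column=...).value = value. Unlike InputList, this version does not modify its start argument.

```python
def write_table(grid, start, table):
    """Write a 2D list into grid so that table[0][0] lands on start.

    grid  -- dict mapping (row, column) to a value; a stand-in for a worksheet
    start -- (row, column) of the top-left target cell, 1-based like Excel
    table -- list of rows, each row a list of cell values
    """
    start_row, start_col = start
    for row_offset, row in enumerate(table):
        for col_offset, value in enumerate(row):
            # Existing values at the target coordinates are overwritten
            grid[(start_row + row_offset, start_col + col_offset)] = value
    return grid


# Same call as the example above: one row written at the start of row 7
grid = write_table({}, (7, 1), [[1, 2, 3, 4]])
```

Rows of different lengths are handled naturally, because each row is enumerated on its own.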
    
  • 2020-11-28 15:34

    == Updated to a fully functional version, based on feedback here: groups.google.com/forum/#!topic/openpyxl-users/wHGecdQg3Iw. ==

    As the others have pointed out, openpyxl does not provide this functionality, but I have extended the Worksheet class as follows to implement inserting rows. Hope this proves useful to others.

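    import copy
    import re

    # Cell, Worksheet and get_column_letter come from openpyxl; their import
    # paths depend on the installed version.
    def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):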
        """Inserts new (empty) rows into worksheet at specified row index.
    
        :param row_idx: Row index specifying where to insert new rows.
        :param cnt: Number of rows to insert.
        :param above: Set True to insert rows above specified row index.
        :param copy_style: Set True if new rows should copy style of immediately above row.
        :param fill_formulae: Set True if new rows should take on formula from immediately above row, filled with references to new rows.
    
        Usage:
    
        * insert_rows(2, 10, above=True, copy_style=False)
    
        """
        CELL_RE = re.compile(r"(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")
    
        row_idx = row_idx - 1 if above else row_idx
    
        def replace(m):
            row = m.group('row')
            prefix = "$" if row.find("$") != -1 else ""
            row = int(row.replace("$",""))
            row += cnt if row > row_idx else 0
            return m.group('col') + prefix + str(row)
    
        # First, we shift all cells down cnt rows...
        old_cells = set()
        old_fas   = set()
        new_cells = dict()
        new_fas   = dict()
        for c in self._cells.values():
    
            old_coor = c.coordinate
    
            # Shift all references to anything below row_idx
            if c.data_type == Cell.TYPE_FORMULA:
                c.value = CELL_RE.sub(
                    replace,
                    c.value
                )
                # Here, we need to properly update the formula references to reflect new row indices
                if old_coor in self.formula_attributes and 'ref' in self.formula_attributes[old_coor]:
                    self.formula_attributes[old_coor]['ref'] = CELL_RE.sub(
                        replace,
                        self.formula_attributes[old_coor]['ref']
                    )
    
            # Do the magic to set up our actual shift    
            if c.row > row_idx:
                old_coor = c.coordinate
                old_cells.add((c.row,c.col_idx))
                c.row += cnt
                new_cells[(c.row,c.col_idx)] = c
                if old_coor in self.formula_attributes:
                    old_fas.add(old_coor)
                    fa = self.formula_attributes[old_coor].copy()
                    new_fas[c.coordinate] = fa
    
        for coor in old_cells:
            del self._cells[coor]
        self._cells.update(new_cells)
    
        for fa in old_fas:
            del self.formula_attributes[fa]
        self.formula_attributes.update(new_fas)
    
        # Next, we need to shift all the Row Dimensions below our new rows down by cnt...
        for row in range(len(self.row_dimensions)-1+cnt,row_idx+cnt,-1):
            new_rd = copy.copy(self.row_dimensions[row-cnt])
            new_rd.index = row
            self.row_dimensions[row] = new_rd
            del self.row_dimensions[row-cnt]
    
        # Now, create our new rows, with all the pretty cells
        row_idx += 1
        for row in range(row_idx,row_idx+cnt):
            # Create a Row Dimension for our new row
            new_rd = copy.copy(self.row_dimensions[row-1])
            new_rd.index = row
            self.row_dimensions[row] = new_rd
            for col in range(1, self.max_column + 1):  # +1 so the last column is included
                col = get_column_letter(col)
                cell = self.cell('%s%d'%(col,row))
                cell.value = None
                source = self.cell('%s%d'%(col,row-1))
                if copy_style:
                    cell.number_format = source.number_format
                    cell.font      = source.font.copy()
                    cell.alignment = source.alignment.copy()
                    cell.border    = source.border.copy()
                    cell.fill      = source.fill.copy()
                if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                    s_coor = source.coordinate
                    if s_coor in self.formula_attributes and 'ref' not in self.formula_attributes[s_coor]:
                        fa = self.formula_attributes[s_coor].copy()
                        self.formula_attributes[cell.coordinate] = fa
                    # print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                    cell.value = re.sub(
                        r"(\$?[A-Z]{1,3}\$?)%d(?!\d)" % (row - 1),  # (?!\d): do not match A1 inside A10
                        lambda m: m.group(1) + str(row),
                        source.value
                    )   
                    cell.data_type = Cell.TYPE_FORMULA
    
        # Check for Merged Cell Ranges that need to be expanded to contain new cells
        for cr_idx, cr in enumerate(self.merged_cell_ranges):
            self.merged_cell_ranges[cr_idx] = CELL_RE.sub(
                replace,
                cr
            )
    
    Worksheet.insert_rows = insert_rows
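
The reference-shifting step in the function above can be tried in isolation. This is a minimal sketch that reuses the answer's CELL_RE pattern and its nested replace logic; the wrapper name shift_formula_rows is hypothetical and it works on a plain formula string, with no worksheet involved.

```python
import re

# Same pattern as CELL_RE in the answer: optional "$", column letters,
# optional "$", row digits.
CELL_RE = re.compile(r"(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")


def shift_formula_rows(formula, row_idx, cnt):
    """Return formula with every row reference below row_idx moved down by cnt."""

    def replace(match):
        row_text = match.group("row")
        prefix = "$" if row_text.startswith("$") else ""
        row = int(row_text.lstrip("$"))
        if row > row_idx:
            row += cnt
        return match.group("col") + prefix + str(row)

    return CELL_RE.sub(replace, formula)


# Two rows inserted after row 3: A1 is untouched, A10 and $B$5 move down.
shifted = shift_formula_rows("=SUM(A1:A10)+$B$5", 3, 2)
```

One limitation of the pattern is that it also matches function names that end in digits, such as LOG10, so a robust version would need to tokenize the formula first.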
    
  • 2020-11-28 15:34

    Unfortunately, there isn't really a better way to do this than to read in the file and use a library like xlwt to write out a new Excel file (with your new row inserted at the top). Excel doesn't work like a database that you can read and append to. You have to read in the data, manipulate it in memory, and write it out to what is essentially a new file.
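
The in-memory step of that read, modify, write cycle can be sketched with plain lists. This is an illustration under assumptions: insert_rows_in_memory is a hypothetical helper, and rows is assumed to already hold the sheet contents as a list of lists. Reading the file and writing the new one are left to whichever library you use.

```python
def insert_rows_in_memory(rows, index, new_rows):
    """Return a new list with new_rows spliced in before position index.

    rows     -- existing sheet contents, one list per row (header first)
    index    -- 0-based position at which the first new row should appear
    new_rows -- rows to insert
    """
    return rows[:index] + [list(r) for r in new_rows] + rows[index:]


sheet = [["id", "name"], [1, "alpha"], [2, "beta"]]
# Insert one row directly below the header; the original list is left unchanged.
updated = insert_rows_in_memory(sheet, 1, [[0, "new"]])
```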

  • 2020-11-28 15:36

    Openpyxl Worksheets have limited functionality when it comes to doing row or column level operations. The only properties a Worksheet has that relate to rows/columns are row_dimensions and column_dimensions, which store "RowDimensions" and "ColumnDimensions" objects for each row and column, respectively. These dictionaries are also used in functions like get_highest_row() and get_highest_column().

    Everything else operates on a cell level, with Cell objects being tracked in the dictionary, _cells (and their style tracked in the dictionary _styles). Most functions that look like they're doing anything on a row or column level are actually operating on a range of cells (such as the aforementioned append()).

    The simplest thing to do would be what you suggested: create a new sheet, append your header row, append your new data rows, append your old data rows, delete the old sheet, then rename your new sheet to the old one. One problem with this method is the loss of row/column dimension attributes and cell styles, unless you specifically copy them, too.

    Alternatively, you could create your own functions that insert rows or columns.
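
Because cells are tracked in a dictionary, inserting rows amounts to re-keying every cell at or below the insertion point. The following is a pure-Python model of that idea, not openpyxl's implementation: the dict keyed by (row, column) is an assumed stand-in for the worksheet's cell store, and insert_rows_in_grid is a hypothetical name.

```python
def insert_rows_in_grid(cells, idx, amount=1):
    """Return a new dict with `amount` empty rows inserted before row idx.

    cells -- dict mapping (row, column) to a value, rows and columns 1-based
    """
    shifted = {}
    for (row, col), value in cells.items():
        # Cells at or below the insertion point move down; the rest stay put
        new_row = row + amount if row >= idx else row
        shifted[(new_row, col)] = value
    return shifted


cells = {(1, 1): "header", (2, 1): "first", (3, 1): "second"}
result = insert_rows_in_grid(cells, 2)
```

Building a new dictionary avoids overwriting cells that have not been moved yet. An in-place version has to work from the bottom row upward for the same reason.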

    I had a large number of very simple worksheets that I needed to delete columns from. Since you asked for explicit examples, I'll provide the function I quickly threw together to do this:

    from openpyxl.cell import get_column_letter
    
    def ws_delete_column(sheet, del_column):
    
        for row_num in range(1, sheet.get_highest_row()+1):
            for col_num in range(del_column, sheet.get_highest_column()+1):
    
                coordinate = '%s%s' % (get_column_letter(col_num),
                                       row_num)
                adj_coordinate = '%s%s' % (get_column_letter(col_num + 1),
                                           row_num)
    
                # Handle Styles.
                # This is important to do if you have any differing
                # 'types' of data being stored, as you may otherwise get
                # an output Worksheet that's got improperly formatted cells.
                # Or worse, an error gets thrown because you tried to copy
                # a string value into a cell that's styled as a date.
    
                if adj_coordinate in sheet._styles:
                    sheet._styles[coordinate] = sheet._styles[adj_coordinate]
                    sheet._styles.pop(adj_coordinate, None)
                else:
                    sheet._styles.pop(coordinate, None)
    
                if adj_coordinate in sheet._cells:
                    sheet._cells[coordinate] = sheet._cells[adj_coordinate]
                    sheet._cells[coordinate].column = get_column_letter(col_num)
                    sheet._cells[coordinate].row = row_num
                    sheet._cells[coordinate].coordinate = coordinate
    
                    sheet._cells.pop(adj_coordinate, None)
                else:
                    sheet._cells.pop(coordinate, None)
    
            # sheet.garbage_collect()
    

    I pass it the worksheet that I'm working with, and the column number I want deleted, and away it goes. I know it isn't exactly what you wanted, but I hope this information helped!

    EDIT: Noticed someone gave this another vote, and figured I should update it. The coordinate system in Openpyxl changed sometime in the past couple of years, introducing a coordinate attribute for items in _cells. This needs to be updated, too, or the rows will be left blank (instead of deleted), and Excel will throw an error about problems with the file. This works for Openpyxl 2.2.3 (untested with later versions).
