Removing rows from an Excel file in Python using xlrd, xlwt, and xlutils

情话喂你 2021-01-14 16:51

Hello everyone and thank you in advance.

I have a python script where I am opening a template excel file, adding data (while preserving the style) and saving again.

3 Answers
  • 2021-01-14 17:19

    I achieved this using the pandas package:

    import pandas as pd
    
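    # Read the Excel file
    xl = pd.ExcelFile("test.xls")
    
    # Parse the first sheet into a DataFrame
    dfs = xl.parse(xl.sheet_names[0])
    
    # Update the DataFrame as required
    # (here: drop rows whose "Name" column is blank; pandas reads empty
    # cells as NaN, so check for both NaN and the empty string)
    dfs = dfs[dfs['Name'].notna() & (dfs['Name'] != '')]
    
    # Write the updated DataFrame back to the Excel sheet
    # (this rewrites the file without the original cell styles, and writing
    # .xls needs a pandas version that still ships the xlwt engine)
    dfs.to_excel("test.xls", sheet_name='Sheet1', index=False)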
    
  • 2021-01-14 17:35

    For those of us still stuck with xlrd/xlwt/xlutils, here's a filter you could use:

    from xlutils.filter import BaseFilter
    
    class RowFilter(BaseFilter):
        """Drop the given 0-based input rows and close up the gaps."""
        rows_to_exclude: "Iterable[int]"
        _next_output_row: int
    
        def __init__(
                self,
                rows_to_exclude: "Iterable[int]",
        ):
            # A set keeps the per-cell membership test cheap
            self.rows_to_exclude = set(rows_to_exclude)
            self._next_output_row = -1
    
        def sheet(self, rdsheet, wtsheet_name):
            # Restart the output row counter for every sheet
            self._next_output_row = -1
            self.next.sheet(rdsheet, wtsheet_name)
    
        def _should_include_row(self, rdrowx):
            return rdrowx not in self.rows_to_exclude
    
        def row(self, rdrowx, wtrowx):
            if self._should_include_row(rdrowx):
                # Proceed with writing out the row to the output file
                self._next_output_row += 1
                self.next.row(
                    rdrowx, self._next_output_row,
                )
    
        # After `row()` has been called, `cell()` is called for each cell of the row
        def cell(self, rdrowx, rdcolx, wtrowx, wtcolx):
            if self._should_include_row(rdrowx):
                self.next.cell(
                    rdrowx, rdcolx, self._next_output_row, wtcolx,
                )
    
    

    Then put it to use with e.g.:

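    from xlrd import open_workbook
    from xlutils.filter import DirectoryWriter, XLRDReader, process
    
    process(
        # formatting_info=True makes xlrd load the cell styles so they can be copied
        XLRDReader(
            open_workbook("input_filename.xls", formatting_info=True),
            "output_filename.xls",
        ),
        RowFilter([3, 4, 5]),  # 0-based row indices to drop
        DirectoryWriter("output_dir"),  # writes output_dir/output_filename.xls
    )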
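To sanity-check which output row each input row will land on before running the filter, the renumbering that `RowFilter` performs can be modelled in plain Python. This is only a sketch: `plan_row_mapping` is a hypothetical helper, not part of xlutils, and it assumes the 0-based row indices that xlrd uses.

```python
def plan_row_mapping(n_rows, rows_to_exclude):
    """Map each kept input row index to its output row index (both 0-based).

    Hypothetical helper for illustration; it mirrors the counter logic of
    RowFilter but touches no workbook.
    """
    excluded = set(rows_to_exclude)
    mapping = {}
    next_output_row = 0
    for rdrowx in range(n_rows):
        if rdrowx in excluded:
            continue  # dropped rows get no output index
        mapping[rdrowx] = next_output_row
        next_output_row += 1
    return mapping


# Example: an 8-row sheet with rows 3, 4 and 5 removed
mapping = plan_row_mapping(8, [3, 4, 5])
```

Because the indices are 0-based, `RowFilter([3, 4, 5])` removes what Excel displays as rows 4 to 6.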
    
  • 2021-01-14 17:36

    xlwt does not provide a simple interface for doing this, but I've had success with a somewhat similar problem (inserting multiple copies of a row into a copied workbook) by directly changing the worksheet's rows attribute and the row numbers on the row and cell objects.

    Given the number of rows you want to delete and the starting number of the first row you want to keep, something like this might work:

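    # NOTE: this relies on xlwt's private, name-mangled attributes.
    # worksheet.rows is a dict keyed by row index, so it cannot be sliced.
    first_deleted_row = first_kept_row - number_to_delete
    # drop the rows being deleted
    for rowx in range(first_deleted_row, first_kept_row):
        worksheet.rows.pop(rowx, None)
    # shift the remaining rows up, lowest index first so nothing is overwritten
    for old_row_number in sorted(r for r in worksheet.rows if r >= first_kept_row):
        row = worksheet.rows.pop(old_row_number)
        new_row_number = old_row_number - number_to_delete
        row._Row__idx = new_row_number
        for cell in row._Row__cells.values():
            if cell:
                cell.rowx = new_row_number
        worksheet.rows[new_row_number] = row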
    

    Do you have merged ranges in the rows you want to delete, or below them? If so you'll also need to run through the worksheet's merged_ranges attribute and update the rows for them. Also, if you have more rows to delete than rows in your footer, you'll need to

    As a side note - I was able to write text to my worksheet and preserve the predefined style thus:

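    def write_with_style(ws, row, col, value):
        # Look up the existing cell, if any (xlwt private attributes; both are dicts)
        existing_row = ws.rows.get(row)
        old_cell = existing_row._Row__cells.get(col) if existing_row else None
        ws.write(row, col, value)
        if old_cell is not None:
            # Restore the style index that ws.write() replaced
            ws.rows[row]._Row__cells[col].xf_idx = old_cell.xf_idx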
    

    That might let you skip having two copies of your spreadsheet open at once.
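For the merged-range adjustment mentioned earlier in this answer, a minimal sketch might look like the following. It assumes each entry of `merged_ranges` is a `(first_row, last_row, first_col, last_col)` tuple, the same order in which `write_merge` takes its arguments; `shift_merged_ranges` is a hypothetical helper, and dropping ranges that overlap the deleted rows is a simplification chosen for this example.

```python
def shift_merged_ranges(merged_ranges, first_deleted_row, number_to_delete):
    """Return merged ranges adjusted for a block of deleted rows.

    Hypothetical helper; assumes (r1, r2, c1, c2) tuples with 0-based,
    inclusive row and column bounds.
    """
    first_kept_row = first_deleted_row + number_to_delete
    updated = []
    for r1, r2, c1, c2 in merged_ranges:
        if r2 < first_deleted_row:
            # Entirely above the deleted block: unchanged
            updated.append((r1, r2, c1, c2))
        elif r1 >= first_kept_row:
            # Entirely below the deleted block: move up
            updated.append((r1 - number_to_delete, r2 - number_to_delete, c1, c2))
        # Ranges overlapping the deleted rows are dropped in this sketch
    return updated


# Example: delete rows 3, 4 and 5
ranges = [(0, 0, 0, 3), (4, 5, 0, 1), (8, 9, 2, 3)]
shifted = shift_merged_ranges(ranges, first_deleted_row=3, number_to_delete=3)
```

If the worksheet's `merged_ranges` attribute is a plain list, as assumed here, the result could be written back in place with `worksheet.merged_ranges[:] = shifted`.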
