How to sort an Excel sheet using Python

星月不相逢 2021-01-06 13:44

I am using Python 3.4 and xlrd. I want to sort the Excel sheet by its primary column before processing it. Is there a library that can do this?

2 answers
  •  小鲜肉
    小鲜肉 (original poster)
    2021-01-06 14:12

    There are a couple of ways to do this. The first option is to use xlrd, as you have this tagged. The biggest downside is that xlrd only reads workbooks; writing the result requires xlwt, which produces the legacy XLS format and cannot write XLSX.

    These examples use an Excel document with this format:

    [Image: Text Excel Layout]

    Utilizing xlrd and a few modifications from this answer:

    import xlwt
    from xlrd import open_workbook
    
    target_column = 0     # This example only has 1 column, and it is 0 indexed
    
    # Note: xlrd 2.0 and later read only .xls files; opening .xlsx needs xlrd < 2.0
    book = open_workbook('test.xlsx')
    sheet = book.sheets()[0]
    data = [sheet.row_values(i) for i in range(sheet.nrows)]
    labels = data[0]    # Don't sort our headers
    data = data[1:]     # Data begins on the second row
    data.sort(key=lambda x: x[target_column])
    
    bk = xlwt.Workbook()
    out_sheet = bk.add_sheet(sheet.name)
    
    # Write the header row first, then the sorted data below it
    for idx, label in enumerate(labels):
        out_sheet.write(0, idx, label)
    
    for idx_r, row in enumerate(data):
        for idx_c, value in enumerate(row):
            out_sheet.write(idx_r+1, idx_c, value)
    
    bk.save('result.xls')    # Notice this is xls, not xlsx like the original file is
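
One caveat with the plain list sort above: xlrd returns numeric cells as floats and empty cells as empty strings, and Python 3 raises a TypeError when it compares a str with a float. The sketch below shows one possible way to sort such mixed rows while keeping the header in place. The names `sort_key` and `sort_rows` are hypothetical helpers for illustration, not part of xlrd, and the sample rows are made up.

```python
def sort_key(value):
    """Rank numbers first, then text (case-insensitive), then empty cells."""
    if value == "" or value is None:
        return (2, "")
    if isinstance(value, (int, float)):
        return (0, value)
    return (1, str(value).lower())


def sort_rows(rows, column=0, has_header=True):
    """Return a new list of rows sorted by `column`, keeping the header first."""
    if not rows:
        return []
    header = rows[:1] if has_header else []
    body = rows[1:] if has_header else rows
    return header + sorted(body, key=lambda row: sort_key(row[column]))


# Made-up rows in the shape that sheet.row_values() produces
rows = [
    ["Header Row"],
    ["pear"],
    [3.0],
    [""],
    ["Apple"],
    [1.5],
]
sorted_rows = sort_rows(rows)
print(sorted_rows)
```

The key returns a tuple whose first element is a type rank, so values of different types are never compared with each other directly.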
    

    This outputs the following workbook:

    [Image: XLRD output]


    Another option, and one that can write XLSX output, is to use pandas. The code is also shorter:

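    import pandas as pd
    
    xl = pd.ExcelFile("test.xlsx")
    df = xl.parse("Sheet1")
    # sort_values is the current API; the older DataFrame.sort has been removed
    df = df.sort_values(by="Header Row")
    
    # The context manager saves and closes the file on exit
    with pd.ExcelWriter('output.xlsx') as writer:
        df.to_excel(writer, sheet_name='Sheet1', columns=["Header Row"], index=False)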
    

    This outputs:

    [Image: Pandas Output]

    In the to_excel call, index is set to False so that the pandas DataFrame index isn't written to the Excel document. The rest of the keywords should be self-explanatory.
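
To see why the index matters, here is a small in-memory sketch that needs no Excel file. The DataFrame contents are made up for illustration. After sorting, each row keeps its original index label, so those labels would appear as an extra, out-of-order column if the index were written out.

```python
import pandas as pd

# Made-up data using the same column name as the example workbook
df = pd.DataFrame({"Header Row": ["pear", "apple", "fig"]})

sorted_df = df.sort_values(by="Header Row")

# The original index labels travel with their rows
print(list(sorted_df["Header Row"]))   # ['apple', 'fig', 'pear']
print(list(sorted_df.index))           # [1, 2, 0]

# Optional: renumber the rows if a clean 0..n-1 index is wanted
clean_df = sorted_df.reset_index(drop=True)
print(list(clean_df.index))            # [0, 1, 2]
```

Passing index=False when writing is usually enough; reset_index is only needed if later code relies on positional labels.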
