Python Pandas dataframe reading exact specified range in an excel sheet

北海茫月 2020-12-28 19:17

I have a lot of different tables (and other unstructured data) in an Excel sheet. I need to create a dataframe from the range 'A3:D20' of 'Sheet2' of the Excel sheet 'data

3 Answers
  • 2020-12-28 19:50

    My answer, tested with pandas 0.25, works well:

    pd.read_excel('resultat-elections-2012.xls', sheet_name='France entière T1T2', skiprows=2, nrows=5, usecols='A:H')
    pd.read_excel('resultat-elections-2012.xls', index_col=None, skiprows=2, nrows=5, sheet_name='France entière T1T2', usecols=range(0, 8))
    

    So: I need the data after the first two lines, then the desired number of lines (5), and columns A to H.
    Be careful: @shane's answer needs to be updated for the new parameter names in pandas.
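The two calls above select the same columns, once as the string 'A:H' and once as range(0, 8). A minimal sketch of that correspondence follows; the helper names column_index and column_range are hypothetical and not part of pandas.

```python
def column_index(letters: str) -> int:
    """Convert an Excel column label such as 'A' or 'AB' to a 0-based index."""
    if not letters:
        raise ValueError("empty column label")
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        # Column labels behave like a base-26 number with digits A..Z = 1..26.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def column_range(spec: str) -> range:
    """Convert a single 'FIRST:LAST' column span to the equivalent range of 0-based indices."""
    first, last = spec.split(":")
    return range(column_index(first), column_index(last) + 1)
```

With this sketch, column_range('A:H') gives the same positions as range(0, 8), which is why either form can be passed as usecols.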

    (Screenshot: my original Excel sheet)

    (Screenshot: the result of pandas read_excel)

  • 2020-12-28 19:55

    One way to do this is to use the openpyxl module.

    Here's an example:

    from openpyxl import load_workbook
    
    wb = load_workbook(filename='data.xlsx', 
                       read_only=True)
    
    ws = wb['Sheet2']
    
    # Read the cell values into a list of lists
    data_rows = []
    for row in ws['A3':'D20']:
        data_cols = []
        for cell in row:
            data_cols.append(cell.value)
        data_rows.append(data_cols)
    
    # Transform into dataframe
    import pandas as pd
    df = pd.DataFrame(data_rows)
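The DataFrame built above gets integer column labels (0 to 3). If the first row of the range (row 3 in the sheet) holds column names, which the question does not say, the list of lists can be split before building the frame. A small sketch under that assumption; split_header is a hypothetical helper and the sample values are made up for illustration.

```python
def split_header(rows):
    """Split a list of row lists into (header, body), treating the first row as column names."""
    if not rows:
        raise ValueError("the range contains no rows")
    header, *body = rows
    return header, body


# Illustrative stand-in for the data_rows list read from the worksheet.
data_rows = [
    ["name", "qty"],
    ["apple", 3],
    ["pear", 5],
]
header, body = split_header(data_rows)
```

The result can then be passed on as pd.DataFrame(body, columns=header).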
    
  • 2020-12-28 19:55

    Use the following arguments from the pandas read_excel documentation:

    • skiprows : list-like
      • Rows to skip at the beginning (0-indexed)
    • parse_cols : int or list, default None
      • If None then parse all columns,
      • If int then indicates last column to be parsed
      • If list of ints then indicates list of column numbers to be parsed
      • If string then indicates comma separated list of column names and column ranges (e.g. “A:E” or “A,C,E:F”)

    I imagine the call will look like this (in pandas 0.21 and later, parse_cols is named usecols):

    df = read_excel(filename, 'Sheet2', skiprows=2, parse_cols='A:D')
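The call above skips the first two rows and keeps columns A to D, but it does not stop at row 20. One way to cover the whole 'A3:D20' range is to derive skiprows, nrows and usecols from the range string. This is only a sketch: read_excel_kwargs is a hypothetical helper, and it assumes that every row in the range is data (header=None).

```python
import re

# Matches a rectangular range such as "A3:D20".
_RANGE = re.compile(r"^([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)$")


def read_excel_kwargs(cell_range: str) -> dict:
    """Translate a range like 'A3:D20' into keyword arguments for pandas.read_excel."""
    match = _RANGE.match(cell_range.strip())
    if match is None:
        raise ValueError(f"not a rectangular range: {cell_range!r}")
    first_col, first_row, last_col, last_row = match.groups()
    first_row, last_row = int(first_row), int(last_row)
    if first_row < 1 or last_row < first_row:
        raise ValueError(f"invalid row span in {cell_range!r}")
    return {
        "usecols": f"{first_col.upper()}:{last_col.upper()}",
        "skiprows": first_row - 1,           # rows above the range
        "nrows": last_row - first_row + 1,   # rows inside the range
        "header": None,                      # assumption: no header row in the range
    }
```

A possible use is pd.read_excel('data.xlsx', sheet_name='Sheet2', **read_excel_kwargs('A3:D20')), where the file name 'data.xlsx' is an assumption. If row 3 contains column names instead, drop header=None and reduce nrows by one.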
    