Organizing data read from Excel to Pandas DataFrame

借酒劲吻你 2021-02-15 15:32

My goal with this script is to: 1. read time series data in from an Excel file (>100,000k rows) as well as headers (Labels, Units) 2. convert Excel numeric dates to best datetime obje

1 answer
  • 2021-02-15 16:11

    You can use pandas directly here. I usually like to create a dictionary of DataFrames, with the sheet names as keys:

    In [10]: import pandas as pd
    
    In [11]: xl = pd.ExcelFile(r"C:\GreenCSV\Calgary\CWater.xlsx")  # raw string keeps the backslashes literal
    
    In [12]: xl.sheet_names  # in your example it may be different
    Out[12]: [u'Sheet1', u'Sheet2', u'Sheet3']
    
    In [13]: dfs = {sheet: xl.parse(sheet) for sheet in xl.sheet_names}
    
    In [14]: dfs['Sheet1'] # access DataFrame by sheet name
    

    You can check out the docs on parse, which offers more options (for example skiprows); these allow you to parse individual sheets with much more control...
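As a rough illustration of those options, here is a minimal sketch that assumes a layout the question only hints at: one row of labels, one row of units, then data whose first column holds Excel serial dates. The workbook is built in memory so the example runs on its own; the column names, the values and the helper `build_demo_workbook` are made up for illustration, and writing the demo file assumes `openpyxl` is installed.

```python
import io

import pandas as pd


def build_demo_workbook() -> io.BytesIO:
    """Create a tiny in-memory workbook: a label row, a units row, then data."""
    raw = pd.DataFrame(
        [
            ["Date", "Flow"],   # labels (hypothetical)
            ["days", "m3/s"],   # units (hypothetical)
            [44242, 1.5],       # Excel serial date, value
            [44243, 1.7],
        ]
    )
    buf = io.BytesIO()
    with pd.ExcelWriter(buf, engine="openpyxl") as writer:
        raw.to_excel(writer, sheet_name="Sheet1", header=False, index=False)
    buf.seek(0)
    return buf


xl = pd.ExcelFile(build_demo_workbook())

# Use the label row as column names and skip the units row (file row 1).
df = xl.parse("Sheet1", header=0, skiprows=[1])

# Read the units row on its own and keep it as a label -> unit mapping.
units = xl.parse("Sheet1", header=0, nrows=1).iloc[0].to_dict()

# Excel serial dates count days from 1899-12-30 in the default 1900 date system.
df["Date"] = pd.to_datetime(df["Date"], unit="D", origin="1899-12-30")

print(df)
print(units)
```

Keeping the units in a separate dictionary leaves the data columns numeric. For a real file, pass its path to `pd.ExcelFile` instead of the in-memory buffer and adjust `skiprows` to match the actual header layout.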
