My goal with this script is to: 1. read time-series data in from an Excel file (>100,000k rows) as well as headers (Labels, Units) 2. convert Excel numeric dates to best datetime obje
You can use pandas directly here. I usually like to create a dictionary of DataFrames (with the keys being the sheet names):
In [11]: xl = pd.ExcelFile(r"C:\GreenCSV\Calgary\CWater.xlsx")  # raw string so the backslashes are not treated as escapes
In [12]: xl.sheet_names # in your example it may be different
Out[12]: [u'Sheet1', u'Sheet2', u'Sheet3']
In [13]: dfs = {sheet: xl.parse(sheet) for sheet in xl.sheet_names}
In [14]: dfs['Sheet1'] # access DataFrame by sheet name
You can check out the docs on parse, which offers some more options (for example skiprows), and these allow you to parse individual sheets with much more control...
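As a rough sketch of how skiprows could fit into the dictionary approach above: the helper name `load_sheets` is hypothetical, and it assumes every sheet has the same number of leading rows to skip before the header row, which may not hold for your workbook.

```python
import pandas as pd


def load_sheets(path, skiprows=None):
    """Return {sheet name: DataFrame} for every sheet in the workbook.

    `load_sheets` is an illustrative helper, not part of pandas.
    `skiprows` is passed straight through to ExcelFile.parse, so the
    same value is applied to every sheet (an assumption of this sketch).
    """
    # ExcelFile can be used as a context manager, which closes the file afterwards
    with pd.ExcelFile(path) as xl:
        return {
            sheet: xl.parse(sheet, skiprows=skiprows)
            for sheet in xl.sheet_names
        }
```

For example, `dfs = load_sheets(r"C:\GreenCSV\Calgary\CWater.xlsx", skiprows=1)` would skip the first row of each sheet and treat the next row as the column headers. If the sheets need different options, calling `xl.parse` per sheet with its own arguments is probably the simpler route.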