I have a dataframe whose column names are dates (year-month) stored as strings. How can I convert these names to datetime format? I tried doing this:
To expand on jezrael's answer: the original code tries to select columns of df using the values stored in new_cols and then assign the result back to df. Since those datetime values do not yet exist as column labels in df, pandas raises an error saying it cannot find them in the index.
Instead, you need to state explicitly that you are replacing the column names, as in jezrael's answer.
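The failure described above can be reproduced with a small sketch. The exact code from the question is not shown here, so the selection `df[new_cols]` below is only an assumption about what the attempt looked like; the sample column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame whose column labels are date strings
cols = ['2000-01-01', '2000-02-01', '2000-03-01']
df = pd.DataFrame(columns=cols, data=[np.arange(3)])

# Datetime versions of the labels; df itself is untouched so far
new_cols = pd.to_datetime(df.columns)

# Assumed shape of the failing attempt: selecting by labels that
# do not exist in df.columns (they are still plain strings there)
try:
    df[new_cols]
    raised = False
except KeyError:
    raised = True

# Replacing the labels instead of selecting by them works
df.columns = new_cols
```

Selecting looks labels up, while assigning to `df.columns` replaces them, which is why only the second form succeeds.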
If you select with loc, the column labels are not changed, so you get a KeyError.
You need to assign the output to columns:
df.columns = pd.to_datetime(df.columns)
Sample:
import numpy as np
import pandas as pd

cols = ['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01']
vals = np.arange(5)
df = pd.DataFrame(columns=cols, data=[vals])
print (df)
2000-01-01 2000-02-01 2000-03-01 2000-04-01 2000-05-01
0 0 1 2 3 4
print (df.columns)
Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')
df.columns = pd.to_datetime(df.columns)
print (df.columns)
DatetimeIndex(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01',
'2000-05-01'],
dtype='datetime64[ns]', freq=None)
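Since the question mentions year-month labels, here is a minimal sketch for that case. It assumes the labels look like '2000-01' (the exact strings are not given in the question) and passes an explicit format to pd.to_datetime so the parsing is unambiguous; each label becomes the first day of its month.

```python
import numpy as np
import pandas as pd

# Hypothetical year-month labels, assumed to follow the '%Y-%m' pattern
cols = ['2000-01', '2000-02', '2000-03']
df = pd.DataFrame(columns=cols, data=[np.arange(3)])

# Parse the labels with an explicit format and assign them back
df.columns = pd.to_datetime(df.columns, format='%Y-%m')

# After the conversion a column can be selected with a Timestamp
feb = df[pd.Timestamp('2000-02-01')]
```

If the labels use a different pattern, the format string would need to be adjusted to match.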
It is also possible to convert to periods:
print (df.columns)
Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')
df.columns = pd.to_datetime(df.columns).to_period('M')
print (df.columns)
PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05'],
dtype='period[M]', freq='M')
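As a rough follow-up sketch, the period labels can be used for selection with pd.Period and converted back to timestamps with to_timestamp if needed. The shortened sample frame here is only for illustration.

```python
import numpy as np
import pandas as pd

# Shortened version of the sample frame, for illustration only
cols = ['2000-01-01', '2000-02-01', '2000-03-01']
df = pd.DataFrame(columns=cols, data=[np.arange(3)])

# Convert the string labels to monthly periods
df.columns = pd.to_datetime(df.columns).to_period('M')

# Select a column by its monthly period
mar = df[pd.Period('2000-03', freq='M')]

# Convert the period labels back to timestamps (start of each month)
back = df.columns.to_timestamp()
```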