I have dates in Python (pandas) written as "1/31/2010". To apply linear regression, I want three separate variables: day, month, and year.
This answers only your first question.
One solution is to extract attributes of pd.Timestamp objects using operator.attrgetter.
The benefit of this method is that you can easily extend or change the list of attributes you need. In addition, the logic is not specific to one object type: attrgetter works on any object that exposes the named attributes.
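from operator import attrgetter
import pandas as pd
df = pd.DataFrame({'date': ['1/21/2010', '5/5/2015', '4/30/2018']})
# Parse the month/day/year strings into pd.Timestamp values
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')
# Attributes to extract from each timestamp
attr_list = ['day', 'month', 'year']
attrs = attrgetter(*attr_list)
# Each row yields a (day, month, year) tuple; pd.Series expands it into columns
df[attr_list] = df['date'].apply(attrs).apply(pd.Series)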
print(df)
        date  day  month  year
0 2010-01-21   21      1  2010
1 2015-05-05    5      5  2015
2 2018-04-30   30      4  2018
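
To illustrate the point about extending the attribute list, here is a minimal sketch that reuses the same sample data and only changes attr_list. The extra attributes dayofweek and quarter are my own choice for illustration, not something the question asked for. The sketch also builds the new columns from a list of tuples and joins them back, which is an assumption about style rather than a required change.

```python
from operator import attrgetter

import pandas as pd

df = pd.DataFrame({'date': ['1/21/2010', '5/5/2015', '4/30/2018']})
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# Only this list changes when different attributes are needed.
# 'dayofweek' (Monday=0) and 'quarter' are illustrative additions.
attr_list = ['day', 'month', 'year', 'dayofweek', 'quarter']
attrs = attrgetter(*attr_list)

# Each timestamp yields one tuple; build a frame from the tuples
# and align it with the original rows through the shared index.
values = pd.DataFrame(df['date'].apply(attrs).tolist(),
                      index=df.index, columns=attr_list)
df = df.join(values)

print(df)
```

The resulting numeric columns can then be passed to a regression model as separate features.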