I have a csv file which I am trying to complete operations on. I have created a dataframe with one column titled \"start_date\" which has the date of warranty start. The pro
Not sure if this will help, but this is what I do when I'm working with Pandas on excel files and want the date format to be 'mm/dd/yyyy' or some other.
writer = pd.ExcelWriter(filename, engine='xlsxwriter', datetime_format='mm/dd/yyyy')
df.to_excel(writer, sheetname)
Maybe it'll work with: df.to_csv
You have a few options really. I'm not entirely sure what happens when you try to directly load the file with a 'pd.read_csv' but as suggested above you can define a set of format strings that you can try to use to parse the data.
One other option would be to read the date column in as a string and then parse it yourself. If you want the column to be like 'YYYY-MM-DD' then parse the string to have just that data and then save it back, something like.
import pandas as prandas
import datetime
df = prandas.read_csv('supa_kewl_data.dis_fmt_rox', dtype={'start_date': str})
print df.head()
# we are interested in start_date
date_strs = df['start_date'].values
#YYYY-MM-DD
#012345678910
filter_date_strs = [x[0:10] for x in date_strs]
df['filter_date_strs] = filter_date_strs
# sometimes i've gotten complained at by pandas for doing this
# try doing df.loc[:,'filter_date_strs'] = filter_date_strs
# if you get some warning thing
# if you want you can convert back to date time using a
dobjs = [datetime.datetime.strptime(x,'%Y-%m-%d') for x in filter_date_strs]
df['dobj_start_date'] = dobjs
df.to_csv('even_better_data.csv', index=False)
Hopefully this helps! Pandas documentation is sketchy sometimes, looking at the doc in 0.16.2 for read_csv() is intimidating... http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html The library itself is stellar!
Unfortunately you just have to try each format it might be. If you give an example format, strptime will attempt to parse it for you as discussed here.
The code will end up looking like:
import datetime
POSSIBLE_DATE_FORMATS = ['%m/%d/%Y', '%Y/%m/%d', etc...] # all the formats the date might be in
for date_format in POSSIBLE_DATE_FORMATS :
try:
parsed_date = datetime.strptime(raw_string_date, date_format) # try to get the date
break # if correct format, don't test any other formats
except ValueError:
pass # if incorrect format, keep trying other formats