How to deal with multiple date string formats in a Python Series

轮回少年 2021-01-05 17:41

I have a CSV file that I am trying to perform operations on. I have created a DataFrame with one column titled "start_date", which holds the warranty start date. The pro

3 answers
  •  孤街浪徒
    2021-01-05 17:56

    You have a few options. I'm not entirely sure what happens when you load the file directly with 'pd.read_csv', but as suggested above, you can define a set of format strings and try each one in turn when parsing the data.
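
    As a rough sketch of the "set of format strings" idea: the formats and sample values below are assumptions for illustration, since I don't know which formats actually appear in your file, and 'parse_with_formats' is just a hypothetical helper name. Each value is tried against every format in order, and anything that matches none of them becomes NaT instead of raising.

```python
from datetime import datetime

import pandas as pd

# Hypothetical candidate formats; replace these with the ones in your data.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def parse_with_formats(value, formats=CANDIDATE_FORMATS):
    """Return a Timestamp for the first format that matches, else NaT."""
    if not isinstance(value, str):
        # Missing values (None / NaN) cannot be parsed.
        return pd.NaT
    text = value.strip()
    for fmt in formats:
        try:
            return pd.Timestamp(datetime.strptime(text, fmt))
        except ValueError:
            # This format did not match; try the next one.
            continue
    return pd.NaT


# Made-up sample standing in for df['start_date'].
sample = pd.Series(["2015-07-01", "07/15/2015", "03.08.2015", "not a date", None])
parsed = pd.to_datetime(sample.map(parse_with_formats))
print(parsed)
```

    The order of the list matters when a string could match more than one format (for example day-first versus month-first), so put the format you trust most first.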

    One other option would be to read the date column in as a string and then parse it yourself. If you want the column to look like 'YYYY-MM-DD', slice each string down to just that part and save it back, something like this:

    import datetime

    import pandas as pd

    # read start_date as plain strings so pandas does not try to parse it
    df = pd.read_csv('supa_kewl_data.dis_fmt_rox', dtype={'start_date': str})

    print(df.head())
    # we are interested in start_date

    date_strs = df['start_date'].values
    # YYYY-MM-DD
    # 0123456789  (indices 0 through 9, hence the slice [0:10])
    filter_date_strs = [x[0:10] for x in date_strs]
    df['filter_date_strs'] = filter_date_strs

    # pandas has sometimes complained at me for the assignment above
    # (presumably a SettingWithCopyWarning); if you get a warning,
    # try df.loc[:, 'filter_date_strs'] = filter_date_strs instead

    # if you want, you can convert back to datetime objects using strptime
    dobjs = [datetime.datetime.strptime(x, '%Y-%m-%d') for x in filter_date_strs]
    df['dobj_start_date'] = dobjs

    df.to_csv('even_better_data.csv', index=False)
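
    The same slice-then-parse idea can also be written with pandas' vectorized string and datetime helpers. This is only a sketch: the in-memory CSV, the 'id' column, and the new column names are made up for illustration, and it assumes every non-empty value starts with a 'YYYY-MM-DD' prefix. Unlike the list comprehension, '.str[:10]' passes missing values through, and 'errors="coerce"' turns anything unparseable into NaT.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for the real file.
csv_text = (
    "id,start_date\n"
    "1,2015-07-01 00:00:00\n"
    "2,2015-07-15T12:30:00\n"
    "3,\n"
    "4,2015-08-03\n"
)

# Read the column as strings so pandas does not guess a format.
df = pd.read_csv(io.StringIO(csv_text), dtype={"start_date": str})

# Keep only the first 10 characters (the YYYY-MM-DD prefix); NaN stays NaN.
df["start_date_day"] = df["start_date"].str[:10]

# Parse with an explicit format; values that do not fit become NaT.
df["start_date_parsed"] = pd.to_datetime(
    df["start_date_day"], format="%Y-%m-%d", errors="coerce"
)

print(df)
```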
    

    Hopefully this helps! The pandas documentation can be sketchy at times, and the read_csv() page for 0.16.2 is intimidating: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html The library itself is stellar, though!
