How to deal with multiple date string formats in a Python series

轮回少年 2021-01-05 17:41

I have a CSV file that I am trying to run some operations on. I have created a DataFrame with one column titled "start_date", which holds the warranty start date. The pro

3 Answers
  • 2021-01-05 17:54

    Not sure if this will help, but this is what I do when I'm working with pandas on Excel files and want the dates written as 'mm/dd/yyyy' or in some other format:

    import pandas as pd

    # the context manager saves and closes the workbook on exit
    with pd.ExcelWriter(filename, engine='xlsxwriter', datetime_format='mm/dd/yyyy') as writer:
        df.to_excel(writer, sheet_name=sheetname)
    

    Something similar may work with df.to_csv, which takes a date_format argument (a strftime pattern such as '%m/%d/%Y').
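
    A minimal sketch of the CSV variant, assuming the column already holds real datetime values. The sample rows are made up for illustration; only the "start_date" column name comes from the question.

```python
import io

import pandas as pd

# Hypothetical sample data; "start_date" is the column name from the question.
df = pd.DataFrame({"start_date": pd.to_datetime(["2015-07-04", "2015-12-25"])})

# Write to an in-memory buffer so the result can be inspected;
# a file path works the same way.
buffer = io.StringIO()

# date_format takes a strftime-style pattern, not an Excel-style 'mm/dd/yyyy'.
df.to_csv(buffer, index=False, date_format="%m/%d/%Y")

csv_text = buffer.getvalue()
print(csv_text)
```

    Note that date_format only affects columns with a datetime dtype. A column that is still plain strings is written out unchanged.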

  • 2021-01-05 17:56

    You have a few options, really. I'm not entirely sure what happens when you load the file directly with pd.read_csv, but as suggested in another answer, you can define a set of format strings and try each of them when parsing the data.

    Another option is to read the date column in as a string and parse it yourself. If you want the column to look like 'YYYY-MM-DD', slice each string down to just that part and save it back, something like:

    import pandas as prandas
    import datetime
    
    df = prandas.read_csv('supa_kewl_data.dis_fmt_rox', dtype={'start_date': str})
    
    print(df.head())
    # we are interested in start_date
    
    date_strs = df['start_date'].values
    #YYYY-MM-DD
    #012345678910
    filter_date_strs = [x[0:10] for x in date_strs]
    df['filter_date_strs'] = filter_date_strs
    
    # pandas has sometimes complained at me for assigning like this;
    # if you get a warning, try df.loc[:, 'filter_date_strs'] = filter_date_strs instead
    
    # if you want, you can convert back to datetime objects using a list comprehension
    dobjs = [datetime.datetime.strptime(x,'%Y-%m-%d') for x in filter_date_strs]
    df['dobj_start_date'] = dobjs
    
    df.to_csv('even_better_data.csv', index=False)
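
    The same slicing and parsing can also be done on the whole column at once, without Python-level loops. This is only a sketch: the sample strings are invented, and it assumes every value starts with a 'YYYY-MM-DD' prefix.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV's start_date column.
df = pd.DataFrame(
    {"start_date": ["2015-07-04 00:00:00", "2015-12-25T13:30:00", "2016-01-01"]}
)

# Keep only the leading 'YYYY-MM-DD' part of each string.
df["filter_date_strs"] = df["start_date"].str[:10]

# Parse with an explicit format; errors="coerce" turns any value that
# does not match into NaT instead of raising.
df["dobj_start_date"] = pd.to_datetime(
    df["filter_date_strs"], format="%Y-%m-%d", errors="coerce"
)

print(df)
```

    Checking df["dobj_start_date"].isna() afterwards is a quick way to find rows whose prefix was not a valid date.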
    

    Hopefully this helps! The pandas documentation can be sketchy at times, and the 0.16.2 docs for read_csv() are intimidating: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html The library itself is stellar, though!

  • 2021-01-05 18:07

    Unfortunately, you just have to try each format the date might be in. Given a format string, strptime will attempt to parse the date for you, as discussed here.

    The code will end up looking like:

    from datetime import datetime
    
    # all the formats the date might be in (extend as needed)
    POSSIBLE_DATE_FORMATS = ['%m/%d/%Y', '%Y/%m/%d']
    
    parsed_date = None  # stays None if no format matches
    for date_format in POSSIBLE_DATE_FORMATS:
        try:
            parsed_date = datetime.strptime(raw_string_date, date_format)  # try to get the date
            break  # if correct format, don't test any other formats
        except ValueError:
            pass  # if incorrect format, keep trying other formats
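
    To run this over a whole pandas column rather than a single string, one possible approach is to wrap the loop in a function and apply it to the Series. The helper name parse_date, the third format, and the sample values are all assumptions made for illustration.

```python
from datetime import datetime

import pandas as pd

# Assumed formats for illustration; extend with whatever the data contains.
POSSIBLE_DATE_FORMATS = ["%m/%d/%Y", "%Y/%m/%d", "%Y-%m-%d"]


def parse_date(raw_string_date):
    """Return the first successful parse, or NaT if no format matches."""
    for date_format in POSSIBLE_DATE_FORMATS:
        try:
            return datetime.strptime(raw_string_date, date_format)
        except (ValueError, TypeError):
            continue  # wrong format (or not a string): try the next one
    return pd.NaT


# Hypothetical column with mixed formats and one unparseable value.
start_dates = pd.Series(["07/04/2015", "2015/12/25", "2016-01-01", "not a date"])

# to_datetime normalizes the result to a datetime64 column.
parsed = pd.to_datetime(start_dates.apply(parse_date))
print(parsed)
```

    The order of the formats matters: a string like '01/02/2015' is valid as both month/day and day/month, so whichever matching format comes first in the list wins.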
    