How do I parse an ISO 8601-formatted date?

小鲜肉 2020-11-21 06:08

I need to parse RFC 3339 strings like "2008-09-03T20:56:35.450686Z" into Python's datetime type.

I have found strptime in the Python sta

27 answers
  •  感情败类
    2020-11-21 06:36

    ISO 8601 allows many variations in which the colons and dashes are optional, basically CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. If you want to use strptime, you need to strip out those variations first.

    The goal is to generate a UTC datetime object.


    If you just want a basic case that works for UTC with the Z suffix, like 2016-06-29T19:36:29.3453Z:

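    # Python 3: str.translate needs a table from str.maketrans (translate(None, ':-') only works in Python 2)
    datetime.datetime.strptime(timestamp.translate(str.maketrans('', '', ':-')), "%Y%m%dT%H%M%S.%fZ")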
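
    The Z-only case can be sketched as a small function, assuming Python 3. The name parse_utc_z is made up for illustration and is not part of the original answer. strptime returns a naive datetime for this format, so the sketch attaches UTC explicitly.

```python
import datetime


def parse_utc_z(timestamp):
    """Parse a Z-suffixed timestamp with fractional seconds into an aware UTC datetime.

    Hypothetical helper for illustration only.
    """
    # Remove colons and dashes so only one layout has to be matched.
    compact = timestamp.translate(str.maketrans("", "", ":-"))
    naive = datetime.datetime.strptime(compact, "%Y%m%dT%H%M%S.%fZ")
    # The literal Z in the format carries no timezone info, so attach UTC by hand.
    return naive.replace(tzinfo=datetime.timezone.utc)


parsed = parse_utc_z("2016-06-29T19:36:29.3453Z")
print(parsed.isoformat())  # 2016-06-29T19:36:29.345300+00:00
```

    Note that the %f directive requires a fractional part, so a timestamp such as 2016-06-29T19:36:29Z would raise ValueError with this format.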
    


    If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00, use the following. It converts all variations into a form without variable delimiters, such as 20080903T205635.450686+0500, which is more consistent and easier to parse.

    import datetime
    import re
    # this regex removes all colons and all 
    # dashes EXCEPT for the dash indicating + or - utc offset for the timezone
    conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
    datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z" )
    


    If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z'), you need to apply the offset manually to get UTC. Note that %z may not work in Python versions before 3, where support depended on the underlying C library and therefore varied by system and Python build (e.g. Jython, Cython).

    import re
    import datetime
    
    # this regex removes all colons and all 
    # dashes EXCEPT for the dash indicating + or - utc offset for the timezone
    conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
    
    # split on the offset sign; the capture group keeps the sign in the result
    split_timestamp = re.split(r"([+-])", conformed_timestamp)
    # drop a trailing Z (if any) so it is not doubled when "Z" is appended below
    main_timestamp = split_timestamp[0].rstrip("Z")
    if len(split_timestamp) == 3:
        sign = split_timestamp[1]
        offset = split_timestamp[2]
    else:
        sign = None
        offset = None
    
    # parse the local time without the offset (the result is a naive datetime)
    output_datetime = datetime.datetime.strptime(main_timestamp + "Z", "%Y%m%dT%H%M%S.%fZ")
    if offset:
        # create timedelta based on offset
        offset_delta = datetime.timedelta(hours=int(sign + offset[:-2]), minutes=int(sign + offset[-2:]))
        # local time = UTC + offset, so subtract the offset to get UTC
        output_datetime = output_datetime - offset_delta
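
    The manual fallback can also be wrapped in one function. This is only a sketch under two assumptions: the split uses a capture group so the sign is kept, and the offset is subtracted, because local time equals UTC plus the offset. The name parse_to_utc_manually is hypothetical.

```python
import datetime
import re


def parse_to_utc_manually(timestamp):
    """Return a naive datetime in UTC without relying on %z (hypothetical helper)."""
    # Drop colons and date dashes, but keep the sign of a trailing offset.
    conformed = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", "", timestamp)
    # The capture group keeps the sign: [main, sign, offset] when an offset exists.
    parts = re.split(r"([+-])", conformed)
    main = parts[0].rstrip("Z")
    parsed = datetime.datetime.strptime(main, "%Y%m%dT%H%M%S.%f")
    if len(parts) == 3:
        sign, offset = parts[1], parts[2]
        delta = datetime.timedelta(
            hours=int(sign + offset[:2]), minutes=int(sign + offset[2:])
        )
        # Local time = UTC + offset, so UTC = local time - offset.
        parsed -= delta
    return parsed


print(parse_to_utc_manually("2008-09-03T20:56:35.450686+05:00"))  # 2008-09-03 15:56:35.450686
print(parse_to_utc_manually("2016-06-29T19:36:29.3453-0400"))     # 2016-06-29 23:36:29.345300
```

    The result is deliberately naive (no tzinfo), matching the answer's fallback; callers have to remember that it represents UTC.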
    
