I have a dataset whose date strings do not follow a single fixed format, such as:
Fri, 13 Apr 2018 13:13:12 +0000 (UTC)
Mon, 26 Mar 2018 06:32:59 +0100
Tue, 05 Dec 2017 11:03:
Using timestring:
import timestring
dt_1 = "Fri, 13 Apr 2018 13:13:12 +0000 (UTC)"
dt_2 = "Mon, 26 Mar 2018 06:32:59 +0100"
dt_3 = "Tue, 05 Dec 2017 11:03:34 GMT"
dt_4 = "08 Dec 2016 12:00:24"
print(timestring.Date(dt_1))
print(timestring.Date(dt_2))
print(timestring.Date(dt_3))
print(timestring.Date(dt_4))
EDIT:
While I was at it, here is another, cooler method:
Using dateutil.parser (imported below as dparser):
import dateutil.parser as dparser
dt_1 = "Fri, 13 Apr 2018 13:13:12 +0000 (UTC)"
dt_2 = "Mon, 26 Mar 2018 06:32:59 +0100"
dt_3 = "Tue, 05 Dec 2017 11:03:34 GMT"
dt_4 = "08 Dec 2016 12:00:24"
print(dparser.parse(dt_1,fuzzy=True))
print(dparser.parse(dt_2,fuzzy=True))
print(dparser.parse(dt_3,fuzzy=True))
print(dparser.parse(dt_4,fuzzy=True))
OUTPUT:
2018-04-13 13:13:12+00:00
2018-03-26 06:32:59+01:00
2017-12-05 11:03:34+00:00
2016-12-08 12:00:24
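Note that the last value in the output above is naive (it has no UTC offset), while the other three are timezone-aware, and Python refuses to compare or sort the two kinds together. If the dataset needs one uniform column, normalizing everything to UTC is one option. The sketch below assumes that strings without an offset are already in UTC; that is an assumption about the data, not something dateutil can infer, and `to_utc` is a hypothetical helper name.

```python
from datetime import timezone

import dateutil.parser as dparser


def to_utc(text):
    """Parse a date string and return a timezone-aware datetime in UTC."""
    dt = dparser.parse(text, fuzzy=True)
    if dt.tzinfo is None:
        # ASSUMPTION: strings without an offset are already in UTC.
        return dt.replace(tzinfo=timezone.utc)
    # Aware values are converted, so +0100 shifts back by one hour.
    return dt.astimezone(timezone.utc)


raw = [
    "Fri, 13 Apr 2018 13:13:12 +0000 (UTC)",
    "Mon, 26 Mar 2018 06:32:59 +0100",
    "Tue, 05 Dec 2017 11:03:34 GMT",
    "08 Dec 2016 12:00:24",
]

# Every value is now aware, so the list can be sorted without a TypeError.
for dt in sorted(to_utc(s) for s in raw):
    print(dt)
```

If the naive strings actually come from a known local zone, attach that zone instead of UTC before converting.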
EDIT 2:
Why is dparser cooler?
Invalid dates raise a ValueError:
invalid_dt = "Fri, 35 Apr 2018 13:13:12 +0000 (UTC)"
print(dparser.parse(invalid_dt,fuzzy=True))
OUTPUT:
ValueError: day is out of range for month
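Since a bad value raises instead of silently producing a wrong date, a whole dataset can be cleaned by catching the exception per row. This is a minimal sketch; `parse_or_none` is a hypothetical helper name, and returning `None` for bad rows is just one possible policy.

```python
import dateutil.parser as dparser


def parse_or_none(text):
    """Return the parsed datetime, or None if the string cannot be parsed."""
    try:
        return dparser.parse(text, fuzzy=True)
    except (ValueError, OverflowError):
        # ValueError: impossible or unrecognized date.
        # OverflowError: a number in the string is too large for the platform.
        return None


rows = [
    "Mon, 26 Mar 2018 06:32:59 +0100",
    "Fri, 35 Apr 2018 13:13:12 +0000 (UTC)",  # invalid: there is no 35 April
]

parsed = [parse_or_none(r) for r in rows]
# Collect the raw strings that failed so they can be inspected later.
bad = [r for r, dt in zip(rows, parsed) if dt is None]
print(bad)
```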
EDIT 3:
To get the day, month, year, hour, minute or second:
print(dparser.parse(dt_1,fuzzy=True).day) # 13
print(dparser.parse(dt_2,fuzzy=True).month) # 3
print(dparser.parse(dt_3,fuzzy=True).year) # 2017
print(dparser.parse(dt_4,fuzzy=True).hour) # 12
print(dparser.parse(dt_4,fuzzy=True).minute) # 0
print(dparser.parse(dt_4,fuzzy=True).second) # 24
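The lines above re-parse the string for every attribute. When several fields of the same date are needed, it is probably simpler to parse once and keep the resulting `datetime` object, as in this small sketch (the variable name `dt` is only illustrative):

```python
import dateutil.parser as dparser

# Parse once, then read as many attributes as needed.
dt = dparser.parse("08 Dec 2016 12:00:24", fuzzy=True)

print(dt.day, dt.month, dt.year)      # date components
print(dt.hour, dt.minute, dt.second)  # time components
print(dt.date())                      # only the date part, as a datetime.date
print(dt.time())                      # only the time part, as a datetime.time
```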
EDIT 4:
If you want to get the name of the day, call strftime directly on the parsed datetime (no extra datetime import is needed):
print(dparser.parse(dt_1,fuzzy=True).strftime("%a")) # Fri