Converting a string to np.array with datetime64, NOT using Pandas

心不动则不痛 submitted on 2020-01-02 09:37:51


I'm looking for a way to convert dates given in the format YYYYmmdd to an np.array with dtype='datetime64'. The dates are stored in another np.array but with dtype='float64'.

I am looking for a way to achieve this by avoiding Pandas!

I already tried something similar to what is suggested in this answer, but the author states that "[...] if (the date format) was in ISO 8601 you could parse it directly using numpy, [...]".

As the date format in my case is YYYYmmdd, which IS(?) ISO 8601, it should somehow be possible to parse it directly using numpy. But I don't know how, as I am a total beginner in Python and coding in general.

I really want to avoid Pandas because I don't want to bloat my script when the task can be done with the modules I am already using. I also read that it would decrease the speed here.


If no one else comes up with something more built-in, here is a pedestrian method:

>>> dates
array([19700101., 19700102., 19700103., 19700104., 19700105., 19700106.,
       19700107., 19700108., 19700109., 19700110., 19700111., 19700112.,
       19700113., 19700114.])
>>> y, m, d = dates.astype(int) // np.c_[[10000, 100, 1]] % np.c_[[10000, 100, 100]]
>>> y.astype('U4').astype('M8') + (m-1).astype('m8[M]') + (d-1).astype('m8[D]')
array(['1970-01-01', '1970-01-02', '1970-01-03', '1970-01-04',
       '1970-01-05', '1970-01-06', '1970-01-07', '1970-01-08',
       '1970-01-09', '1970-01-10', '1970-01-11', '1970-01-12',
       '1970-01-13', '1970-01-14'], dtype='datetime64[D]')
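The one-liner above packs three steps into one expression: split each number into year, month and day with integer arithmetic, then add month and day offsets to a year-resolution date. A more verbose sketch of the same idea follows. The sample values and the variable names `n` and `result` are illustrative and not part of the original answer, and the year is built from an integer offset to 1970 rather than from a string.

```python
import numpy as np

# Illustrative sample data in the same float YYYYmmdd layout as the question
dates = np.array([19700101., 19840404., 20200229.])

n = dates.astype(np.int64)
y = n // 10000        # year:  1970, 1984, 2020
m = n // 100 % 100    # month: 1, 4, 2
d = n % 100           # day:   1, 4, 29

# Integers cast to datetime64[Y] are read as years since 1970
year_start = (y - 1970).astype('datetime64[Y]')

# Step down in resolution: year -> month -> day, adding the offsets on the way
month_start = year_start.astype('datetime64[M]') + (m - 1).astype('timedelta64[M]')
result = month_start.astype('datetime64[D]') + (d - 1).astype('timedelta64[D]')

print(result)
```

Casting explicitly between the `Y`, `M` and `D` units keeps each addition within a single unit, which makes the intermediate values easy to inspect.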


You can go via the python datetime module.

from datetime import datetime
import numpy as np

datestrings = np.array(["18930201", "19840404"])
dtarray = np.array([datetime.strptime(d, "%Y%m%d") for d in datestrings], dtype="datetime64[D]")

# out: ['1893-02-01' '1984-04-04'] datetime64[D]
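If you would rather skip the datetime module as well, a string-based route is possible. This is a sketch under the assumption that NumPy's string parser wants the hyphenated ISO 8601 form YYYY-MM-DD rather than the compact YYYYmmdd, so the hyphens are inserted first. The names `compact` and `iso` are illustrative, and the input is the float array described in the question.

```python
import numpy as np

# Float input as described in the question
dates = np.array([18930201., 19840404.])

# float -> int -> 8-character string, e.g. '18930201'
compact = dates.astype(np.int64).astype('U8')

# Insert hyphens to get the extended ISO 8601 form 'YYYY-MM-DD'
iso = np.array([f"{s[:4]}-{s[4:6]}-{s[6:]}" for s in compact])

# NumPy parses the hyphenated strings directly
dtarray = iso.astype('datetime64[D]')

print(dtarray)
```

The list comprehension still loops in Python, so for very large arrays the integer-arithmetic approach is likely to be faster.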

Since the real question seems to be how to get the given strings into the matplotlib datetime format, you can use matplotlib.dates.datestr2num:

import numpy as np
from matplotlib import dates as mdates

datestrings = np.array(["18930201", "19840404"])
mpldates = mdates.datestr2num(datestrings)

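# out: [691071. 724370.]
# (days since 0001-01-01, the Matplotlib epoch at the time of writing;
#  Matplotlib >= 3.3 defaults to a 1970-01-01 epoch, giving [-28092. 5207.])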

