Python numpy: cannot convert datetime64[ns] to datetime64[D] (to use with Numba)

面向向阳花 2020-12-03 01:22

I want to pass a datetime array to a Numba function (which cannot be vectorised and would otherwise be very slow). I understand Numba supports numpy.datetime64. However, it

2 Answers
  • 2020-12-03 01:38

    I ran into the same error when calculating the number of business days between two dates:

    import numpy as np
    import pandas as pd
    from pandas.tseries.offsets import MonthBegin
    
    # Calculate the beginning of the month from a given date
    df['Month_Begin'] = pd.to_datetime(df['MyDateColumn'])+ MonthBegin(-1)
    
    # Calculate # of Business Days
    # Convert dates to string to prevent type error [D]
    df['TS_Period_End_Date'] = df['TS_Period_End_Date'].dt.strftime('%Y-%m-%d')
    df['Month_Begin'] = df['Month_Begin'].dt.strftime('%Y-%m-%d')
    
    df['Biz_Days'] = np.busday_count(df['Month_Begin'], df['MyDateColumn']) #<-- Error if not converted into strings.
    

    My workaround was to convert the dates to strings with .dt.strftime('%Y-%m-%d'). It worked in my particular case.
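
As an alternative to the string round-trip, np.busday_count also accepts datetime64[D] arrays directly. The following is a minimal sketch using NumPy only. The sample dates are made up, and the plain array stands in for something like df['MyDateColumn'].values, assuming that column holds datetime64[ns] data. Truncating to the month is also an assumption: it differs from MonthBegin(-1) for dates that already fall on the first of a month.

```python
import numpy as np

# Hypothetical sample data standing in for df['MyDateColumn'].values
dates_ns = np.array(["2020-12-03", "2020-11-16"], dtype="datetime64[ns]")

# busday_count wants day resolution, so cast explicitly
days = dates_ns.astype("datetime64[D]")

# Truncate to the month, then go back to day resolution (first of the month)
month_begin = days.astype("datetime64[M]").astype("datetime64[D]")

# Business days in the half-open interval [month_begin, days)
biz_days = np.busday_count(month_begin, days)
print(biz_days.tolist())  # [2, 10]
```

No string conversion is involved, so the date columns keep their datetime dtype for later computations.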

  • 2020-12-03 01:43

    Series.astype converts all date-like objects to datetime64[ns]. To convert to datetime64[D], use values to obtain a NumPy array before calling astype:

    dates_input = df["month_15"].values.astype('datetime64[D]')
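
For a self-contained check of that conversion, here is a small sketch. The DataFrame contents are invented for illustration; only the column name month_15 comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; only the column name is taken from the thread
df = pd.DataFrame({"month_15": pd.to_datetime(["2010-01-15", "2011-01-15"])})

# .values returns a plain NumPy array, so astype is free to change the unit
dates_input = df["month_15"].values.astype("datetime64[D]")

print(dates_input.dtype)  # datetime64[D]
```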
    

    Note that, in pandas versions before 2.0, NDFrames (such as Series and DataFrames) can only hold datetime-like objects as objects of dtype datetime64[ns]. The automatic conversion of all datetime-likes to a common dtype simplifies subsequent date computations, but it makes it impossible to store, say, datetime64[s] objects in a DataFrame column. Pandas core developer Jeff Reback explains:

    "We don't allow direct conversions because its simply too complicated to keep anything other than datetime64[ns] internally (nor necessary at all)."


    Also note that even though df['month_15'].astype('datetime64[D]') has dtype datetime64[ns]:

    In [29]: df['month_15'].astype('datetime64[D]').dtype
    Out[29]: dtype('<M8[ns]')
    

    when you iterate through the items in the Series, you get pandas Timestamps, not datetime64[ns] values:

    In [28]: df['month_15'].astype('datetime64[D]').tolist()
    Out[28]: [Timestamp('2010-01-15 00:00:00'), Timestamp('2011-01-15 00:00:00')]
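
The difference in element types can be checked directly. This is a minimal sketch with made-up dates: it compares what comes out of the Series itself with what comes out of its underlying NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical sample Series of dates
s = pd.Series(pd.to_datetime(["2010-01-15", "2011-01-15"]))

first_from_series = s.tolist()[0]  # element taken from the Series
first_from_array = s.values[0]     # element taken from the NumPy array

print(type(first_from_series).__name__)  # Timestamp
print(type(first_from_array).__name__)   # datetime64
```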
    

    Therefore, it is not clear that Numba actually has a problem with datetime64[ns]; it might just have a problem with Timestamps. Sorry, I can't check this -- I don't have Numba installed.

    However, it might be useful for you to try

    testf(df['month_15'].astype('datetime64[D]').values)
    

    since df['month_15'].astype('datetime64[D]').values is truly a NumPy array of dtype datetime64[ns]:

    In [31]: df['month_15'].astype('datetime64[D]').values.dtype
    Out[31]: dtype('<M8[ns]')
    

    If that works, then you don't have to convert everything to datetime64[D], you just have to pass NumPy arrays -- not Pandas Series -- to testf.
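
To try this out, a sketch along the following lines may help. The function count_on_or_after is a hypothetical stand-in for testf, the dates are invented, and the behaviour under Numba has not been verified against any particular Numba version. If Numba is not installed, the snippet falls back to running the same loop as plain Python.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no compilation then)
    def njit(func):
        return func


@njit
def count_on_or_after(dates, cutoff):
    # Explicit loop, the kind of code that is slow without compilation
    n = 0
    for i in range(dates.shape[0]):
        if dates[i] >= cutoff:
            n += 1
    return n


# Hypothetical data standing in for df['month_15'].values
dates_ns = np.array(["2010-01-15", "2011-01-15"], dtype="datetime64[ns]")
dates_d = dates_ns.astype("datetime64[D]")
cutoff = np.datetime64("2010-06-01", "D")

# Pass the NumPy array (not a pandas Series) to the function
result = count_on_or_after(dates_d, cutoff)
print(result)  # 1
```

The array and the cutoff use the same unit here, which sidesteps any question of mixed-unit comparisons inside the compiled function.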
