Pandas DatetimeIndex indexing dtype: datetime64 vs Timestamp

Submitted by 我的未来我决定 on 2019-12-12 20:07:51

Question


Indexing a pandas DatetimeIndex (with dtype numpy datetime64[ns]) returns either:

  • another DatetimeIndex for multiple indices
  • a pandas Timestamp for a single index

The confusing part is that a Timestamp does not seem to be treated as equal to an np.datetime64, so that:

import numpy as np
import pandas as pd

a_datetimeindex = pd.date_range('1/1/2016', '1/2/2016', freq='D')
print(np.in1d(a_datetimeindex[0], a_datetimeindex))

This returns False. But:

print(np.in1d(a_datetimeindex[0:1], a_datetimeindex))
print(np.in1d(np.datetime64(a_datetimeindex[0]), a_datetimeindex))

Both of these return the expected results.

I guess that is because np.datetime64[ns] has nanosecond precision, but the Timestamp is truncated?

My question is, is there a way to create the DatetimeIndex so that it always indexes to the same (or comparable) data type?


Answer 1:


You are using numpy functions to manipulate pandas types. They are not always compatible.

The function np.in1d first converts both of its arguments to ndarrays. A DatetimeIndex has a built-in conversion, so an array of dtype np.datetime64 is returned (it is DatetimeIndex.values). A Timestamp has no such facility, so it is not converted to np.datetime64.
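The difference between the two conversions can be inspected directly. The following is a minimal sketch, not part of the original answer; the variable names idx, as_array and scalar are only illustrative, and Timestamp.to_datetime64 is used as one way to get a NumPy scalar by hand.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2016-01-01', '2016-01-02', freq='D')

# Converting the whole index yields a NumPy array with a datetime64 dtype.
as_array = np.asarray(idx)
print(as_array.dtype.kind)  # 'M' is NumPy's kind code for datetime64

# Indexing with a single position yields a pandas Timestamp instead.
scalar = idx[0]
print(type(scalar).__name__)

# An explicit conversion of the scalar to a NumPy datetime64 is available.
print(type(scalar.to_datetime64()).__name__)
```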

Instead, you can use, for example, the Python keyword in (the most natural way):

a_datetimeindex[0] in a_datetimeindex

or the Index.isin method for a collection of elements:

a_datetimeindex.isin(a_list_or_index)
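As a small illustration of both pandas-native options, here is a sketch that assumes the same two-day index as in the question; the names idx and first are illustrative only.

```python
import pandas as pd

idx = pd.date_range('2016-01-01', '2016-01-02', freq='D')
first = idx[0]  # a single pandas Timestamp

# Membership test for one element, using the Python keyword `in`.
print(first in idx)

# Element-wise membership for a collection, using Index.isin.
mask = idx.isin([first])
print(mask)
```

Both calls stay inside pandas, so no conversion between Timestamp and np.datetime64 is involved.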

If you want to use np.in1d, explicitly convert both arguments to numpy types. Or call it on the underlying numpy arrays:

np.in1d(a_datetimeindex.values[0], a_datetimeindex.values)
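The explicit conversion can also be written out step by step. This sketch uses np.isin, the newer counterpart of np.in1d (np.in1d is deprecated as of NumPy 2.0), which is a substitution and not what the original answer used; idx, scalar_np and the other names are hypothetical.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2016-01-01', '2016-01-02', freq='D')

# Convert the single Timestamp to a NumPy datetime64 scalar first.
scalar_np = idx[0].to_datetime64()

# Both arguments are now NumPy datetime64 data.
found = np.isin(scalar_np, idx.values)
print(bool(found))

# Keeping the element as a length-1 array also avoids the Timestamp type.
found_arr = np.isin(idx[[0]].values, idx.values)
print(found_arr)
```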

Alternatively, it's probably safe to use np.in1d with two collections of the same type:

np.in1d(a_datetimeindex, another_datetimeindex)

or even

np.in1d(a_datetimeindex[[0]], a_datetimeindex)


Source: https://stackoverflow.com/questions/38676418/pandas-datetimeindex-indexing-dtype-datetime64-vs-timestamp
