Question
Indexing a pandas DatetimeIndex (with dtype numpy datetime64[ns]) returns either:
- another DatetimeIndex when given multiple indices
- a pandas Timestamp when given a single index
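The two cases in this list can be checked directly. This is a minimal sketch; the names idx, single and several are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Two-day index, built the same way as in the question
idx = pd.date_range('1/1/2016', '1/2/2016', freq='D')

single = idx[0]      # scalar position -> pandas Timestamp
several = idx[0:1]   # slice -> DatetimeIndex, even with one element

print(type(single))
print(type(several))
print(idx.dtype)     # a numpy datetime64 dtype
```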
The confusing part is that Timestamps do not equal np.datetime64, so that:
import numpy as np
import pandas as pd
a_datetimeindex = pd.date_range('1/1/2016', '1/2/2016', freq='D')
print(np.in1d(a_datetimeindex[0], a_datetimeindex))
This returns False. But:
print(np.in1d(a_datetimeindex[0:1], a_datetimeindex))
print(np.in1d(np.datetime64(a_datetimeindex[0]), a_datetimeindex))
These return the expected results.
I guess that is because datetime64[ns] is accurate to the nanosecond, while the Timestamp is truncated?
My question is: is there a way to create the DatetimeIndex so that indexing it always yields the same (or a comparable) data type?
Answer 1:
You are using numpy functions to manipulate pandas types. They are not always compatible.
The function np.in1d first converts both of its arguments to ndarrays. A DatetimeIndex has a built-in conversion, so an array of dtype np.datetime64 is returned (it is DatetimeIndex.values). A Timestamp has no such facility, so it is not converted.
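The conversion described here can be inspected with np.asarray, which is roughly what numpy functions apply to their inputs. This is only a sketch: idx, ts and as_array are illustrative names, and the dtype printed for the Timestamp case may depend on the installed pandas and numpy versions, so no output is claimed for it.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('1/1/2016', '1/2/2016', freq='D')
ts = idx[0]  # a pandas Timestamp

# A DatetimeIndex converts to a plain ndarray of datetime64 values
as_array = np.asarray(idx)
print(as_array.dtype)
print(np.array_equal(as_array, idx.values))

# Per the answer, a Timestamp is not converted the same way;
# print the dtype to see what numpy makes of it on this installation
print(np.asarray(ts).dtype)
```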
Instead, you can use, for example, the Python keyword in (the most natural way):
a_datetimeindex[0] in a_datetimeindex
or the Index.isin method for a collection of elements:
a_datetimeindex.isin(a_list_or_index)
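Both pandas-native checks can be tried on the index from the question. This is a small sketch; the candidates list, including the 2016-01-05 date, is a hypothetical input chosen for illustration.

```python
import pandas as pd

idx = pd.date_range('1/1/2016', '1/2/2016', freq='D')

# Scalar membership: the `in` keyword works with a Timestamp
first_in_index = idx[0] in idx
print(first_in_index)

# Element-wise membership: isin returns one boolean per element of idx
candidates = [idx[0], pd.Timestamp('2016-01-05')]  # hypothetical values
mask = idx.isin(candidates)
print(mask)
```

Note that isin answers the question for every element of the index, whereas in answers it for a single value.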
If you want to use np.in1d, explicitly convert both arguments to numpy types, or call it on the underlying numpy arrays:
np.in1d(a_datetimeindex.values[0], a_datetimeindex.values)
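The explicit conversion might look like the sketch below. It uses Timestamp.to_datetime64() for the scalar, and np.isin rather than np.in1d, on the assumption that a recent NumPy is installed (newer releases deprecate np.in1d in favour of np.isin). The names ts64 and found are illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('1/1/2016', '1/2/2016', freq='D')
ts = idx[0]

# Convert the Timestamp to a numpy scalar before handing it to numpy
ts64 = ts.to_datetime64()
found = np.isin(ts64, idx.values)
print(type(ts64), bool(found))

# Indexing the underlying array yields a numpy scalar directly
found_from_values = np.isin(idx.values[0], idx.values)
print(bool(found_from_values))
```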
Alternatively, it's probably safe to use np.in1d with two collections of the same type:
np.in1d(a_datetimeindex, another_datetimeindex)
or even
np.in1d(a_datetimeindex[[0]], a_datetimeindex)
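The list-indexer form works because it keeps the result a DatetimeIndex. A short sketch, again using np.isin as a stand-in for np.in1d (an assumption about the installed NumPy version) and an illustrative name first_as_index:

```python
import numpy as np
import pandas as pd

idx = pd.date_range('1/1/2016', '1/2/2016', freq='D')

# A list indexer keeps the result a DatetimeIndex rather than a Timestamp
first_as_index = idx[[0]]
print(type(first_as_index))

# Both arguments now convert to datetime64 arrays, so the comparison is like-for-like
mask = np.isin(first_as_index, idx)
print(mask)
```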
Source: https://stackoverflow.com/questions/38676418/pandas-datetimeindex-indexing-dtype-datetime64-vs-timestamp