Pandas Rolling Window - datetime64[ns] are not implemented

会有一股神秘感。 submitted on 2021-02-06 10:13:40

Question


I'm attempting to use Python/Pandas to build some charts. I have data that is sampled every second. Here is a sample:

Index, Time, Value

31362, 1975-05-07 07:59:18,  36.151612
31363, 1975-05-07 07:59:19,  36.181368
31364, 1975-05-07 07:59:20,  36.197195
31365, 1975-05-07 07:59:21,  36.151413
31366, 1975-05-07 07:59:22,  36.138009
31367, 1975-05-07 07:59:23,  36.142962
31368, 1975-05-07 07:59:24,  36.122680

I need to create a variety of windows to look at the data: 10, 100, 1000, etc. Unfortunately, when I attempt to window the entire data frame, I get the error below:

NotImplementedError: ops for Rolling for this dtype datetime64[ns] are not implemented

I checked out these docs: http://pandas.pydata.org/pandas-docs/stable/computation.html as a reference, and they appear to be doing this on date ranges. I did notice that the data type between what they have and what I have is different.

Is there an easy way to do this?

This is ideally what I'm trying to do:

tmp = data.rolling(window=2)
tmp.mean()

I'm using plotly to plot the raw data and then the windowed data on top of it. My goal is to find ideal windows for identifying cleaner trends in the data by removing some of the noise.

Thanks!

Additional Notes:

I think I need to take my data from this format:

pandas.core.series.Series to this one:

pandas.tseries.index.DatetimeIndex


Answer 1:


Setup

from io import StringIO  # Python 3 location; Python 2 used "from StringIO import StringIO"
import pandas as pd

text = """Index,Time,Value
31362,1975-05-07 07:59:18,36.151612
31363,1975-05-07 07:59:19,36.181368
31364,1975-05-07 07:59:20,36.197195
31365,1975-05-07 07:59:21,36.151413
31366,1975-05-07 07:59:22,36.138009
31367,1975-05-07 07:59:23,36.142962
31368,1975-05-07 07:59:24,36.122680"""

df = pd.read_csv(StringIO(text), index_col=0, parse_dates=[1])

df.rolling(2).mean()
NotImplementedError: ops for Rolling for this dtype datetime64[ns] are not implemented

First off, this is confirmation of @BrenBarn's comment and he should get the credit if he decides to post an answer. BrenBarn, if you decide to answer, I'll delete this post.

Explanation

Pandas has no idea what a rolling mean of date values ought to be. df.rolling(2).mean() is attempting to roll and average over both the Time and Value columns. The error is politely (or impolitely, depending on your perspective) telling you that you're trying something nonsensical.
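To see which column is the culprit, it may help to inspect the dtypes and then roll only the numeric column. This is a minimal sketch on the same sample data, written for Python 3; the name value_mean is just illustrative and not part of the original post.

```python
from io import StringIO
import pandas as pd

text = """Index,Time,Value
31362,1975-05-07 07:59:18,36.151612
31363,1975-05-07 07:59:19,36.181368
31364,1975-05-07 07:59:20,36.197195
31365,1975-05-07 07:59:21,36.151413
31366,1975-05-07 07:59:22,36.138009
31367,1975-05-07 07:59:23,36.142962
31368,1975-05-07 07:59:24,36.122680"""

df = pd.read_csv(StringIO(text), index_col=0, parse_dates=["Time"])

# Time is a datetime column and Value is float64; only Value has a meaningful mean.
print(df.dtypes)

# Rolling over just the numeric column avoids touching the datetime column at all.
value_mean = df["Value"].rolling(2).mean()
print(value_mean)
```

The first entry of value_mean is NaN because a window of 2 rows is not yet full at the first row.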

Solution

Move the Time column to the index and then... well that's it.

df.set_index('Time').rolling(2).mean()
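Since the question asks about trying several window sizes, here is a hedged sketch that builds on the set_index approach. The window sizes (2 and 3), the column names mean_2 and mean_3, and the "3s" time-based window are assumptions chosen to fit the seven-row sample; with real data you would substitute 10, 100, 1000 and so on.

```python
from io import StringIO
import pandas as pd

text = """Index,Time,Value
31362,1975-05-07 07:59:18,36.151612
31363,1975-05-07 07:59:19,36.181368
31364,1975-05-07 07:59:20,36.197195
31365,1975-05-07 07:59:21,36.151413
31366,1975-05-07 07:59:22,36.138009
31367,1975-05-07 07:59:23,36.142962
31368,1975-05-07 07:59:24,36.122680"""

df = pd.read_csv(StringIO(text), index_col=0, parse_dates=["Time"])
indexed = df.set_index("Time")

# One smoothed column per row-count window (illustrative sizes).
smoothed = pd.DataFrame(
    {f"mean_{w}": indexed["Value"].rolling(window=w).mean() for w in (2, 3)}
)
print(smoothed)

# With a DatetimeIndex, the window can also be a time span instead of a row count.
time_based = indexed["Value"].rolling("3s").mean()
print(time_based)
```

The time-based form needs the DatetimeIndex to be sorted (monotonic). It also defaults to min_periods=1, so it yields a value from the first row onward, whereas an integer window of size w leaves the first w - 1 rows as NaN.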



Source: https://stackoverflow.com/questions/38415314/pandas-rolling-window-datetime64ns-are-not-implemented
