Pandas Interpolate returning ValueErrors for some methods and some sizes of dataframes

Question

I am having some issues with interpolation of a Pandas dataframe.

Basically, I have a dataframe of 295339 rows and have artificially generated NaNs in it to study different sampling rates and completion methods.

The issue is that some combinations of sampling rate and completion method work fine, while others fail with the following error message:

ValueError: The number of derivatives at boundaries does not match: expected 1, got 0+0

The type of ValueError depends on the combination of sampling rate and completion method I'm using.

For example, if I make one NaN per hour per customer and then interpolate using either the linear or the cubic method, it works. But if I sample once every four hours per customer, it works for the linear method but not for the cubic method (code for the interpolation below):

latitude = my_frame.filter(['Customer_id', 'Lat'], axis=1)
latitude = latitude.groupby('Customer_id').apply(lambda group: group.interpolate(method='cubic'))

The weird thing is that during my tests I limited my approach to 3 customers (representing 8500 rows) for speed, and no errors were raised.

So, my question is: why does this happen, and is there a workaround?

Answer 1:

I found the cause: customers with few records could not be interpolated with the cubic method because they did not have at least 4 known points.
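One possible workaround, sketched here under a few assumptions, is to count the known values per customer and only use the cubic method when there are at least 4 of them. The helper names `interpolate_group` and `interpolate_per_customer` are made up for illustration, and the fallback to linear interpolation (or leaving the group untouched when fewer than 2 values are known) is my own choice, not something from the original post. The cubic method needs SciPy installed and a numeric or datetime index.

```python
import numpy as np
import pandas as pd

# The answer above found that cubic interpolation needs at least 4 known points.
MIN_POINTS_CUBIC = 4


def interpolate_group(group: pd.DataFrame, column: str = "Lat") -> pd.DataFrame:
    """Interpolate one customer's column (hypothetical helper).

    Uses the cubic method when enough values are known, falls back to
    linear with 2 or 3 known values, and leaves the group unchanged otherwise.
    """
    known = group[column].notna().sum()  # number of non-NaN values
    out = group.copy()
    if known >= MIN_POINTS_CUBIC:
        out[column] = out[column].interpolate(method="cubic")  # requires SciPy
    elif known >= 2:
        out[column] = out[column].interpolate(method="linear")
    return out


def interpolate_per_customer(
    frame: pd.DataFrame, key: str = "Customer_id", column: str = "Lat"
) -> pd.DataFrame:
    """Apply interpolate_group to every customer and restore the row order."""
    parts = [interpolate_group(g, column) for _, g in frame.groupby(key, sort=False)]
    return pd.concat(parts).sort_index()
```

With the frame from the question, this would be called as `latitude = interpolate_per_customer(latitude)`. To see which customers are affected before interpolating, `latitude.groupby('Customer_id')['Lat'].count()` gives the number of known values per customer, since `count()` skips NaN.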

Source: https://stackoverflow.com/questions/57412489/pandas-interpolate-returning-valueerrors-for-some-methods-and-some-sizes-of-data

Tags

python

pandas

interpolation
