Pandas reindex and fill missing values: “Index must be monotonic”

Submitted by 末鹿安然 on 2019-11-28 01:37:58

Question


While answering this Stack Overflow question, I found some interesting behavior when using a fill method while reindexing a dataframe.

This old bug report in pandas says that df.reindex(newIndex, method='ffill') should be equivalent to df.reindex(newIndex).ffill(), but that is NOT the behavior I'm seeing.

Here's a code snippet that illustrates the behavior:

import pandas as pd

# The index of df is deliberately not sorted.
df = pd.DataFrame({'values': 2}, index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']))
newIndex = pd.DatetimeIndex(['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05'])
print(df.reindex(newIndex).ffill())
print(df.reindex(newIndex, method='ffill'))

The first print statement works as expected. The second raises a

ValueError: index must be monotonic increasing or decreasing

What's going on here?


EDIT: Note that the sample df intentionally has a non-monotonic index. The question is about the order of operations in df.reindex(newIndex, method='ffill'). My expectation, as the bug report says, is that it should first reindex with the new index and then fill.

As you can see, newIndex.is_monotonic is True, and the fill works when called separately but fails when passed as a parameter to reindex.


Answer 1:


Some part of reindex requires the index to be sorted when method is passed. I'm deducing that reindex does not presort the index in that case and therefore fails. I'm drawing this conclusion from the fact that this works:

print(df.sort_index().reindex(newIndex.sort_values(), method='ffill'))
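
A minimal sketch of what seems to matter here, assuming a reasonably recent pandas: the existing index of the frame has to be sorted when method is passed, while a plain reindex followed by ffill has no such requirement. The values 10, 20, 30 and the names new_index, two_step, one_step and raised are illustrative and not from the question.

```python
import pandas as pd

# Illustrative data: distinct values (instead of the question's constant 2)
# so the effect of the forward fill is visible.
df = pd.DataFrame(
    {"values": [10, 20, 30]},
    index=pd.DatetimeIndex(["2016-06-02", "2016-05-04", "2016-06-03"]),
)
new_index = pd.DatetimeIndex(
    ["2016-05-04", "2016-06-01", "2016-06-02", "2016-06-03", "2016-06-05"]
)

# Option 1: reindex first, then forward-fill the gaps in a second step.
two_step = df.reindex(new_index).ffill()

# Option 2: sort the existing index, then let reindex do the fill.
one_step = df.sort_index().reindex(new_index, method="ffill")

# The unsorted frame combined with method= is the failing case from the question.
try:
    df.reindex(new_index, method="ffill")
    raised = False
except ValueError:
    raised = True

print(two_step)
print(one_step)
print("unsorted frame with method= raised ValueError:", raised)
```

The two options are not interchangeable in general: according to the pandas documentation, method= only fills the rows introduced by the new index, whereas .ffill() also fills NaN values that were already present in the data.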



Answer 2:


It seems that the sorting needs to be done on the columns as well.

In[76]: frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'],columns=['Ohio', 'Texas', 'California'])

In[77]: frame.reindex(index=['a','b','c','d'],method='ffill',columns=states)
---> ValueError: index must be monotonic increasing or decreasing

In[78]: frame.reindex(index=['a','b','c','d'],method='ffill',columns=states.sort())

Out[78]:
   Ohio  Texas  California
a     0      1           2
b     0      1           2
c     3      4           5
d     6      7           8
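
One possible reading of the session above, offered as a sketch rather than a confirmed explanation: if states is a plain Python list (an assumption, since the session never defines it), states.sort() sorts in place and returns None, so In[78] effectively passes columns=None and leaves the columns untouched, which matches Out[78]. The contents of states below are hypothetical, as are the names rows, sort_result, same_columns, raised and filled.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(9).reshape((3, 3)),
    index=["a", "c", "d"],
    columns=["Ohio", "Texas", "California"],
)
rows = ["a", "b", "c", "d"]
states = ["Texas", "Utah", "California"]  # hypothetical; not defined in the session

# list.sort() sorts in place and returns None ...
sort_result = list(states).sort()

# ... so this call only reindexes the rows; the columns stay as they were.
same_columns = frame.reindex(index=rows, method="ffill", columns=sort_result)

# Passing real column labels together with method= hits the same error,
# because the frame's existing column labels are not in sorted order.
try:
    frame.reindex(index=rows, columns=states, method="ffill")
    raised = False
except ValueError:
    raised = True

# One way around it: fill along the (sorted) row index only,
# then select the columns in a separate, fill-free reindex.
filled = frame.reindex(index=rows, method="ffill").reindex(columns=states)

print(same_columns)
print("columns with method= raised ValueError:", raised)
print(filled)
```

Splitting the call keeps the forward fill on the row axis, where the existing labels are already sorted, and avoids applying a fill method to the unsorted column labels at all.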



Source: https://stackoverflow.com/questions/37982170/pandas-reindex-and-fill-missing-values-index-must-be-monotonic
