In answering this Stack Overflow question, I found some interesting behavior when using a fill method while reindexing a DataFrame.
This old bug report in pandas says that df.reindex(newIndex, method='ffill') should be equivalent to df.reindex(newIndex).ffill(), but that is NOT the behavior I'm witnessing.
Here's a code snippet that illustrates the behavior:
import pandas as pd

df = pd.DataFrame({'values': 2}, index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']))
newIndex = pd.DatetimeIndex(['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05'])
print(df.reindex(newIndex).ffill())
print(df.reindex(newIndex, method='ffill'))
The first print statement works as expected. The second raises a
ValueError: index must be monotonic increasing or decreasing
What's going on here?
EDIT: Note that the sample df intentionally has a non-monotonic index. The question pertains to the order of operations in df.reindex(newIndex, method='ffill'). My expectation is that it should work as the bug report says: first reindex with the new index, and then fill.
As you can see, newIndex.is_monotonic is True, and the fill works when called separately but fails when passed as a parameter to reindex.
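The two indexes involved can be checked directly. This is a minimal sketch, assuming a pandas version that provides Index.is_monotonic_increasing and Index.is_monotonic_decreasing (the older is_monotonic attribute was removed in pandas 2.0); the names target_sorted and source_sorted are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'values': 2},
    index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']),
)
newIndex = pd.DatetimeIndex(
    ['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05']
)

# The target index passed to reindex is sorted...
target_sorted = newIndex.is_monotonic_increasing

# ...but the index the frame already carries is neither increasing nor decreasing.
source_sorted = (
    df.index.is_monotonic_increasing or df.index.is_monotonic_decreasing
)

print(target_sorted, source_sorted)
```

This suggests the monotonic requirement in the error message concerns df.index rather than newIndex.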
Some element of reindex requires the incoming index to be sorted. I'm deducing that when method is passed, reindex does not presort the incoming index and subsequently fails. I'm drawing this conclusion based on the fact that this works:
print(df.sort_index().reindex(newIndex.sort_values(), method='ffill'))
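To see whether the sorted one-step call matches the two-step version, the sketch below compares them. The distinct values 10, 20, 30 are an assumption introduced here (the question uses a constant 2) so that each filled row can be traced back to its source row; newIndex is already sorted, so only the frame is sorted.

```python
import pandas as pd

# Hypothetical distinct values, so the origin of each filled row is visible.
df = pd.DataFrame(
    {'values': [10, 20, 30]},
    index=pd.DatetimeIndex(['2016-06-02', '2016-05-04', '2016-06-03']),
)
newIndex = pd.DatetimeIndex(
    ['2016-05-04', '2016-06-01', '2016-06-02', '2016-06-03', '2016-06-05']
)

# Two steps: align to the new index first, then forward-fill the gaps.
two_step = df.reindex(newIndex).ffill()

# One step: sort the frame's own index before asking reindex to fill.
one_step = df.sort_index().reindex(newIndex, method='ffill')

print(two_step['values'].tolist())
print(one_step['values'].tolist())
```

For this data both approaches give the same values, although the two-step result is float because NaN appears before the fill. They are not interchangeable in general: reindex with method only fills rows for labels that are new, while a later .ffill() also fills NaN values that were already in the data.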
It seems that this needs to be done on the columns as well.
In [76]: frame = DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])
In [77]: frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill', columns=states)
---> ValueError: index must be monotonic increasing or decreasing
In [78]: frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill', columns=states.sort())
Out[78]:
   Ohio  Texas  California
a     0      1           2
b     0      1           2
c     3      4           5
d     6      7           8
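One caveat about the session above: if states is a plain Python list, states.sort() sorts in place and returns None, so In [78] would effectively pass columns=None and leave the columns untouched, which would be consistent with the output shown. An alternative that avoids sorting the columns at all is to fill along the rows only and select the columns in a separate call. This is a sketch, not the answer's method, and the contents of states are hypothetical because the original list is not shown.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(9).reshape((3, 3)),
    index=['a', 'c', 'd'],
    columns=['Ohio', 'Texas', 'California'],
)

# Hypothetical column list; the original `states` is not shown above.
states = ['Texas', 'Utah', 'California']

# Forward-fill along the row index only; ['a', 'c', 'd'] is already sorted.
filled_rows = frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill')

# Select the columns without a fill method, so their order does not matter.
result = filled_rows.reindex(columns=states)

print(result)
```

Here row b is filled from row a, and the column that does not exist in frame ('Utah' in this sketch) is left as NaN instead of being filled from a neighbouring column.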
Source: https://stackoverflow.com/questions/37982170/pandas-reindex-and-fill-missing-values-index-must-be-monotonic