I have a DataFrame which I'm loading from a CSV file and then setting the index to a few of its columns (usually two or three) with the set_index
method. The idea is to
Pandas provides:
d = d.sort_index()
print(d.index.is_lexsorted())  # sometimes True
which will do what you want in most cases. However, this always sorts the index but may leave it not 'lexsorted' (for example, if you have NaNs in the index), which generates a PerformanceWarning.
To avoid this:
d = d.sort_index(level=d.index.names)
print(d.index.is_lexsorted())  # True
... though why there's a difference doesn't seem to be documented.
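
For anyone who wants to try this end to end, here is a minimal sketch of the load / set_index / sort_index(level=...) sequence. The CSV contents and the column names (region, year, value) are made up for illustration, and the sortedness check uses is_monotonic_increasing rather than is_lexsorted(), on the assumption that you may be on a newer pandas release where is_lexsorted() is deprecated or unavailable.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real file.
csv_text = """region,year,value
west,2021,3
east,2020,1
west,2020,2
east,2021,4
"""

d = pd.read_csv(io.StringIO(csv_text))

# Use two of the columns as a MultiIndex.
d = d.set_index(["region", "year"])

# Sort on every index level explicitly, as suggested above.
d = d.sort_index(level=d.index.names)

# Assumption: is_monotonic_increasing is used as the sortedness check
# here in place of is_lexsorted().
print(d.index.is_monotonic_increasing)
```

This example has no NaNs in the index, so it only shows the happy path; it does not reproduce the case where a plain sort_index() leaves the index un-lexsorted.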