What exactly is the lexsort_depth of a multi-index dataframe? Why does it have to be sorted for indexing?
For example, I have noticed that, after manual
lexsort_depth is the number of levels of a multi-index that are sorted lexically. That is, in an a-b-c-1-2-3 order (normal sort order).
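As a rough illustration of that definition, here is a minimal pure-Python sketch. The helper name `lexsort_depth_of` is hypothetical and is not part of pandas. It works on plain tuples, whereas pandas works on its internal integer codes, so treat it as a model of the idea rather than the actual implementation.

```python
def lexsort_depth_of(tuples):
    """Hypothetical helper: count how many leading levels are lexically sorted.

    Returns the largest k such that the tuples, truncated to their
    first k levels, are in non-decreasing order.
    """
    if not tuples:
        return 0
    nlevels = len(tuples[0])
    # Try the deepest prefix first, then progressively shallower ones.
    for k in range(nlevels, 0, -1):
        prefixes = [t[:k] for t in tuples]
        if all(a <= b for a, b in zip(prefixes, prefixes[1:])):
            return k
    return 0


fully_sorted = [("a", 1), ("a", 2), ("b", 1)]
first_level_only = [("a", 2), ("a", 1), ("b", 1)]
unsorted = [("b", 1), ("a", 2)]

print(lexsort_depth_of(fully_sorted))      # 2
print(lexsort_depth_of(first_level_only))  # 1
print(lexsort_depth_of(unsorted))          # 0
```

In the second case only the first level is in order, so the depth is 1 even though every key is still present and can be looked up.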
So element indexing will still work if a multi-index is not sorted, but the lookups may be quite a bit slower (in 0.15.2, this will show a PerformanceWarning for doing these kinds of lookups, see here).
The reason that sorting is in general a good idea is that pandas is able to use hash-based indexing to figure out where the location is within a particular level, independently for each level; you can then use these indexers to find the final locations.
Pandas takes advantage of np.searchsorted to find these locations when it's sorted. If it's not sorted, then you have to fall back to some different (slower) methods.
Here is the code that does this.
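To make the np.searchsorted point concrete, here is a small sketch using NumPy only. The function names `locate_sorted` and `locate_unsorted` and the sample arrays are made up for illustration, and the integer arrays stand in for the per-level codes of an index. This is not the pandas code itself, just the general idea of a binary search on sorted data versus a full scan on unsorted data.

```python
import numpy as np


def locate_sorted(codes, key):
    """Binary search: valid only when `codes` is sorted.

    Returns a slice covering the contiguous run of entries equal to `key`.
    """
    start = np.searchsorted(codes, key, side="left")
    stop = np.searchsorted(codes, key, side="right")
    return slice(int(start), int(stop))


def locate_unsorted(codes, key):
    """Fallback: compare every element and collect the matching positions."""
    return np.flatnonzero(codes == key)


sorted_codes = np.array([0, 0, 1, 1, 1, 2])
unsorted_codes = np.array([1, 0, 1, 2, 0, 1])

print(locate_sorted(sorted_codes, 1))      # slice(2, 5, None)
print(locate_unsorted(unsorted_codes, 1))  # [0 2 5]
```

The sorted case needs two binary searches and yields a single contiguous slice, while the unsorted case has to touch every element and returns scattered positions.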