What are the most common pandas ways to select/filter rows of a dataframe whose index is a MultiIndex?
Recently I came across a use case with a 3+ level MultiIndex dataframe where I couldn't get any of the solutions above to produce the results I was looking for. It's quite possible that they do work for my use case; I tried several, but could not get them working in the time I had available.
I am far from an expert, but I stumbled across a solution that is not listed in the comprehensive answers above. I offer no guarantee that it is in any way optimal.
This is a different way to get a slightly different result from Question #6 above (and likely other questions as well).
Specifically I was looking for:
As a monkey wrench in the gears (however totally fixable):
On the toy dataframe below:
import pandas as pd

# Three-level index: group, stock, measurement
index = pd.MultiIndex.from_product([['a', 'b'],
                                    ['stock1', 'stock2', 'stock3'],
                                    ['price', 'volume', 'velocity']])
df = pd.DataFrame([1, 2, 3, 4, 5, 6, 7, 8, 9,
                   10, 11, 12, 13, 14, 15, 16, 17, 18],
                  index=index)
                    0
a stock1 price      1
         volume     2
         velocity   3
  stock2 price      4
         volume     5
         velocity   6
  stock3 price      7
         volume     8
         velocity   9
b stock1 price     10
         volume    11
         velocity  12
  stock2 price     13
         volume    14
         velocity  15
  stock3 price     16
         volume    17
         velocity  18
Using xs works, of course, but it drops the selected levels from the index:
df.xs(('stock1', 'velocity'), level=(1,2))
    0
a   3
b  12
But I wanted a result that keeps all of the index levels, so my method to get that result was:
df.iloc[df.index.isin(['stock1'], level=1) &
        df.index.isin(['velocity'], level=2)]
                    0
a stock1 velocity   3
b stock1 velocity  12
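For comparison, here is a minimal, self-contained sketch of the same selection. It is not a replacement for the approach above, just an illustration. It assumes the toy dataframe from this answer, and the variable names (mask, by_iloc, by_mask, by_xs) are my own. It shows that isin with level= yields an ordinary boolean array, so plain boolean indexing selects the same rows as iloc, and that xs can also keep the selected levels if drop_level=False is passed.

```python
import pandas as pd

index = pd.MultiIndex.from_product([['a', 'b'],
                                    ['stock1', 'stock2', 'stock3'],
                                    ['price', 'volume', 'velocity']])
df = pd.DataFrame(list(range(1, 19)), index=index)

# Index.isin with level= returns a boolean NumPy array, one entry per row,
# so the two conditions can be combined with & like any other masks.
mask = (df.index.isin(['stock1'], level=1) &
        df.index.isin(['velocity'], level=2))

by_iloc = df.iloc[mask]   # the approach used in this answer
by_mask = df[mask]        # plain boolean indexing selects the same rows

# xs keeps the selected levels when drop_level=False is passed.
by_xs = df.xs(('stock1', 'velocity'), level=(1, 2), drop_level=False)

print(by_mask)
```

Building the mask as a separate variable also makes it easy to inspect or reuse when the conditions get longer.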
And if I wanted two or more values from one level and a single value (or two or more) from another level:
df.iloc[df.index.isin(['stock1', 'stock3'], level=1) &
        df.index.isin(['velocity'], level=2)]
                    0
a stock1 velocity   3
  stock3 velocity   9
b stock1 velocity  12
  stock3 velocity  18
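If the same pattern is needed for many level/value combinations, it could be wrapped in a small function. The sketch below is only one possible way to do that: select_by_levels and its wanted argument are hypothetical names of my own, not part of pandas, and it again assumes the toy dataframe from this answer.

```python
import numpy as np
import pandas as pd

def select_by_levels(frame, wanted):
    """Hypothetical helper: keep rows whose index matches every level filter.

    `wanted` maps a level number (or level name) to the labels to keep.
    """
    masks = [frame.index.isin(labels, level=level)
             for level, labels in wanted.items()]
    if not masks:
        # No filters given, so nothing to exclude.
        return frame
    # AND all of the per-level boolean masks together.
    return frame[np.logical_and.reduce(masks)]

index = pd.MultiIndex.from_product([['a', 'b'],
                                    ['stock1', 'stock2', 'stock3'],
                                    ['price', 'volume', 'velocity']])
df = pd.DataFrame(list(range(1, 19)), index=index)

result = select_by_levels(df, {1: ['stock1', 'stock3'], 2: ['velocity']})
print(result)
```

Each level is filtered independently and the masks are combined at the end, so adding a third or fourth level is just another entry in the dictionary.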
The above method is probably a bit clunky; however, I found that it met my needs and, as a bonus, was easier for me to read and understand.