问题
I have a data frame called df1 with a 2-level MultiIndex (levels: '_Date' and _'ItemId'). There are multiple instances of each value of '_ItemId', like this:
_SomeOtherLabel
_Date _ItemId
2014-10-05 6588921 AA
6592520 AB
6836143 BA
2014-10-11 6588921 CA
6592520 CB
6836143 DA
I have a second data frame called df2 with '_ItemId' used as a key (not the index). In this df, there is only one occurrence of each value of _ItemId:
_ItemId _Cat
0 6588921 6_1
1 6592520 6_1
2 6836143 7_1
I want to recover the values in the column '_Cat' from df2 and merge them into df1 for the appropriate values of '_ItemId'. This is almost (I think?) a standard many-to-one merge, except that the appropriate key for the left df is one of MultiIndex levels. I tried this:
df1['_cat']=pd.merge(df1,df2,left_index=True, right_on='ItemId')
but I get the error
"ValueError: len(right_on) must equal the number of levels in the index of "left"
which I suppose makes sense since my (left) index is actually made of two keys. How do I select the one index level that I need? Or is there a better approach to this merge?
Thanks
回答1:
I could think of 2 ways of doing this.
use set_index() and join():
>>> df1.join(df2.set_index('_ItemId'))
_SomeOtherLabel _Cat
_Date _ItemId
2014-10-05 6588921 AA 6_1
6592520 AB 6_1
6836143 BA 7_1
2014-10-11 6588921 CA 6_1
6592520 CB 6_1
6836143 DA 7_1
or use reset_index()
, merge()
and then set new multiindex
I think first approach should be faster, but not sure.
来源:https://stackoverflow.com/questions/27251301/pandas-merge-with-multiindex-when-only-one-level-of-index-is-to-be-used-as-key