Filling na values with merge from another dataframe

扶醉桌前 submitted on 2021-02-18 12:11:07

Question


I have a column with NaN values that I want to fill with values from another data frame, matched on a key. Is there a simple way to do this?

Example: I have a data frame of objects and their colors like this:

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball     NaN
4  chair   white
5  chair     NaN
6   ball    grey

I want to fill the NaN values in the color column with the default color from the following data frame:

  object default_color
0  chair         brown
1   ball          blue
2   door          grey

So the result will be this:

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey

Is there any "easy" way to do this?

Thanks :)


Answer 1:


First create a Series of default colors by mapping the object column, then use it to replace the NaNs:

s = df1['object'].map(df2.set_index('object')['default_color'])
print (s)
0    brown
1     blue
2     grey
3     blue
4    brown
5    brown
6     blue
Name: object, dtype: object
df1['color']= df1['color'].mask(df1['color'].isnull(), s)

Or:

df1.loc[df1['color'].isnull(), 'color'] = s

Or:

df1['color'] = df1['color'].combine_first(s)

Or:

df1['color'] = df1['color'].fillna(s)

print (df1)
  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey
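
To make the variants above easy to try, here is a minimal, self-contained sketch that rebuilds the two sample frames from the question and applies the map + fillna version. The name defaults is introduced only for illustration; the other three variants (mask, loc, combine_first) should be drop-in replacements for the fillna line.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df1 = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Lookup Series: index = object, values = default_color
# ("defaults" is an illustrative name, not from the original answer)
defaults = df2.set_index("object")["default_color"]

# One default color per row of df1, aligned on df1's index
s = df1["object"].map(defaults)

# Only the missing entries are taken from s; existing colors are kept
df1["color"] = df1["color"].fillna(s)
print(df1)
```

Because s shares df1's index, fillna can align the two row by row, which is why duplicate values in object are not a problem here.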

If the values in object are unique in both data frames, you can align on the index instead:

df = (df1.set_index('object')['color']
         .combine_first(df2.set_index('object')['default_color'])
         .reset_index())

Or:

df = (df1.set_index('object')['color']
         .fillna(df2.set_index('object')['default_color'])
         .reset_index())
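
The index-based variant only applies when each object appears once, which is not the case in the question's sample data. A small sketch with hypothetical frames (items and item_defaults are made-up names and data) shows the fillna form; note that a chain spread over several lines has to be wrapped in parentheses to be valid Python.

```python
import numpy as np
import pandas as pd

# Hypothetical data in which every object appears exactly once
items = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "color": ["black", np.nan, np.nan],
})
item_defaults = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Both Series are indexed by object, so fillna aligns them by key
filled = (
    items.set_index("object")["color"]
    .fillna(item_defaults.set_index("object")["default_color"])
    .reset_index()
)
print(filled)
```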



Answer 2:


Use np.where together with map, setting the key column of df2 as the index (here df is the first data frame and numpy is imported as np), i.e.

df['color']= np.where(df['color'].isnull(),df['object'].map(df2.set_index('object')['default_color']),df['color'])

Or use Series.where:

df['color'] = df['color'].where(df['color'].notnull(), df['object'].map(df2.set_index('object')['default_color'])) 
  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey
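
For reference, a runnable sketch of the np.where variant on the question's sample data. The intermediate names is_missing and mapped are illustrative and only there to keep the line readable.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "object": ["chair", "ball", "door", "ball", "chair", "chair", "ball"],
    "color": ["black", "yellow", "brown", np.nan, "white", np.nan, "grey"],
})
df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

is_missing = df["color"].isnull()                                # True where color is NaN
mapped = df["object"].map(df2.set_index("object")["default_color"])  # default per row

# Elementwise choice: the default where color is missing, the original color otherwise
df["color"] = np.where(is_missing, mapped, df["color"])
print(df)
```

np.where works positionally on the underlying arrays, so it relies on mapped and df['color'] having the same length and row order, which holds here because both come from df.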



Answer 3:


Using loc + map:

m = df.color.isnull()
df.loc[m, 'color'] = df.loc[m, 'object'].map(df2.set_index('object').default_color)

df

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball    blue
4  chair   white
5  chair   brown
6   ball    grey

If you're going to be doing a lot of these replacements, you should call set_index on df2 just once and save its result.
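
A rough sketch of that advice, assuming several frames need the same treatment: the lookup Series is built once and passed to a small helper. The helper fill_color and the batch frames are hypothetical names and data, not part of the original answer.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    "object": ["chair", "ball", "door"],
    "default_color": ["brown", "blue", "grey"],
})

# Build the lookup once and reuse it for every frame
lookup = df2.set_index("object")["default_color"]

def fill_color(frame, lookup):
    """Return a copy of frame with missing colors filled from lookup (hypothetical helper)."""
    out = frame.copy()
    m = out["color"].isnull()
    # Map only the rows that need a value; assignment aligns on the row index
    out.loc[m, "color"] = out.loc[m, "object"].map(lookup)
    return out

# Two made-up batches; "lamp" has no entry in df2
batch_a = pd.DataFrame({"object": ["ball", "chair", "door"],
                        "color": [np.nan, "white", np.nan]})
batch_b = pd.DataFrame({"object": ["lamp", "door", "ball"],
                        "color": [np.nan, np.nan, "red"]})

filled_a = fill_color(batch_a, lookup)
filled_b = fill_color(batch_b, lookup)
print(filled_a)
print(filled_b)
```

Objects without a default, such as "lamp" in this sketch, map to NaN and therefore stay missing.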



Source: https://stackoverflow.com/questions/47176420/filling-na-values-with-merge-from-another-dataframe
