问题
I have a column with na values that I want to fill according to values from another data frame according to a key. I was wondering if there is any simple way to do so.
Example: I have a data frame of objects and their colors like this:
object color
0 chair black
1 ball yellow
2 door brown
3 ball **NaN**
4 chair white
5 chair **NaN**
6 ball grey
I want to fill na values in the color column with default color from the following data frame:
object default_color
0 chair brown
1 ball blue
2 door grey
So the result will be this:
object color
0 chair black
1 ball yellow
2 door brown
3 ball **blue**
4 chair white
5 chair **brown**
6 ball grey
Is there any "easy" way to do this?
Thanks :)
回答1:
First create Series and then replace NaN
s:
s = df1['object'].map(df2.set_index('object')['default_color'])
print (s)
0 brown
1 blue
2 grey
3 blue
4 brown
5 brown
6 blue
Name: object, dtype: object
df1['color']= df1['color'].mask(df1['color'].isnull(), s)
Or:
df1.loc[df1['color'].isnull(), 'color'] = s
Or:
df1['color'] = df1['color'].combine_first(s)
Or:
df1['color'] = df1['color'].fillna(s)
print (df1)
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
If unique values in object
:
df = df1.set_index('object')['color']
.combine_first(df2.set_index('object')['default_color'])
.reset_index()
Or:
df = df1.set_index('object')['color']
.fillna(df2.set_index('object')['default_color'])
.reset_index()
回答2:
Use np.where
and mapping by setting a column as index i.e
df['color']= np.where(df['color'].isnull(),df['object'].map(df2.set_index('object')['default_color']),df['color'])
or df.where
df['color'] = df['color'].where(df['color'].notnull(), df['object'].map(df2.set_index('object')['default_color']))
object color 0 chair black 1 ball yellow 2 door brown 3 ball blue 4 chair white 5 chair brown 6 ball grey
回答3:
Using loc
+ map
:
m = df.color.isnull()
df.loc[m, 'color'] = df.loc[m, 'object'].map(df2.set_index('object').default_color)
df
object color
0 chair black
1 ball yellow
2 door brown
3 ball blue
4 chair white
5 chair brown
6 ball grey
If you're going to be doing a lot of these replacements, you should call set_index
on df2
just once and save its result.
来源:https://stackoverflow.com/questions/47176420/filling-na-values-with-merge-from-another-dataframe