df1
has missing values:
df1=
ID age
1 12
2 na
3 23
4 na
5 na
6 na
Check where the nulls are and then impute the value at those places.
miss_bool = df1.age.isnull()
df2 = df2.set_index('Id')
df1.loc[miss_bool, 'age'] = df1.loc[miss_bool, 'Id'].apply(lambda x: df2.age[x])
You can set ID
as index for both dataframes, and then use the fillna()
method, which fill missing values, while matching the index of the two dataframes:
df1.set_index("ID").age.fillna(df2.set_index("ID").age).reset_index()
# ID age
#0 1 12
#1 2 4
#2 3 23
#3 4 5
#4 5 6
#5 6 7
Another option is, combine_first
, which takes values from the first dataframe, if not null
, otherwise takes values from the second dataframe with index and columns matched:
df1.set_index("ID").combine_first(df2.set_index("ID")).reset_index()
# ID age
#0 1 12.0
#1 2 4.0
#2 3 23.0
#3 4 5.0
#4 5 6.0
#5 6 7.0