I have a dataframe that looks as follows (the other columns have been dropped):
memberID shipping_country
264991
264991
For the following sample dataframe (I added a memberID group that only contains '' in the shipping_country column):
memberID shipping_country
0 264991
1 264991 Canada
2 100 USA
3 5000
4 5000 UK
5 54
This should work for you. It also has the behavior that if a memberID group only has empty string values ('') in shipping_country, those will be retained in the output df:
df['shipping_country'] = df.replace('',np.nan).groupby('memberID')['shipping_country'].transform('first').fillna('')
Yields:
memberID shipping_country
0 264991 Canada
1 264991 Canada
2 100 USA
3 5000 UK
4 5000 UK
5 54
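For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the blank cells in the sample are empty strings and that memberID is stored as integers; the one-liner itself is unchanged from above.

```python
import numpy as np
import pandas as pd

# Sample data; blank shipping_country cells are assumed to be empty strings.
df = pd.DataFrame({
    "memberID": [264991, 264991, 100, 5000, 5000, 54],
    "shipping_country": ["", "Canada", "USA", "", "UK", ""],
})

# Turn '' into NaN so that 'first' skips it, broadcast the first non-null
# country of each memberID group to every row of that group, then restore ''
# for groups that had no country at all.
df["shipping_country"] = (
    df.replace("", np.nan)
      .groupby("memberID")["shipping_country"]
      .transform("first")
      .fillna("")
)

print(df)
```

The replace step matters because the groupby 'first' aggregation returns the first non-null value per group, and an empty string is not null.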
If you would like to leave the empty strings '' as NaN in the output df, then just remove the fillna(''), leaving:
df['shipping_country'] = df.replace('',np.nan).groupby('memberID')['shipping_country'].transform('first')
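One possible use of the NaN variant is to list the members that have no country anywhere in their group. This is only a sketch on the same assumed sample data, and the names missing_mask and members_without_country are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same assumed sample: blank cells are empty strings.
df = pd.DataFrame({
    "memberID": [264991, 264991, 100, 5000, 5000, 54],
    "shipping_country": ["", "Canada", "USA", "", "UK", ""],
})

# Without fillna(''), groups that have no country stay missing.
df["shipping_country"] = (
    df.replace("", np.nan)
      .groupby("memberID")["shipping_country"]
      .transform("first")
)

# Hypothetical follow-up: collect the memberIDs that still lack a country.
missing_mask = df["shipping_country"].isna()
members_without_country = df.loc[missing_mask, "memberID"].unique().tolist()
print(members_without_country)
```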