Conditionally filling blank values in Pandas dataframes

误落风尘 2021-01-22 20:41

I have a dataframe that looks as follows (other columns have been dropped):

    memberID    shipping_country    
    264991      
    264991         


        
3 Answers
  •  -上瘾入骨i
    2021-01-22 21:07

    You can use chained groupbys, one with forward fill and one with backfill:

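    import numpy as np

    # replace blank values with `NaN` first (`pd.np` is gone in recent pandas):
    df['shipping_country'] = df['shipping_country'].replace('', np.nan)

    # forward fill, then backfill, within each memberID group:
    filled = df.groupby('memberID')['shipping_country'].ffill()
    df['shipping_country'] = filled.groupby(df['memberID']).bfill()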
    
       memberID shipping_country
    0    264991           Canada
    1    264991           Canada
    2       100              USA
    3      5000               UK
    4      5000               UK
    

    This method will also allow a group made up of all NaN to remain NaN:

    >>> df
       memberID shipping_country
    0    264991                 
    1    264991           Canada
    2       100              USA
    3      5000                 
    4      5000               UK
    5         1                 
    6         1                 
    
    # np is numpy (import numpy as np)
    df['shipping_country'] = df['shipping_country'].replace('', np.nan)

    filled = df.groupby('memberID')['shipping_country'].ffill()
    df['shipping_country'] = filled.groupby(df['memberID']).bfill()
    
       memberID shipping_country
    0    264991           Canada
    1    264991           Canada
    2       100              USA
    3      5000               UK
    4      5000               UK
    5         1              NaN
    6         1              NaN
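
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same forward-fill-then-backfill idea, using the sample rows shown above. The helper name `fill_within_group` is made up for illustration and is not part of pandas; it also assumes the blanks are empty strings rather than whitespace.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the example above; blanks are assumed to be empty strings.
df = pd.DataFrame({
    "memberID": [264991, 264991, 100, 5000, 5000, 1, 1],
    "shipping_country": ["", "Canada", "USA", "", "UK", "", ""],
})


def fill_within_group(frame, key, col):
    """Hypothetical helper: fill blanks in `col` from other rows sharing `key`."""
    # Turn empty strings into NaN so the fill methods treat them as missing.
    values = frame[col].replace("", np.nan)
    # Forward fill inside each group, then backfill whatever is still missing.
    forward = values.groupby(frame[key]).ffill()
    return forward.groupby(frame[key]).bfill()


df["shipping_country"] = fill_within_group(df, "memberID", "shipping_country")
print(df)
```

Grouping the column by `df[key]` keeps `memberID` in the frame and leaves the row order untouched, so the result can be assigned straight back. A group with no known country at all (memberID 1 here) stays missing.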
    
