Pandas DataFrame - Replace NULL String with Blank and NULL Numeric with 0


I am working on a large dataset with many columns of different types. It contains a mix of numeric values and strings, with some NULL values. I need to change the NULL Value to

2 Answers
  •  栀梦
    栀梦 (original poster)
    2021-01-14 01:49

    Use DataFrame.select_dtypes to pick out the numeric columns and fill their missing values with 0, then replace the missing values in all other columns with an empty string:

    import numpy as np  # needed for np.number below

    print(df)
       0     1    2    3  4     5    6       7   8      9
    0  1  John  2.0  Doe  3  Mike  4.0  Orange   5  Stuff
    1  9   NaN  NaN  NaN  8   NaN  NaN   Lemon  12    NaN
    
    print(df.dtypes)
    0      int64
    1     object
    2    float64
    3     object
    4      int64
    5     object
    6    float64
    7     object
    8      int64
    9     object
    dtype: object
    
    # Fill NaN with 0 in the numeric columns only
    c = df.select_dtypes(np.number).columns
    df[c] = df[c].fillna(0)
    # Any NaN still left is in a non-numeric column
    df = df.fillna("")
    print(df)
       0     1    2    3  4     5    6       7   8      9
    0  1  John  2.0  Doe  3  Mike  4.0  Orange   5  Stuff
    1  9        0.0       8        0.0   Lemon  12       
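
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample frame can be rebuilt from the printed output above (the original post does not show how df was created), so the constructor call is a guess at equivalent data rather than the asker's real dataset.

```python
import numpy as np
import pandas as pd

# Assumption: sample data reconstructed from the printed frame, columns labelled 0..9.
df = pd.DataFrame(
    [
        [1, "John", 2.0, "Doe", 3, "Mike", 4.0, "Orange", 5, "Stuff"],
        [9, np.nan, np.nan, np.nan, 8, np.nan, np.nan, "Lemon", 12, np.nan],
    ]
)

# Step 1: numeric columns (int64 and float64 here) get 0 in place of NaN.
num_cols = df.select_dtypes(include=np.number).columns
df[num_cols] = df[num_cols].fillna(0)

# Step 2: the only NaN left is in non-numeric columns, so fill those with "".
df = df.fillna("")

print(df)
```

The order matters: calling df.fillna("") first would put empty strings into the numeric columns and turn them into object columns.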
    

    Another solution is to build a dictionary that maps each column to its fill value and pass it to fillna:

    # Map each column to its fill value: 0 for numeric, "" for everything else
    num_cols = df.select_dtypes(np.number).columns
    d1 = dict.fromkeys(num_cols, 0)
    d2 = dict.fromkeys(df.columns.difference(num_cols), "")

    d = {**d1, **d2}
    print(d)
    {0: 0, 2: 0, 4: 0, 6: 0, 8: 0, 1: '', 3: '', 5: '', 7: '', 9: ''}
    
    df = df.fillna(d)
    print(df)
       0     1    2    3  4     5    6       7   8      9
    0  1  John  2.0  Doe  3  Mike  4.0  Orange   5  Stuff
    1  9        0.0       8        0.0   Lemon  12       
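
The dictionary approach can also be wrapped in a small helper so it is reusable across frames. This is only a sketch: the function name fill_missing_by_dtype, its parameters, and the three-column sample frame are hypothetical and do not come from the original post.

```python
import numpy as np
import pandas as pd


def fill_missing_by_dtype(frame, num_fill=0, other_fill=""):
    """Return a copy of frame with NaN replaced per column.

    Numeric columns get num_fill, all other columns get other_fill.
    (Hypothetical helper, named here for illustration only.)
    """
    num_cols = frame.select_dtypes(include=np.number).columns
    other_cols = frame.columns.difference(num_cols)
    # One dict entry per column, as in the answer above.
    fill_values = {
        **dict.fromkeys(num_cols, num_fill),
        **dict.fromkeys(other_cols, other_fill),
    }
    return frame.fillna(fill_values)


# Made-up sample data for demonstration.
sample = pd.DataFrame(
    {"name": ["John", None], "score": [2.0, np.nan], "count": [3, 8]}
)
cleaned = fill_missing_by_dtype(sample)
print(cleaned)
```

Because fillna returns a new frame by default, the input is left untouched, which makes it easy to compare before and after.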
    
