Function to replace NaN values in a dataframe with mean of the related column

甜味超标 2021-01-14 21:52

EDIT: This question is not a clone of pandas dataframe replace nan values with average of columns because I want to replace the value of each column with th

3 Answers
  • 2021-01-14 22:12

    You can also use fillna

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [2, np.nan, np.nan]})
    df.fillna(df.mean(axis=0))
         A    B
    0  1.0  2.0
    1  2.0  2.0
    2  1.5  2.0
    

    df.mean(axis=0) computes the mean for every column, and this is passed to the fillna method.

    On my machine, this solution is twice as fast as the one using apply for the data set shown above.
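The column-wise alignment described above can be sketched as a small helper. This is only an illustration: the function name `fill_with_column_means` and the extra `label` column are assumptions, not part of the original answer. `numeric_only=True` is used so that non-numeric columns, which have no mean, are skipped.

```python
import numpy as np
import pandas as pd


def fill_with_column_means(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with NaN in numeric columns replaced by the column mean.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # numeric_only=True skips non-numeric columns, which have no mean.
    column_means = frame.mean(numeric_only=True)
    # fillna matches the Series index against the column labels, so each
    # column gets its own mean; columns missing from the Series are left alone.
    return frame.fillna(column_means)


df = pd.DataFrame({
    "A": [1, 2, np.nan],
    "B": [2, np.nan, np.nan],
    "label": ["x", None, "z"],  # assumed non-numeric column, stays unfilled
})
filled = fill_with_column_means(df)
print(filled)
```

Because `fillna` returns a new DataFrame here, the original `df` keeps its NaN values unless the result is assigned back.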

  • 2021-01-14 22:14

    You can try something like:

    for col in df.columns:
        df[col] = df[col].fillna(df[col].mean())
    

    That is just one way to do it, though. Your code is almost correct; the error is that you should write

    train[value]
    

    instead of

    train['value']
    

    everywhere in your code. The latter looks for a column literally named "value", whereas value is the loop variable that holds each column name from the list you are iterating over.
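A minimal sketch of the difference, assuming a stand-in `train` DataFrame with made-up columns `age` and `fare` (the asker's real data and column list are not shown in the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `train` DataFrame.
train = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "fare": [np.nan, 8.0, 12.0],
})

for value in ["age", "fare"]:
    # train[value] uses the loop variable: the column "age", then "fare".
    # train['value'] would raise KeyError, since no column is named "value".
    train[value] = train[value].fillna(train[value].mean())

print(train)
```

Assigning the result back to `train[value]` avoids relying on `inplace=True` on a selected column, which may not update the parent DataFrame in recent pandas versions.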

  • 2021-01-14 22:26

    To fill the NaN values of each column with that column's mean, use:

    df.apply(lambda x: x.fillna(x.mean())) 
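As a quick sanity check, the `apply` version can be compared against a single `fillna` call on the same small frame. This sketch assumes all columns are numeric; the variable names `via_apply` and `via_fillna` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan], "B": [2, np.nan, np.nan]})

# Column-wise apply: the lambda receives each column as a Series.
via_apply = df.apply(lambda x: x.fillna(x.mean()))
# Vectorized alternative: one fillna call with a Series of column means.
via_fillna = df.fillna(df.mean())

# Raises AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_apply, via_fillna)
print(via_apply)
```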
    