How do I replace all NA with mean in R? [duplicate]

Question

I have over 1500 columns in my dataset, and 100+ of them contain at least one NA. I know I can replace the NAs in a single column with

d$var[is.na(d$var)] <- mean(d$var, na.rm=TRUE)

but how do I do this to ALL the NAs in my dataset?

Thank you!

Answer 1:

We can use na.aggregate from the zoo package. Loop over the columns of the dataset with lapply (assuming all the columns are numeric), apply na.aggregate to each one, which replaces NA with the column mean by default, and assign the result back to the dataset.

library(zoo)
df[] <- lapply(df, na.aggregate)

By default, the FUN argument of na.aggregate is mean:

## Default S3 method:
na.aggregate(object, by = 1, ..., FUN = mean, na.rm = FALSE, maxgap = Inf)
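For readers who want to see what the default amounts to without installing zoo, here is a minimal base R sketch of the same column-wise loop. The helper name fill_mean and the toy data frame are made up for illustration and are not part of zoo; the body is just the single-column replacement from the question wrapped in a function.

```r
# Hypothetical helper (not part of zoo): fill the NAs in one numeric
# vector with the mean of its observed values.
fill_mean <- function(x) {
  # If every value is NA, mean(x, na.rm = TRUE) is NaN, so the
  # column would end up filled with NaN.
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Small all-numeric data frame, made up for illustration.
df <- data.frame(a = c(1, NA, 3), b = c(NA, 10, 20))

# Assigning into df[] keeps the data.frame structure while every
# column is replaced by its filled version.
df[] <- lapply(df, fill_mean)
print(df)
```

With this toy input, column a becomes 1, 2, 3 and column b becomes 15, 10, 20.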

To do this nondestructively:

df2 <- df
df2[] <- lapply(df2, na.aggregate)

or in one line:

df2 <- replace(df, TRUE, lapply(df, na.aggregate))

If there are non-numeric columns, do this only for the numeric columns by creating a logical index first:

ok <- sapply(df, is.numeric)
df[ok] <- lapply(df[ok], na.aggregate)
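As a small worked example of the numeric-only variant, the sketch below runs the two lines above on a toy data frame with one character column. The data and column names are made up for illustration, and the zoo package is assumed to be installed.

```r
library(zoo)

# Toy data frame, made up for illustration: one character column
# and two numeric columns that each contain an NA.
df <- data.frame(
  id = c("a", "b", "c"),
  x  = c(1, NA, 3),
  y  = c(NA, 10, 20),
  stringsAsFactors = FALSE
)

# Logical index marking which columns are numeric.
ok <- sapply(df, is.numeric)

# Fill NAs only in the numeric columns; the id column is left untouched.
df[ok] <- lapply(df[ok], na.aggregate)
print(df)
```

Here ok is FALSE for id and TRUE for x and y, so only x and y are passed to na.aggregate.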

Source: https://stackoverflow.com/questions/41195485/how-do-i-replace-all-na-with-mean-in-r

Tags

missing-data
