I used the tapply call below to get the median of Age for each Pclass.
tapply
How can I now replace the NA values in Age with the median for the corresponding Pclass?
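Try the following. ave() returns a vector the same length as Age that holds, for every row, the median of that row's Pclass group; the last line then copies those medians into the positions where Age is NA.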
set.seed(1)
df1 <- data.frame(
  Pclass = sample(1:3, 20, TRUE),
  Age = sample(c(NA, 20:40), 20, TRUE, prob = c(10, rep(1, 21)))
)
new <- ave(df1$Age, df1$Pclass, FUN = function(x) median(x, na.rm = TRUE))
df1$Age[is.na(df1$Age)] <- new[is.na(df1$Age)]
Final cleanup: remove the temporary vector.
rm(new)
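If you would rather skip the temporary vector, the replacement can probably be done inside the function passed to ave() instead. This is only a minimal sketch on a small made-up data frame (df2 and its values are hypothetical, not from your data), with tapply used alongside it to show the per-class medians being filled in.

```r
# Toy data (hypothetical values, chosen so the medians are easy to check)
df2 <- data.frame(
  Pclass = c(1, 1, 1, 2, 2, 3, 3, 3),
  Age    = c(30, NA, 40, 20, NA, NA, 22, 26)
)

# Per-class medians, ignoring NAs, as in the tapply approach from the question
med <- tapply(df2$Age, df2$Pclass, median, na.rm = TRUE)

# Within each Pclass group, overwrite the NAs with that group's median
df2$Age <- ave(df2$Age, df2$Pclass, FUN = function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
})
```

Note that if every Age in a class is NA, median(x, na.rm = TRUE) is itself NA, so those rows stay missing under either approach.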