Recoding variables in R with a lookup table

Submitted by 纵饮孤独 on 2019-12-10 00:01:07

Question


I have a question about recoding data. I would like to use a lookup table, and I am wondering how to recode NA and how to match several values at once, similar to %in%.

Sample data:

gender <- c("Female", "Not Disclosed", "Unknown", "Male", "Male", "Female", NA)
# Build the data frame with a character (not factor) column in one step
df_gender <- data.frame(gender = gender, stringsAsFactors = FALSE)

My first approach to recode is:

df_gender$gender[df_gender$gender == "Female"] <- "F"
df_gender$gender[df_gender$gender == "Male"] <- "M"
df_gender$gender[df_gender$gender %in% c("Unknown", "Not Disclosed", NA)] <- "Missing"

This approach works, but it is tedious when there are many variables and it takes many lines of code. I would like to use a lookup table instead, as in this second approach I tried:

# Same sample data, again stored as a character column
df_gender2 <- data.frame(gender = gender, stringsAsFactors = FALSE)

gender_lookup <- c(Female = "F", Male = "M", Unknown = "Missing", "Not Disclosed" = "Missing")
df_gender2$gender <- gender_lookup[df_gender2$gender]

This works for the values listed in the table, but NA is not recoded to "Missing" (it stays NA). I have two questions. First, is there a way to map both "Not Disclosed" and "Unknown" to "Missing" without typing each mapping separately? Second, using a lookup table, is there a way to also recode NA to "Missing"?
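
For illustration, here is a minimal sketch of one way both points might be handled in base R. It is not taken from an accepted answer. The names gender_groups, gender_lookup2, recoded and df_gender3 are hypothetical and introduced only for this example. The idea is to write the table as "new code = old values" so that several old values can share one entry, flatten it into the same kind of named vector as above, and then fill in whatever comes back as NA after the lookup.

```r
# Sample data from the question
gender <- c("Female", "Not Disclosed", "Unknown", "Male", "Male", "Female", NA)
df_gender3 <- data.frame(gender = gender, stringsAsFactors = FALSE)

# Hypothetical grouped specification: each new code lists one or more old values
gender_groups <- list(
  F       = "Female",
  M       = "Male",
  Missing = c("Unknown", "Not Disclosed")
)

# Flatten into a named character vector: names = old values, values = new codes
gender_lookup2 <- setNames(
  rep(names(gender_groups), lengths(gender_groups)),
  unlist(gender_groups, use.names = FALSE)
)

# Look up each value; unname() drops the names carried over from the lookup vector
recoded <- unname(gender_lookup2[df_gender3$gender])

# NA input (and any value absent from the table) comes back as NA; recode it here
recoded[is.na(recoded)] <- "Missing"
df_gender3$gender <- recoded
```

One caveat with this sketch: the final step cannot tell a genuine NA apart from a value that is simply absent from the table (a typo such as "female", say), so both end up as "Missing". If that distinction matters, the unmatched values would need to be checked before the replacement.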

Source: https://stackoverflow.com/questions/42823269/recoding-variables-in-r-with-a-lookup-table
