Having NA level for missing values with cut function from R

Submitted by 白昼怎懂夜的黑 on 2019-12-08 21:43:03

Question


The cut function in R does not create a level for NA values, but I want to have a level for missing values. Here is my MWE.

set.seed(12345)
Y <- c(rnorm(n = 50, mean = 500, sd = 1), NA)
Y1 <-  cut(log(Y), 5)
Labs <- levels(Y1)
Labs

[1] "(6.21,6.212]"  "(6.212,6.213]" "(6.213,6.215]" "(6.215,6.217]" "(6.217,6.219]"

Desired Output

[1] "(6.21,6.212]"  "(6.212,6.213]" "(6.213,6.215]" "(6.215,6.217]" "(6.217,6.219]" "NA"

Answer 1:


You could use addNA, which adds NA as an extra factor level:

 Labs <- levels(addNA(Y1))
 Labs
#[1] "(6.21,6.212]"  "(6.212,6.213]" "(6.213,6.215]" "(6.215,6.217]"
#[5] "(6.217,6.219]" NA

Your expected output shows the character string "NA", but I think a real NA is better, because it can be detected with is.na and then removed or replaced:

 is.na(Labs)
 #[1] FALSE FALSE FALSE FALSE FALSE  TRUE
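
As a minimal sketch of what the extra level buys in practice, the snippet below reuses Y and Y1 from the question. The names Y1_na and Y1_chr are introduced here only for illustration. It shows that addNA() appends NA as a sixth level, that table() then counts the missing value in its own cell, and how the NA level could be renamed to the string "NA" if the literal label from the desired output is really wanted.

```r
set.seed(12345)
Y  <- c(rnorm(n = 50, mean = 500, sd = 1), NA)
Y1 <- cut(log(Y), 5)

# addNA() returns a factor with NA appended as an extra level
Y1_na <- addNA(Y1)

nlevels(Y1)      # the 5 interval levels
nlevels(Y1_na)   # the same intervals plus the NA level

# the missing value now gets its own cell in the table
table(Y1_na)

# optional: turn the real NA level into the character label "NA"
Y1_chr <- Y1_na
levels(Y1_chr)[is.na(levels(Y1_chr))] <- "NA"
levels(Y1_chr)
```

Keeping the real NA level (Y1_na) is usually the safer choice, since is.na(levels(...)) can still find it later. The renamed version (Y1_chr) is only a label change for display.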



Answer 2:


Changing the third line of the original MWE to the following stores NA (actually <NA>) as a level of Y1 itself rather than only in the external vector Labs. This simplifies analytic tasks such as making tables or building models. The NA level is still recognized by is.na().

Y1 <-  factor(cut(log(Y), 5), exclude=NULL)
is.na(levels(Y1))

result:

[1] FALSE FALSE FALSE FALSE FALSE  TRUE
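
One difference between the two answers is worth a small sketch. The vector x and the breaks below are made up for illustration and are not from the question. Because factor() rebuilds the levels from the values that actually occur, wrapping cut() in factor(..., exclude = NULL) drops any empty bin, while addNA() keeps every interval and only appends NA.

```r
# hypothetical data: the middle bin (3,6] receives no observations
x <- c(1, 2, 9, 10, NA)
f <- cut(x, breaks = c(0, 3, 6, 12))

lev_addna  <- levels(addNA(f))                   # keeps the empty bin, appends NA
lev_factor <- levels(factor(f, exclude = NULL))  # empty bin is dropped, NA is kept

lev_addna
lev_factor
```

With the question's data the two approaches should agree as long as all five bins are populated. The distinction only matters when some interval is empty and still needs to appear in tables or model matrices.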


Source: https://stackoverflow.com/questions/31704687/having-na-level-for-missing-values-with-cut-function-from-r
