Ignore NA's in sapply function

怎甘沉沦 提交于 2019-12-10 15:09:22

问题


I am using R and have searched around for an answer but while I have seen similar questions, it has not worked for my specific problem.

In my data set I am trying to use the NA's as placeholders because I am going to return to them once I get part of my analysis done so therefore, I would like to be able to do all my calculations as if the NA's weren't really there.

Here's my issue with an example data table

ROCA = c(1,3,6,2,1,NA,2,NA,1,NA,4,NA)
ROCA <- data.frame (ROCA=ROCA)       # converting it just because that is the format of my original data

#Now my function
exceedes <- function (L=NULL, R=NULL, na.rm = T)
 {
    if (is.null(L) | is.null(R)) {
        print ("mycols: invalid L,R.")
        return (NULL)               
    }
    test <-(mean(L, na.rm=TRUE)-R*sd(L,na.rm=TRUE))
  test1 <- sapply(L,function(x) if((x)> test){1} else {0})
  return (test1)
}
L=ROCA[,1]
R=.5
ROCA$newcolumn <- exceedes(L,R)
names(ROCA)[names(ROCA)=="newcolumn"]="Exceedes1"

I am getting the error:

Error in if ((x) > test) { : missing value where TRUE/FALSE needed 

As you guys know, it is something wrong with the sapply function. Any ideas on how to ignore those NA's? I would try na.omit if I could get it to insert all the NA's right where they were before, but I am not sure how to do that.


回答1:


This statement is strange:

test1 <- sapply(L,function(x) if((x)> test){1} else {0})

Try:

test1 <- ifelse(is.na(L), NA, ifelse(L > test, 1, 0))



回答2:


There's no need for sapply and your anonymous function because > is already vectorized.

It also seems really odd to specify default argument values that are invalid. My guess is that you're using that as a kludge instead of using the missing function. It's also good practice to throw an error rather than return NULL because you would still have to try to catch when the function returns NULL.

exceedes <- function (L, R, na.rm=TRUE)
{
  if(missing(L) || missing(R)) {
    stop("L and R must be provided")
  }
  test <- mean(L,na.rm=TRUE)-R*sd(L,na.rm=TRUE)
  as.numeric(L > test)
}

ROCA <- data.frame(ROCA=c(1,3,6,2,1,NA,2,NA,1,NA,4,NA))
ROCA$Exceeds1 <- exceedes(ROCA[,1],0.5)



回答3:


Do you want NA:s in the result? That is, do you want the rows to line up?

seems like just returning L > test would work then. And adding the column can be simplified too (I suspect "Exeedes1" is in a variable somewhere).

exceedes <- function (L=NULL, R=NULL, na.rm = T)
 {
    if (is.null(L) | is.null(R)) {
        print ("mycols: invalid L,R.")
        return (NULL)               
    }
    test <-(mean(L, na.rm=TRUE)-R*sd(L,na.rm=TRUE))

    L > test
}
L=ROCA[,1]
R=.5
ROCA[["Exceedes1"]] <- exceedes(L,R)


来源:https://stackoverflow.com/questions/6499905/ignore-nas-in-sapply-function

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!