I am a grad student using R and have been reading the other Stack Overflow answers regarding removing rows that contain NA from dataframes. I have tried both na.omit and comple
The error is that you didn't actually assign the output of na.omit back to the data frame:
Perios <- na.omit(Perios)
If you know which column the NAs occur in, then you can just do
Perios[!is.na(Perios$Periostitis),]
or more generally:
Perios[!is.na(Perios$colA) & !is.na(Perios$colD) & ... ,]
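The chain of !is.na() tests can also be written with complete.cases applied to only the columns you care about. A minimal sketch on a made-up data frame: colA and colD are the placeholder names used above, and colB and Perios_clean are hypothetical names added for illustration.

```r
# Toy data; column names are placeholders, not from the original question
Perios <- data.frame(
  colA = c(1, NA, 3, 4),
  colB = c(NA, "x", "y", "z"),
  colD = c(10, 20, NA, 40)
)

# Keep rows where colA and colD are both non-NA; NAs in colB are ignored
keep <- complete.cases(Perios[, c("colA", "colD")])
Perios_clean <- Perios[keep, ]

# Gives the same rows as the explicit !is.na() version
same <- identical(
  Perios_clean,
  Perios[!is.na(Perios$colA) & !is.na(Perios$colD), ]
)
```

This scales better than a long chain of & once more than two or three columns are involved, since only the vector of column names changes.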
Then as a general safety tip for R, throw in an na.fail to assert it worked:
na.fail(Perios) # trust, but verify! Paranoia is healthy.
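To see what that assertion does, here is a small sketch on a made-up data frame (df, res_before and res_after are illustrative names): na.fail signals an error while NAs are still present and returns the object unchanged once they are gone.

```r
# Toy data frame with one missing value (illustrative only)
df <- data.frame(a = c(1, 2, NA), b = c(3, 4, 5))

# Before cleaning: na.fail() signals an error, caught here for demonstration
res_before <- tryCatch(na.fail(df), error = function(e) "error")

# After na.omit() AND assignment, na.fail() passes and returns its input
df <- na.omit(df)
res_after <- na.fail(df)
```

In a script you would normally call na.fail without tryCatch, so that a leftover NA stops execution right away.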
is.na is not the right function here. You want complete.cases, which is equivalent to function(x) apply(!is.na(x), 1, all), or na.omit, to filter the data. That is, you want all rows that contain no NA values:
> x <- data.frame(a=c(1,2,NA), b=c(3,NA,NA))
> x
   a  b
1  1  3
2  2 NA
3 NA NA
> x[complete.cases(x),]
  a b
1 1 3
> na.omit(x)
  a b
1 1 3
Assign the result back to x to keep the filtered data.
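As a quick check, the logical vector from complete.cases can be rebuilt from is.na in a couple of ways, and the filtered result only sticks once it is assigned. This is a sketch using the same toy data frame; cc, by_apply and by_rowsums are names chosen for illustration.

```r
x <- data.frame(a = c(1, 2, NA), b = c(3, NA, NA))

# TRUE for rows with no NA at all
cc <- complete.cases(x)

# Two equivalent formulations built on is.na()
by_apply   <- unname(apply(!is.na(x), 1, all))  # every cell in the row is non-NA
by_rowsums <- unname(rowSums(is.na(x)) == 0)    # the row has zero NAs

all_equal <- identical(cc, by_apply) && identical(cc, by_rowsums)

# Filtering has no lasting effect until the result is assigned
x <- x[cc, ]
```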
complete.cases returns a logical vector with one element per row of the input data frame. is.na, on the other hand, returns a matrix. That is not appropriate for selecting complete cases, but it can be used to return all non-NA values as a vector:
> is.na(x)
         a     b
[1,] FALSE FALSE
[2,] FALSE  TRUE
[3,]  TRUE  TRUE
> x[!is.na(x)]
[1] 1 2 3