I have a data frame where some of the values are NA. I would like to remove the columns that contain them.
My data.frame looks like this:
  v1 v2
1  1 NA
2  1  1
3  2  2
4  1  1
5  2  2
6  1 NA
A base R method related to the apply answers is:
Itun[!unlist(vapply(Itun, anyNA, logical(1)))]
v1
1 1
2 1
3 2
4 1
5 2
6 1
Here, vapply is used because we are operating on a list and, unlike apply, it does not coerce the object into a matrix. Also, since we know that the output will be a logical vector of length 1, we can pass this to vapply and potentially get a little speed boost. For the same reason, I used anyNA instead of any(is.na()).
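To make the difference between apply and vapply concrete, here is a small sketch. The data frame `mixed` is made up for illustration and is not part of the question.

```r
# Hypothetical data frame with mixed column types (not from the question)
mixed <- data.frame(id = 1:3, label = c("a", NA, "c"), stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first; with mixed column
# types, every value is coerced to character
typeof(as.matrix(mixed))   # "character"

# vapply() visits the columns as list elements, so each keeps its own type,
# and the logical(1) template makes R check the shape of every result
has_na <- vapply(mixed, anyNA, logical(1))
has_na

# Keep only the columns without NA; list-style indexing returns a data frame
mixed[!has_na]
```

With only numeric columns, as in `Itun`, the coercion is harmless, but the column-wise approach avoids building the intermediate matrix.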
You can use transpose twice:
newdf <- t(na.omit(t(df)))
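Note that t() works on matrices, so the result of the double transpose is a matrix rather than a data frame, and columns of mixed types would be coerced to a common type. A minimal sketch, assuming the `Itun` data from this thread and wrapping the result in as.data.frame() to get a data frame back:

```r
Itun <- data.frame(v1 = c(1, 1, 2, 1, 2, 1), v2 = c(NA, 1, 2, 1, 2, NA))

# t() turns the data frame into a matrix with one row per original column;
# na.omit() then drops the rows (the original columns) that contain NA
m <- t(na.omit(t(Itun)))
is.matrix(m)   # TRUE: the result is a matrix, not a data frame

# Convert back if a data frame is needed
newdf <- as.data.frame(m)
names(newdf)
```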
Here's a convenient way to do it using the dplyr function select_if(). Combine not (!), any() and is.na(), which is equivalent to selecting all columns that don't contain any NA values.
library(dplyr)
Itun %>%
select_if(~ !any(is.na(.)))
Alternatively, select(where(~FUNCTION)) can be used:
library(dplyr)
(df <- data.frame(x = letters[1:5], y = NA, z = c(1:4, NA)))
#> x y z
#> 1 a NA 1
#> 2 b NA 2
#> 3 c NA 3
#> 4 d NA 4
#> 5 e NA NA
# Remove columns where all values are NA
df %>%
select(where(~!all(is.na(.))))
#> x z
#> 1 a 1
#> 2 b 2
#> 3 c 3
#> 4 d 4
#> 5 e NA
# Remove columns with at least one NA
df %>%
select(where(~!any(is.na(.))))
#> x
#> 1 a
#> 2 b
#> 3 c
#> 4 d
#> 5 e
The data:
Itun <- data.frame(v1 = c(1,1,2,1,2,1), v2 = c(NA, 1, 2, 1, 2, NA))
This will remove all columns containing at least one NA:
# drop = FALSE keeps the result a data frame even if only one column remains
Itun[ , colSums(is.na(Itun)) == 0, drop = FALSE]
An alternative way is to use apply:
Itun[ , apply(Itun, 2, function(x) !any(is.na(x))), drop = FALSE]
Another alternative, which needs no extra package, is the base R Filter function:
Filter(function(x) !any(is.na(x)), Itun)
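One detail worth checking with the `[ , cols]` forms: when only a single column survives, base R simplifies the result to a plain vector unless drop = FALSE is supplied, whereas Filter() always returns a data frame. A short sketch using the `Itun` data from above (the variable names are only illustrative):

```r
Itun <- data.frame(v1 = c(1, 1, 2, 1, 2, 1), v2 = c(NA, 1, 2, 1, 2, NA))
keep <- colSums(is.na(Itun)) == 0   # TRUE for columns without any NA

as_vector <- Itun[, keep]                         # only v1 is left: a numeric vector
as_df     <- Itun[, keep, drop = FALSE]           # stays a data frame
by_filter <- Filter(function(x) !anyNA(x), Itun)  # also a data frame

is.data.frame(as_vector)   # FALSE
is.data.frame(as_df)       # TRUE
is.data.frame(by_filter)   # TRUE
```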
With data.table, it would be a little more cumbersome:
library(data.table)
setDT(Itun)[, .SD, .SDcols = setdiff(1:ncol(Itun),
                                     which(colSums(is.na(Itun)) > 0))]