In R, what is the most efficient/idiomatic way to count the number of TRUE
values in a logical vector? I can think of two ways:
z <- sample(c
Another option which hasn't been mentioned is to use which:
length(which(z))
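As a quick sanity check, the sketch below compares this with the other approaches discussed in this thread on a small hand-written vector. The vector and the names n_sum, n_which, n_subset and n_table are made up for illustration. With no NA values present, all of them should give the same count.

```r
# Illustrative vector (an assumption, not the z from the question)
z <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

n_sum    <- sum(z)                         # TRUE counts as 1, FALSE as 0
n_which  <- length(which(z))               # number of positions holding TRUE
n_subset <- length(z[z == TRUE])           # length of the TRUE-only subset
n_table  <- as.integer(table(z)["TRUE"])   # "TRUE" cell of the frequency table

# All four agree as long as z contains no NA
print(c(sum = n_sum, which = n_which, subset = n_subset, table = n_table))
```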
Just to provide some context on the "which is faster" question, it's always easiest to test it yourself. I made the vector much larger for comparison:
z <- sample(c(TRUE,FALSE),1000000,rep=TRUE)
system.time(sum(z))
user system elapsed
0.03 0.00 0.03
system.time(length(z[z==TRUE]))
user system elapsed
0.75 0.07 0.83
system.time(length(which(z)))
user system elapsed
1.34 0.28 1.64
system.time(table(z)["TRUE"])
user system elapsed
10.62 0.52 11.19
So clearly using sum is the best approach in this case. You may also want to check for NA values as Marek suggested.
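A single system.time call can be noisy, so one way to make the comparison steadier is to repeat each expression several times. The following is only a sketch: the seed, the repetition count of 20, and the names reps and timings are arbitrary choices, and the absolute numbers will differ by machine and R version. The table approach is left out simply because it is slow.

```r
set.seed(42)                                   # arbitrary seed, for reproducibility only
z <- sample(c(TRUE, FALSE), 1e6, replace = TRUE)
reps <- 20                                     # arbitrary number of repetitions

# Total elapsed seconds for `reps` evaluations of each approach
timings <- c(
  sum    = system.time(for (i in seq_len(reps)) sum(z))[["elapsed"]],
  which  = system.time(for (i in seq_len(reps)) length(which(z)))[["elapsed"]],
  subset = system.time(for (i in seq_len(reps)) length(z[z == TRUE]))[["elapsed"]]
)

print(sort(timings))                           # fastest approach first
```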
Just to add a note regarding NA values and the which function:
> which(c(T, F, NA, NULL, T, F))
[1] 1 4
> which(!c(T, F, NA, NULL, T, F))
[1] 2 5
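Note that which only returns the indices of elements that are TRUE, so NA is skipped rather than counted. NULL never reaches which at all, because c() drops it, which is why the second TRUE sits at index 4.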
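A short sketch of what happens to each element in that example (the name v is introduced here for illustration). It shows that NULL disappears when the vector is built, and that length(which(v)) matches sum(v, na.rm = TRUE) while plain sum(v) does not.

```r
v <- c(TRUE, FALSE, NA, NULL, TRUE, FALSE)

length(v)              # 5, not 6: c() drops NULL
which(v)               # 1 4: the NA at position 3 is skipped
length(which(v))       # 2
sum(v, na.rm = TRUE)   # 2
sum(v)                 # NA, because of the missing value
```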
which is a good alternative, especially when you operate on matrices (check ?which and notice the arr.ind argument). But I suggest that you stick with sum, because of its na.rm argument, which can handle NAs in a logical vector.
For instance:
# create dummy variable
set.seed(100)
x <- round(runif(100, 0, 1))
x <- x == 1
# create NA's
x[seq(1, length(x), 7)] <- NA
If you type in sum(x) you'll get NA as a result, but if you pass na.rm = TRUE to the sum function, you'll get the result that you want.
> sum(x)
[1] NA
> sum(x, na.rm=TRUE)
[1] 43
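The same ideas carry over to the matrix case mentioned above. In this sketch (the matrix m and the name idx are made up for illustration), sum with na.rm = TRUE counts the TRUE cells, which with arr.ind = TRUE reports where they are, and sum(is.na(m)) counts the missing cells separately.

```r
# Illustrative 2 x 3 logical matrix with one NA (filled column by column)
m <- matrix(c(TRUE, FALSE, NA, TRUE, FALSE, TRUE), nrow = 2)

sum(m, na.rm = TRUE)              # 3 TRUE cells
sum(is.na(m))                     # 1 missing cell

idx <- which(m, arr.ind = TRUE)   # one row per TRUE cell, columns "row" and "col"
print(idx)
nrow(idx)                         # 3, same as the sum above
```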
Is your question strictly theoretical, or do you have some practical problem concerning logical vectors?