For two logical vectors, x and y, of length > 1E8, what is the fastest way to calculate the 2x2 cross tabulations?
I suspect the answer is to w
Here's an answer with Rcpp, tabulating only those entries that are not both 0. I suspect there must be several ways to improve this, as it is unusually slow; it's also my first attempt with Rcpp, so there may be some obvious inefficiencies associated with moving the data around. I wrote an example that is purposefully plain vanilla, which should let others demonstrate how it can be improved.
library(Rcpp)
library(inline)

doCrossTab <- cxxfunction(signature(x = "integer", y = "integer"), body = '
  Rcpp::IntegerVector Vx(x);
  Rcpp::IntegerVector Vy(y);
  // V[0]: x = 1, y = 1;  V[1]: x = 1, y = 0;  V[2]: x = 0, y = 1
  Rcpp::IntegerVector V(3);
  for (int i = 0; i < Vx.length(); i++) {
    if      ( (Vx(i) == 1) & (Vy(i) == 1) ) { V[0]++; }
    else if ( (Vx(i) == 1) & (Vy(i) == 0) ) { V[1]++; }
    else if ( (Vx(i) == 0) & (Vy(i) == 1) ) { V[2]++; }
  }
  return( wrap(V) );
', plugin = "Rcpp")
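As one possible direction for improvement, here is a minimal sketch in plain C++ (outside Rcpp) that replaces the if/else chain with a single indexed increment. The function name crossTab2x2 and the result layout {n00, n01, n10, n11} are my own assumptions for illustration, and it assumes every entry is exactly 0 or 1 with no NA values. I have not timed it, so whether it actually beats the version above is untested.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): counts the four (x, y)
// combinations of two 0/1 vectors in a single pass.
// Result layout: {n00, n01, n10, n11}, where n01 means x == 0 and y == 1.
std::array<long long, 4> crossTab2x2(const std::vector<int>& x,
                                     const std::vector<int>& y) {
    assert(x.size() == y.size());
    std::array<long long, 4> counts = {0, 0, 0, 0};
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Assumes every entry is exactly 0 or 1 (no NA values),
        // so 2 * x + y is always an index in 0..3.
        counts[static_cast<std::size_t>(2 * x[i] + y[i])]++;
    }
    return counts;
}
```

In terms of the Rcpp version above, counts[3] corresponds to V[0], counts[2] to V[1], and counts[1] to V[2]; counts[0] is the both-zero cell that the Rcpp version skips. The counters are long long so they cannot overflow at lengths beyond what a 32-bit int holds.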
Timing results for N = 3E8:
   user  system elapsed
 10.930   1.620  12.586
This takes more than 6X as long as func_find01B in my 2nd answer.