This is a follow-on question from this one: Generating same random variable in Rcpp and R
I\'m trying to speed up a vectorised call to rbinom of this form:
Following Dirk's response here:
Is there a way of fixing the code without using an explicit loop in the C++ code?
I don't think so. The code currently has this hard-wired: <...> so until one of us has sufficient [time] to extend this (and test it) will have to do the loop at your end.
Here's my implementation of a "vectorised" code:
library(Rcpp)
cppFunction("NumericVector cpprbinom(int n, double size, NumericVector prob) {
NumericVector v(n);
for (int i=0; i<n; i++) {v[i] = as<double>(rbinom(1, size, prob[i]));}
return(v); }")
r <- runif(1e6)
all.equal({set.seed(42); rbinom(length(r), 1, r)},
{set.seed(42); cpprbinom(length(r), 1, r)})
#TRUE
But the problem is (again citing Dirk),
And I suggest that before expending a lot of effort on this you check whether you are likely to do better than the R function rbinom. That R function is vectorized in C code and you are unlikely to make things much faster by using Rcpp, unless you want to use the random variates in another C++ function.
And it is actually slower (x3 on my machine), so at least such naive implementation as mine won't help:
library(microbenchmark)
microbenchmark(rbinom(length(r), 1, r), cpprbinom(length(r), 1, r))
Unit: milliseconds
expr min lq mean median uq max neval
rbinom(length(r), 1, r) 55.50856 56.09292 56.49456 56.45297 56.65897 59.42524 100
cpprbinom(length(r), 1, r) 117.63761 153.37599 154.94164 154.29623 155.37247 225.56535 100
EDIT: according to Romain's comment below, here's an advanced version, which is faster!
cppFunction(plugins=c("cpp11"), "NumericVector cpprbinom2(int n, double size, NumericVector prob) {
NumericVector v = no_init(n);
std::transform( prob.begin(), prob.end(), v.begin(), [=](double p){ return R::rbinom(size, p); });
return(v);}")
r <- runif(1e6)
all.equal({set.seed(42); rbinom(length(r), 1, r)},
{set.seed(42); cpprbinom(length(r), 1, r)},
{set.seed(42); cpprbinom2(length(r), 1, r)})
#TRUE
microbenchmark(rbinom(length(r), 1, r), cpprbinom(length(r), 1, r), cpprbinom2(length(r), 1, r))
Unit: milliseconds
expr min lq mean median uq max neval
rbinom(length(r), 1, r) 55.26412 56.00314 56.57814 56.28616 56.59561 60.01861 100
cpprbinom(length(r), 1, r) 113.72513 115.94758 122.81545 117.24708 119.95134 168.47246 100
cpprbinom2(length(r), 1, r) 36.67589 37.12182 38.95318 37.37436 37.97719 84.73516 100
Not a general solution, but I'm noticing that you set the size
argument to 1 in your call to rbinom
. If that's always the case, you can draw length(x)
uniform values and then comparing to x
. For instance:
set.seed(123)
#create the values
x<-runif(1000000)
system.time(res<-rbinom(length(x),1 ,x))
# user system elapsed
#0.068 0.000 0.070
system.time(res2<-as.integer(runif(length(x))<x))
# user system elapsed
#0.044 0.000 0.046
Not a huge gain, but maybe you can save some little time if you call runif
from C++, avoiding some overhead.