Should I prefer Rcpp::NumericVector over std::vector?

忘了有多久 2020-12-08 06:08

Is there any reason why I should prefer Rcpp::NumericVector over std::vector?

For example, the two functions below

         


        
2 Answers
  • 2020-12-08 06:26

    "If unsure, just time it."

    All it takes is to add these few lines to the file you already had:

    /*** R
    library(microbenchmark)
    x <- 1.0* 1:1e7   # make sure it is numeric
    microbenchmark(foo(x), bar(x), times=100L)
    */
    

    Then just calling sourceCpp("...yourfile...") generates the following result (plus warnings on signed/unsigned comparisons):

    R> library(microbenchmark)
    
    R> x <- 1.0* 1:1e7   # make sure it is numeric
    
    R> microbenchmark(foo(x), bar(x), times=100L)
    Unit: milliseconds
       expr     min      lq    mean  median      uq      max neval cld
     foo(x) 31.6496 31.7396 32.3967 31.7806 31.9186  54.3499   100  a 
     bar(x) 50.9229 51.0602 53.5471 51.1811 51.5200 147.4450   100   b
    R> 
    

    Your bar() solution has to copy the R vector into a newly allocated std::vector<double> before it runs. foo() does not: it works directly on the memory R already owns. That matters for large vectors that you run over many times. Here we see a ratio of about 1.6 (by median time).
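
    The cost difference comes down to whether the data is read in place or copied first. A minimal Rcpp-free sketch of that contrast in plain C++, assuming a raw `const double*` as a stand-in for the memory R owns; `sum_in_place` and `sum_via_copy` are hypothetical names for illustration, not part of Rcpp:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Reads the caller's buffer directly, roughly what a function taking a
// NumericVector can do: no allocation, no copy.
double sum_in_place(const double* data, std::size_t n) {
    return std::accumulate(data, data + n, 0.0);
}

// Copies the buffer into a fresh std::vector first, roughly what has to
// happen before a function taking std::vector<double> can run:
// O(n) extra time and memory on every call.
double sum_via_copy(const double* data, std::size_t n) {
    std::vector<double> copy(data, data + n);  // deep copy of all elements
    return std::accumulate(copy.begin(), copy.end(), 0.0);
}
```

    Both functions return the same sum; only the second pays for the allocation and element-wise copy, which is the kind of overhead the benchmark above picks up.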

    In practice, it may not matter, for example if you simply prefer one coding style over the other.

  • 2020-12-08 06:43

    On the claim that the two functions "are equivalent when considering their working and benchmarked performance":

    1. I doubt that the benchmarks are accurate because going from a SEXP to std::vector<double> requires a deep copy from one data structure to another. (And as I was typing this, @DirkEddelbuettel ran a microbenchmark.)
    2. The markup of the Rcpp object (e.g. const Rcpp::NumericVector& x) is just visual sugar. An Rcpp object is a thin wrapper around a pointer to the underlying R data, so a modification can easily ripple through to every object that shares that data (see below). Thus, there is no true equivalent of const std::vector<double>& x, which effectively "locks" the data and "passes a reference".

    Can using std::vector<double> lead to any possible problems when interacting with R?

    In short, no. The only penalty you pay is the cost of copying the data between the two objects.

    What you gain from that copy is isolation. Each std::vector<T> is an independent copy of the data, so modifying one never triggers a domino update in another. With NumericVector, assigning one vector to another makes both refer to the same data. Therefore, the following, which does happen with NumericVector, could not happen with std::vector<T>:

    #include <Rcpp.h>
    using namespace Rcpp;  // needed for the unqualified NumericVector and Rcout
    
    // [[Rcpp::export]]
    void test_copy() {
        NumericVector A = NumericVector::create(1, 2, 3);
        NumericVector B = A;  // shallow copy: B refers to the same data as A
    
        Rcout << "Before: " << std::endl << "A: " << A << std::endl << "B: " << B << std::endl;
    
        A[1] = 5; // 2 -> 5, also visible through B
    
        Rcout << "After: " << std::endl << "A: " << A << std::endl << "B: " << B << std::endl;
    }
    

    Gives:

    test_copy()
    # Before: 
    # A: 1 2 3
    # B: 1 2 3
    # After: 
    # A: 1 5 3
    # B: 1 5 3
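
    The same contrast can be reproduced without Rcpp. A minimal sketch in plain C++, assuming `std::shared_ptr<std::vector<double>>` as a stand-in for the way a NumericVector holds a pointer to its R data; the `Handle` alias and both function names are hypothetical and only serve the illustration. In Rcpp itself, `NumericVector B = Rcpp::clone(A);` gives B its own copy of the data and avoids the shared update.

```cpp
#include <memory>
#include <vector>

// Stand-in for an Rcpp vector: a handle that points at shared storage.
using Handle = std::shared_ptr<std::vector<double>>;

// Copying a handle copies only the pointer, so both names alias one buffer.
double write_through_handle() {
    Handle a = std::make_shared<std::vector<double>>(
        std::vector<double>{1.0, 2.0, 3.0});
    Handle b = a;    // shallow copy, analogous to NumericVector B = A
    (*a)[1] = 5.0;   // the write is visible through b as well
    return (*b)[1];
}

// Copying a std::vector copies the elements, so the two are independent.
double write_through_value() {
    std::vector<double> a{1.0, 2.0, 3.0};
    std::vector<double> b = a;  // deep copy
    a[1] = 5.0;                 // b keeps its original value
    return b[1];
}
```

    Here `write_through_handle()` returns 5, mirroring the output above, while `write_through_value()` returns 2 because the std::vector copy is unaffected.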
    

    Is there any reason why I should prefer Rcpp::NumericVector over std::vector<double>?

    There are a few reasons:

    1. As hinted previously, using Rcpp::NumericVector avoids a deep copy to and from the C++ std::vector<T>.
    2. You gain access to the sugar functions.
    3. You can 'mark up' an Rcpp object in C++ (e.g. by adding attributes via .attr()).