na.locf and inverse.rle in Rcpp

情书的邮戳 2020-12-18 06:19

Is there an existing trick or implementation for na.locf (from the zoo package), rle and inverse.rle in Rcpp?

1 answer
  • 2020-12-18 06:31

    The only thing I'd say is that you are testing for NA twice for each value when you only need to do it once. Testing for NA is not a free operation. Perhaps something like this:

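    #include <Rcpp.h>
    using namespace Rcpp;

    // Carry the last non-NA value forward, testing each element for NA once.
    // [[Rcpp::export]]
    NumericVector naLocf(NumericVector x) {
        int n = x.size();
        if (n == 0) return x;       // empty input: nothing to fill
        double v = x[0];            // last value seen that was not NA
        for (int i = 1; i < n; i++) {
            if (NumericVector::is_na(x[i])) {
                x[i] = v;
            } else {
                v = x[i];
            }
        }

        return x;
    }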
    

    This still does unnecessary work, however: it sets v on every non-NA value, when it only needs to be set once, at the last non-NA value before a run of NAs. We can try something like this:

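    // Scan run by run: skip the non-NA values, set v once per run,
    // then fill the run of NAs that follows.
    // [[Rcpp::export]]
    NumericVector naLocf3(NumericVector x) {
        double *p = x.begin(), *end = x.end();
        if (p == end) return x;     // empty input: nothing to fill
        double v = *p; p++;

        while (p < end) {
            // skip the run of non-NA values
            while (p < end && !NumericVector::is_na(*p)) p++;
            v = *(p - 1);           // last value before the NA run
            while (p < end && NumericVector::is_na(*p)) {
                *p = v;
                p++;
            }
        }

        return x;
    }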
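One thing worth keeping in mind: the function above writes into x directly, and because an Rcpp NumericVector wraps the R vector without copying it, the caller's vector is changed as well. As a rough illustration of the same run-by-run scan that leaves its input alone, here is a plain C++ sketch that can be compiled without R. The name na_locf_copy is hypothetical, and std::isnan is used as a stand-in for NumericVector::is_na.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Rcpp or zoo): returns a filled copy of x
// and leaves the input untouched. std::isnan stands in for
// NumericVector::is_na here.
std::vector<double> na_locf_copy(const std::vector<double>& x) {
    std::vector<double> out(x);           // work on a copy
    if (out.empty()) return out;          // nothing to fill

    const std::size_t n = out.size();
    std::size_t i = 1;
    while (i < n) {
        // Skip the run of non-missing values.
        while (i < n && !std::isnan(out[i])) ++i;
        const double v = out[i - 1];      // last value before the missing run
        // Fill the run of missing values with v.
        while (i < n && std::isnan(out[i])) out[i++] = v;
    }
    return out;
}
```

As in the versions above, leading missing values stay missing because there is nothing to carry forward. In Rcpp itself, calling clone(x) before filling would be one way to get the same copy-first behaviour.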
    

    Now we can run some benchmarks (naLocf1 is presumably the original version from the question, and naLocf2 the first version above):

    x <- rnorm(1e6)
    x[sample(1:1e6, 1000)] <- NA 
    require(microbenchmark)
    microbenchmark( naLocf1(x), naLocf2(x), naLocf3(x) )
    #  Unit: milliseconds
    #       expr      min       lq   median       uq      max neval
    # naLocf1(x) 6.296135 6.323142 6.339132 6.354798 6.749864   100
    # naLocf2(x) 4.097829 4.123418 4.139589 4.151527 4.266292   100
    # naLocf3(x) 3.467858 3.486582 3.507802 3.521673 3.569041   100
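The question also mentions rle and inverse.rle, which the answer above does not cover. As a rough sketch of how the two could look, here is a plain C++ version that can be compiled without R. The struct Rle and the functions rle_encode and rle_decode are hypothetical names, not part of Rcpp or base R. Values are compared with ==, so each NaN starts its own run.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container mirroring the two components of R's rle() result.
struct Rle {
    std::vector<std::size_t> lengths;
    std::vector<double> values;
};

// Run-length encode: one (length, value) pair per run of equal values.
Rle rle_encode(const std::vector<double>& x) {
    Rle r;
    for (double v : x) {
        if (!r.values.empty() && r.values.back() == v) {
            ++r.lengths.back();           // extend the current run
        } else {
            r.values.push_back(v);        // start a new run
            r.lengths.push_back(1);
        }
    }
    return r;
}

// Inverse: repeat each value as many times as its run length.
std::vector<double> rle_decode(const Rle& r) {
    std::vector<double> out;
    for (std::size_t i = 0; i < r.values.size(); ++i) {
        out.insert(out.end(), r.lengths[i], r.values[i]);
    }
    return out;
}
```

Wrapping these for R would mean converting to and from NumericVector and IntegerVector at the boundary. That part is left out here.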
    