Calling 'mypackage' function within public worker

夙愿已清 提交于 2019-12-18 09:26:51

问题


I know the problem I have is a thread-safety issue. As the code I have now will execute with 'seThreadOptions(1)'. My question is what would be a good practice to overcome this.

I know this: Threadsafe function pointer with Rcpp and RcppParallel via std::shared_ptr Will come into play somehow. And I have also been thinking/playing around with making the internal function part of the structure for the parallel worker. Realistically, I am calling two internal functions and I would like one to be variable and the other to be constant, this tends me to think that i will need 2 solutions.

The error is that the R session, in rstudio, crashes. Two things of note here: 1. if I 'setThreadOptions(1)' this runs fine. 2. if I move 'myfunc' into the main cpp file and make the call simply 'myfunc' this also runs fine.

Here is a detailed example:

First cpp file:

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::interfaces(cpp)]]
// [[Rcpp::plugins(cpp11)]]
#include "RcppArmadillo.h"
using namespace arma;
using namespace std;

// [[Rcpp::export]]
double myfunc(arma::vec vec_in){

  int Len = arma::size(vec_in)[0];
  return (vec_in[0] +vec_in[1])/Len;
}

Second,cpp file:

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(ParallelExample)]]

#include "RcppArmadillo.h"
#include "RcppParallel.h"
#include "ParallelExample.h"
#include <random>
#include <memory>
#include <math.h>

using namespace Rcpp;
using namespace arma;
using namespace RcppParallel;
using namespace std;

struct PARALLEL_WORKER : public Worker{

  const arma::vec &input;
  arma::vec &output;

  PARALLEL_WORKER(const arma::vec &input, arma::vec &output) : input(input), output(output) {}

  void operator()(std::size_t begin, std::size_t end){


    std::mt19937 engine(1);

    // Create a loop that runs through a selected section of the total Boot_reps
    for( int k = begin; k < end; k ++){
      engine.seed(k);
      arma::vec index = input;
      std::shuffle( index.begin(), index.end(), engine);

      output[k] = ParallelExample::myfunc(index);
  }
}

};

// [[Rcpp::export]]
arma::vec Parallelfunc(int Len_in){

  arma::vec input = arma::regspace(0, 500);
  arma::vec output(Len_in);

  PARALLEL_WORKER  parallel_woker(input, output);
  parallelFor( 0, Len_in, parallel_woker);
  return output;
}

Makevars, as I am using a macintosh:

CXX_STD = CXX11

PKG_CXXFLAGS +=  -I../inst/include

And Namespace:

exportPattern("^[[:alpha:]]+")
importFrom(Rcpp, evalCpp)
importFrom(RcppParallel,RcppParallelLibs)
useDynLib(ParallelExample, .registration = TRUE)

export(Parallelfunc)

回答1:


When you call ParallelExample::myfunc, you are calling a function defined in inst/include/ParallelExample_RcppExport.h, which uses the R API. This is something one must not do in a parallel context. I see two possibilities:

  1. Convert myfunc to header-only and include it in int/include/ParallelExample.h.
  2. If the second cpp file is within the same package, put a suitable declaration for myfunc into src/first.h, include that file in both src/first.cpp and src/second.cpp, and call myfunc instead of ParallelExample::myfunc. After all, it is not necessary to register a function with R if you only want to call it within the same package. Registring with R is for functions that are called from the outside.



回答2:


In some ways this kinda defeats the purpose of the built in interface cpp feature of Rcpp.

First, cpp saved as 'ExampleInternal.h':

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
#include "RcppArmadillo.h"
using namespace arma;
using namespace std;

namespace ExampleInternal
{

  double myfunc3(arma::vec vec_in){

    int Len = arma::size(vec_in)[0];
    return (vec_in[0] +vec_in[1])/Len;
  }


}

and second:

#include "ParallelExample.h"
#include "ExampleInternal.h"
#include <random>
#include <memory>
#include <math.h>

using namespace Rcpp;
using namespace arma;
using namespace RcppParallel;
using namespace ExampleInternal;
using namespace std;

struct PARALLEL_WORKER : public Worker{

  const arma::vec &input;
  arma::vec &output;

  PARALLEL_WORKER(const arma::vec &input, arma::vec &output) : input(input), output(output) {}

  void operator()(std::size_t begin, std::size_t end){


    std::mt19937 engine(1);

    // Create a loop that runs through a selected section of the total Boot_reps
    for( int k = begin; k < end; k ++){
      engine.seed(k);
      arma::vec index = input;
      std::shuffle( index.begin(), index.end(), engine);

      output[k] = ExampleInternal::myfunc3(index);
  }
}

};

// [[Rcpp::export]]
arma::vec Parallelfunc(int Len_in){

  arma::vec input = arma::regspace(0, 500);
  arma::vec output(Len_in);

  PARALLEL_WORKER  parallel_woker(input, output);
  parallelFor( 0, Len_in, parallel_woker);
  return output;
}


来源:https://stackoverflow.com/questions/51976337/calling-mypackage-function-within-public-worker

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!