Parallelize function taking external pointers (XPtr)

最后都变了- 提交于 2020-01-14 13:55:43

问题


This question is neither a duplicate of this one nor of this one, which were about functions returning external pointers.

Here's the issue. The Rcpp code hereafter defines two functions, one which creates an XPtr, and another one which can work on the XPtr.

#include <Rcpp.h>
using namespace Rcpp;

//[[Rcpp::export]]
SEXP f(int n) {
  std::vector<int> * v = new std::vector<int>;

  for(int i = 0; i < n; i++)
    v->push_back(i);

  XPtr< std::vector<int> > p(v, true);
  return p;
}

//[[Rcpp::export]]
int g(XPtr< std::vector<int> > p, int i) {
  return (*p)[i];

And it works fine:

> x <- f(100)
> g(x, 45)
[1] 45

Let's try to parallelize calls to g. This works:

require(parallel)
test1 <- function(a) {
  cl <- makeForkCluster(nnodes=2)
  r <- parLapply(cl, 1:5, function(i) g(a,i) )
  stopCluster(cl)
  return(r)
}

Expected behaviour:

> unlist( test1(x) )
[1] 1 2 3 4 5

But this doesn't work:

test2 <- function(a) {
  cl <- makeForkCluster(nnodes=2)

  p <- g(a, 0)
  r <- parLapply(cl, 1:5, function(i) g(a,i) )
  stopCluster(cl)
  return(r)
}

Unexpected behaviour:

> test2(x)
Error in checkForRemoteErrors(val) : 
  2 nodes produced errors; first error: external pointer is not valid

The problem seems to arise from the fact that the external pointer is used once in the function before calling the slaves in the cluster. What explains this behavior, and is there a workaround? Many thanks in advance.


回答1:


At the beginning of your function, a is a promise, i.e. something that says evaluate a certain expression in a certain environment. When you access the variable, the expression is evaluated, so now a is an pointer, and that pointer is specific to the particular R instance. You can look at this using pryr::promise_info:

test2 <- function(a) {
  cl <- makeForkCluster(nnodes = 2)
  print(pryr::promise_info(a))
  p <- g(a, 0)
  print(pryr::promise_info(a))
  stopCluster(cl)
  return(r)
}

Output:

$code
x

$env
<environment: R_GlobalEnv>

$evaled
[1] FALSE

$value
NULL

$code
x

$env
NULL

$evaled
[1] TRUE

$value
<pointer: 0x565295e3a410>

One way around is to use eval(substitute(a)):

test2 <- function(a) {
  cl <- makeForkCluster(nnodes = 2)
  print(pryr::promise_info(a))
  p <- g(eval(substitute(a)), 0)
  print(pryr::promise_info(a))
  r <- parLapply(cl, 1:5, function(i) g(a,i) )
  stopCluster(cl)
  return(r)
}

I am sure there are better ways. Non-standard evaluation is still a bit foreign to me ...



来源:https://stackoverflow.com/questions/54535966/parallelize-function-taking-external-pointers-xptr

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!