I want to use R packages from CRAN (such as forecast) with SparkR, and I have run into the following two problems.
Should I pre-install all those packages on every worker node?
A better choice is to ship your local R packages with the spark-submit `--archives` option. That way you do not need to install the packages on each worker, and SparkR::dapply will not spend time installing and compiling them at run time. For example:
Sys.setenv("SPARKR_SUBMIT_ARGS"="--master yarn-client --num-executors 40 --executor-cores 10 --executor-memory 8G --driver-memory 512M --jars /usr/lib/hadoop/lib/hadoop-lzo-0.4.15-cdh5.11.1.jar --files /etc/hive/conf/hive-site.xml --archives /your_R_packages/3.5.zip --files xgboost.model sparkr-shell")
Then, in the function you pass to SparkR::dapply, call .libPaths("./3.5.zip/3.5") first so R can find the shipped packages. Also note that the R version on the worker nodes must match the R version the packages in your zip file were built for.
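To make the two steps concrete, here is a minimal sketch of the dapply side. It assumes the session was started with SPARKR_SUBMIT_ARGS set as above (so `3.5.zip` was shipped via `--archives` and gets extracted into each executor's working directory), and that `df` is an existing SparkDataFrame; it needs a running Spark cluster, so treat it as an outline rather than a drop-in script.

```r
library(SparkR)

# Session must be created AFTER setting SPARKR_SUBMIT_ARGS, otherwise
# the --archives option is not picked up by the launched JVM.
sparkR.session()

result <- dapply(
  df,                                     # assumed existing SparkDataFrame
  function(part) {
    # Point R at the packages shipped via --archives. The zip is extracted
    # on each executor under its archive name, so prepend that path.
    .libPaths(c("./3.5.zip/3.5", .libPaths()))
    library(forecast)                     # now loads from the shipped library

    # part is a local data.frame holding one partition; apply your
    # per-partition logic here and return a data.frame.
    part
  },
  schema(df)                              # output schema; same as input here
)
```

Prepending to `.libPaths()` (rather than replacing it) keeps the workers' base R library visible, which the shipped packages may still depend on.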