Printing from mclapply in R Studio

微笑、不失礼 提交于 2019-12-04 02:00:37

Parallel processing with GUI's is problematic. I write a lot of parallel code and it's constantly crashing my colleague's computer because he insists on using Rstudio instead of console R.

From what I read, RStudio "does not propagate the output of forked processes to the RStudio console. If you are doing this, it is best to start R via a shell."

This makes sense as a workaround for the RStudio people because parallel processing typically breaks GUI's when people try to output to the GUI from a bunch of different processes. It works in the console (albeit often not in order) but parallel processing gurus will pinch their noses when they hear about any I/O from a forked thread.

If you must have output from forked threads, save them in a string and return it. Then collect and output from the main process. Or just use a console for your parallel runs. What I tell my colleague is to do all his debugging and development in RStudio using lapply(), then switch to a console for the real run.

Just expanding a little on the solution used by the asker, i.e. writing to a file to check progress:

write.file = '/temp_output/R_progress'

time1 = proc.time()[3]
outstuff = unlist(mclapply(1:1000000, function(i){
  if (i %% 1000 == 0 ){
    file.create(write.file)
    fileConn<-file(write.file)
    writeLines(paste0(i,'/',nrow(loc),' ',(i/nrow(loc)*100)), fileConn)
    close(fileConn)
  }
  #do your stuff here
}, mc.cores=6))
print(proc.time()[3] - time1)

And then you can monitor from a console with

tail -c +0 -f '/temp_output/R_progress'

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!