Printing from mclapply in RStudio

迷失自我 2021-02-19 06:54

I am using mclapply from within RStudio and would like to have an output to the console from each process but this seems to be suppressed somehow (as mentioned for example here:

3 Answers
  • 2021-02-19 07:05

    Parallel processing with GUIs is problematic. I write a lot of parallel code and it's constantly crashing my colleague's computer because he insists on using RStudio instead of console R.

    From what I read, RStudio "does not propagate the output of forked processes to the RStudio console. If you are doing this, it is best to start R via a shell."

    This makes sense as a workaround for the RStudio people, because parallel processing typically breaks GUIs when people try to write to the GUI from a bunch of different processes. It works in the console (albeit often out of order), but parallel processing gurus will pinch their noses when they hear about any I/O from a forked process.

    If you must have output from forked processes, save it in a string and return it, then collect and print it from the main process. Or just use a console for your parallel runs. What I tell my colleague is to do all his debugging and development in RStudio using lapply(), then switch to a console for the real run.
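
    A minimal sketch of the "return it and print from the main process" idea, assuming a Unix-like system where mclapply can fork. The worker function, its log format, and the value/log field names are made up for illustration:

    ```r
    library(parallel)

    # Hypothetical worker: returns its log text together with the result
    # instead of printing from inside the forked process.
    worker <- function(i) {
      log <- sprintf("pid %d handled item %d", Sys.getpid(), i)
      list(value = i^2, log = log)
    }

    # mclapply forks, so mc.cores > 1 only works on Unix-like systems.
    res <- mclapply(1:4, worker, mc.cores = 2)

    # Back in the main process: print the collected messages in input order.
    for (r in res) cat(r$log, "\n", sep = "")

    # Pull the actual results out of the returned list.
    values <- vapply(res, function(r) r$value, numeric(1))
    ```

    The messages only appear once all workers have finished, so this suits diagnostics better than live progress reporting.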

  • 2021-02-19 07:15

    Here's a workaround which uses shell echo to print to the R console in RStudio:

    #' Function which prints a message using shell echo; useful for printing messages from inside mclapply when running in RStudio
    message_parallel <- function(...){
      system(sprintf('echo "\n%s\n"', paste0(..., collapse="")))
    }
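
    A possible usage sketch, assuming a Unix-like system with echo available in the shell. The function is repeated so the snippet runs on its own; the loop body and message text are placeholders:

    ```r
    library(parallel)

    # Same helper as above, repeated so this snippet is self-contained.
    message_parallel <- function(...){
      system(sprintf('echo "\n%s\n"', paste0(..., collapse="")))
    }

    # Each forked worker reports the item it is handling via shell echo.
    res <- mclapply(1:4, function(i){
      message_parallel("processing item ", i)
      i^2  # placeholder for the real work
    }, mc.cores = 2)
    ```

    Because the text is pasted into a shell command, messages containing double quotes, backticks or $ would need escaping first.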
    
  • 2021-02-19 07:23

    Just expanding a little on the solution used by the asker, i.e. writing to a file to check progress:

    library(parallel)

    write.file = '/temp_output/R_progress'  # the directory must already exist
    n = 1000000                             # total number of iterations

    time1 = proc.time()[3]
    outstuff = unlist(mclapply(1:n, function(i){
      if (i %% 1000 == 0 ){
        # overwrite the progress file with "i/n percent-done"
        fileConn<-file(write.file)
        writeLines(paste0(i,'/',n,' ',(i/n*100)), fileConn)
        close(fileConn)
      }
      #do your stuff here
    }, mc.cores=6))
    print(proc.time()[3] - time1)
    

    And then you can monitor from a console with

    tail -c +0 -f '/temp_output/R_progress'
