In my thesis I need to perform a lot of simulation studies, which all take quite a while. My computer has 4 cores, so I have been wondering if it is possible to run, for example, four of my simulation scripts at the same time, one per core?
All you need to do (assuming you use Unix/Linux) is run an R batch command and put it in the background with &. The operating system's scheduler will then spread the jobs across your CPUs.
At the shell, do:
/your/path/$ nohup R CMD BATCH --no-restore my_model1.R &
/your/path/$ nohup R CMD BATCH --no-restore my_model2.R &
/your/path/$ nohup R CMD BATCH --no-restore my_model3.R &
/your/path/$ nohup R CMD BATCH --no-restore my_model4.R &
Each line starts one batch job, saves the printout in the corresponding .Rout file (e.g. my_model1.Rout), and saves all created R objects in the file .RData. The scheduler will run each job on a different CPU, and the session transcript and output end up in those output files.
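The backgrounding mechanics above can be sketched with plain shell built-ins (here sleep merely stands in for R CMD BATCH; the two jobs run concurrently and wait blocks until both have finished):

```shell
# Two background jobs run at the same time; `wait` blocks until both finish.
sleep 1 &
sleep 1 &
wait
echo "all jobs finished"
```

Because both jobs run concurrently, the whole thing takes about 1 second rather than 2.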
If you are working over the Internet via a terminal (e.g. an SSH session), the nohup command is essential: otherwise, when you log out, the processes will be terminated.
/your/path/$ nohup R CMD BATCH --no-restore my_model1.R &
If you want to give the processes a low priority, do:
/your/path/$ nohup nice -n 19 R CMD BATCH --no-restore my_model.R &
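As a quick sanity check (not part of the batch workflow itself), you can ask ps for the nice value of a throwaway low-priority shell to confirm the priority was applied:

```shell
# Start a shell at niceness 19 and have it print its own nice value (19)
nice -n 19 sh -c 'ps -o nice= -p $$'
```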
You'd do best to include some code at the beginning of each script to load and attach the relevant data file explicitly.
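A minimal sketch of that pattern (the file name sim_inputs.RData is hypothetical): save the shared inputs once, then have each batch script load them explicitly as its first step, so the run does not depend on whatever happens to sit in .RData:

```r
## Save shared simulation inputs once, up front
x <- rnorm(100)
save(x, file = "sim_inputs.RData")

## --- first lines of each my_modelN.R ---
rm(x)                      # pretend we are in a fresh --no-restore session
load("sim_inputs.RData")   # brings x back into the workspace explicitly
stopifnot(exists("x"))
```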
NEVER simply do
/your/path/$ nohup R CMD BATCH my_model1.R &
Without --no-restore, R will slurp in the existing .RData file (including whatever stale objects live there), which seriously compromises reproducibility. That is to say,
--no-restore
or
--vanilla
are your dear friends.
If you have too many models for one machine, I suggest doing the computation on a cloud account, since you can get more CPUs and RAM there. Depending on what you are doing, and on the R package involved, a single model can take hours even on current hardware.
I've learned this the hard way, but there's a nice document here:
http://users.stat.umn.edu/~geyer/parallel/parallel.pdf
HTH.