I am working on a very time intensive analysis using the LQMM package in R. I set the model to start running on Thursday, it is now Monday, and is still running. I am confid
It sounds like you want to use parallel computing to make a single call of the lqmm function execute more quickly. To do that, you have to either split one call of lqmm into multiple function calls, or parallelize a loop inside lqmm.
Some functions can be split into multiple smaller pieces by specifying a smaller iteration value. Examples include parallelizing randomForest over the ntree argument, or parallelizing kmeans over the nstart argument. Another common case is to split the input data into smaller pieces, operate on the pieces in parallel, and then combine the results. That is often done when the input data is a data frame or a matrix.
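As a rough illustration of the "smaller iteration value" idea, here is a minimal sketch that splits the random starts of kmeans across two workers using the base parallel package, then keeps the best fit. The toy data, the worker count, and the names n_workers and starts_per_worker are assumptions made up for this example; nothing here is specific to lqmm.

```r
library(parallel)

set.seed(1)
# Toy data: two well-separated clusters (illustrative only)
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 5), ncol = 2))

n_workers <- 2          # assumed number of workers
starts_per_worker <- 5  # 2 * 5 = 10 random starts in total

cl <- makeCluster(n_workers)
clusterSetRNGStream(cl, iseed = 42)  # independent, reproducible RNG streams

# Each worker runs kmeans with its share of the random starts
fits <- parLapply(cl, seq_len(n_workers), function(i, x, nstart) {
  kmeans(x, centers = 2, nstart = nstart)
}, x = x, nstart = starts_per_worker)
stopCluster(cl)

# Combine: keep the fit with the smallest total within-cluster sum of squares
tot <- vapply(fits, function(f) f$tot.withinss, numeric(1))
best <- fits[[which.min(tot)]]
```

The combine step is what makes this kind of split work: each piece yields a complete result, and a simple rule (here, the lowest tot.withinss) picks or merges them.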
But many times, in order to parallelize a function, you have to modify it. That may actually be easier, because you may not have to figure out how to split up the problem and combine the partial results. You may only need to convert an lapply call into a parallel lapply, or convert a for loop into a foreach loop. However, it's often time-consuming to understand the code. It's also a good idea to profile the code first, so that you parallelize the part that actually dominates the run time and your change really speeds up the function call.
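To show how small the lapply change can be, here is a sketch using parLapply from the base parallel package, with system.time as a crude timing check. The function slow_step is a hypothetical stand-in for one expensive loop iteration; it is not taken from the lqmm source.

```r
library(parallel)

# Hypothetical stand-in for one expensive iteration of an inner loop
slow_step <- function(i) {
  Sys.sleep(0.1)  # pretend to do work
  i^2
}

# Sequential version
seq_time <- system.time(seq_res <- lapply(1:8, slow_step))

# Parallel version: same call shape, with the cluster as first argument
cl <- makeCluster(2)
par_time <- system.time(par_res <- parLapply(cl, 1:8, slow_step))
stopCluster(cl)

# The results should match; compare the elapsed times to see the gain
same <- identical(seq_res, par_res)
print(rbind(sequential = seq_time, parallel = par_time)[, "elapsed"])
```

The timing here excludes cluster start-up. For short loops that overhead, plus the cost of shipping data to the workers, can outweigh the gain, which is why measuring before and after matters.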
I suggest that you download the source distribution of the lqmm package and start reading the code. Try to understand its structure and get an idea of which loops could be executed in parallel. If you're lucky, you might figure out a way to split one call into multiple calls, but otherwise you'll have to rebuild a modified version of the package on your machine.
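One possible way to get at the source from within R is sketched below. The helper name fetch_lqmm_source, the directory name, and the choice of CRAN mirror are assumptions for illustration, and the function needs network access when called.

```r
# Hypothetical helper: fetch and unpack the lqmm source tarball from CRAN.
# Needs network access; the mirror URL and directory name are assumptions.
fetch_lqmm_source <- function(destdir = "lqmm-src",
                              repos = "https://cloud.r-project.org") {
  dir.create(destdir, showWarnings = FALSE)
  # download.packages() returns a matrix: column 2 holds the file path
  got <- download.packages("lqmm", destdir = destdir,
                           repos = repos, type = "source")
  untar(got[1, 2], exdir = destdir)
  file.path(destdir, "lqmm")  # unpacked package directory
}

# Usage (not run here):
# src <- fetch_lqmm_source()
# list.files(file.path(src, "R"))    # R sources to read
# list.files(file.path(src, "src"))  # compiled code, if any
# After editing, reinstall the modified copy:
# install.packages(src, repos = NULL, type = "source")
```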