How to run Matlab computations in parallel

Posted by 空扰寡人 on 2019-12-22 12:19:57

Question


I have a Matlab .m script that sets up and trains a neural network ("nn") using Matlab's Neural Network Toolbox. The script launches a GUI that shows training progress etc. Training the nn usually takes a long time.

I'm doing these experiments on a computer with 64 processor cores. I want to train several networks at the same time without having to run multiple Matlab sessions. So I want to:

  1. Start training of neural network
  2. Modify script that creates network to create different one
  3. Start training of modified network
  4. Modify script to create yet another network...
  5. Repeat steps 1-4 several times

The problem is that when I run the script it blocks the Matlab command window, so I cannot do anything else until the script executes its last command, and that takes a long time. How can I run all those computations in parallel? I do have the Parallel Computing Toolbox.


EDIT: Matlab bug??

Update: This problem seems to happen only on R2012a; it looks like it is fixed in R2012b.

I get a very strange error when I try the command sequence recommended in Edric's answer. Here is my code:

 >> job = batch(c, @nn, 1, {A(:, 1:end -1), A(:, end)});
 >> wait(job);
 >> r = fetchOutputs(job)
 Error using parallel.Job/fetchOutputs (line 677)
 An error occurred during execution of Task with ID 1.

 Caused by:
    Error using nntraintool (line 35)
    Java is not available.

Here are lines 27-37 of nntraintool (part of Matlab's Neural Network Toolbox), where the error originated:

if ~usejava('swing')
  if (nargin == 1) && strcmp(command,'check')
    result = false;
    result2 = false;
    return
  else

    disp('java used');
    error(message('nnet:Java:NotAvailable'));
  end
end 

So it looks like the problem is that the GUI cannot be used (because Swing is not available) when a job is executed using the batch command. The strange thing is that the nn function does not launch any GUI in its current form. The error is caused by train, which launches a GUI by default, but in nn I have switched that off:

net.trainParam.showWindow = false;
net = train(net, X, y);
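For context, a training function of this shape might look roughly like the sketch below. The real nn function is not shown in the question, so everything here is an assumption: the network type (feedforwardnet), the hidden layer size, and the data layout with one sample per row. Per the update above, a setup like this still hit the Java error under batch on R2012a.

```matlab
function net = nn(X, y)
% NN  Hypothetical sketch of a training function with all GUI output disabled.
%   X: N-by-D matrix of inputs, one sample per row (assumed layout)
%   y: N-by-1 vector of targets
    net = feedforwardnet(10);               % hidden layer size 10 is an arbitrary choice
    net.trainParam.showWindow = false;      % do not open the nntraintool window
    net.trainParam.showCommandLine = true;  % print training progress as text instead
    net = train(net, X', y');               % train expects one sample per column
end
```

Printing progress to the command line is optional; it is included here only because the training window is the usual way to watch progress and it is switched off.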

More interestingly, if the same nn function is launched normally (>> nn(A(:, 1:end -1), A(:, end));), it never enters the outer if statement of nntraintool on line 27 (I have checked that using the debugger). So with the same function and the same arguments, the expression ~usejava('swing') evaluates to 0 when the command is launched normally but to 1 when it is launched using batch.
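One way to check this difference directly, without involving the neural network code at all, is to evaluate usejava('swing') once in the interactive session and once on a batch worker. This is a minimal sketch that assumes the same 'local' cluster profile used in the answer; the variable names are illustrative only.

```matlab
% Compare Swing availability in the interactive session and on a batch worker.
c = parcluster('local');                 % the 'local' cluster profile

onClient = usejava('swing');             % evaluated in the interactive session

% usejava is a built-in, so it can be submitted directly as the task function.
job = batch(c, @usejava, 1, {'swing'});  % 1 output argument, 1 input argument
wait(job);                               % block until the job has finished
out = fetchOutputs(job);                 % cell array with one entry per output
onWorker = out{1};
delete(job);                             % remove the finished job from the cluster

fprintf('swing on client: %d, swing on worker: %d\n', onClient, onWorker);
```

If the two values differ, that matches the behaviour described above: the worker has no GUI, so any code path that insists on Swing fails there.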

What do you think about this? It looks like an ugly Matlab or Neural Network Toolbox bug :(((


Answer 1:


With Parallel Computing Toolbox, you can run up to 12 'local workers' to execute your scripts (to run more than that, you'd need to purchase additional MATLAB Distributed Computing Server licences). Given your workflow, the best thing might be to use the BATCH command to submit a series of non-interactive jobs. Note that you will not be able to see any GUI from the workers. You might do something like this (using R2012a+ syntax):

c = parcluster('local'); % get the 'local' cluster object
job = batch(c, 'myNNscript'); % submit script for execution
% now edit 'myNNscript'
job2 = batch(c, 'myNNscript'); % submit script for execution
...
wait(job); load(job) % get the results

Note that the BATCH command automatically attaches a copy of the script to run to the job, so that you are free to make changes to it after submission.
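If the networks differ only in a few parameters, an alternative to editing the script between submissions is to submit a function several times with different arguments. The sketch below assumes a hypothetical function trainOne that takes the hidden layer size as an argument, uses placeholder random data, and assumes the workers can find trainOne (for example, saved as trainOne.m in a folder on their path). None of these names come from the question.

```matlab
% Submit one batch job per network configuration, then collect the results.
c = parcluster('local');
X = rand(200, 4);                 % placeholder inputs, one sample per row
y = sum(X, 2);                    % placeholder targets
hiddenSizes = [5 10 20];          % hypothetical configurations to compare

jobs = cell(1, numel(hiddenSizes));
for k = 1:numel(hiddenSizes)
    % 1 output argument; the inputs are passed as a cell array
    jobs{k} = batch(c, @trainOne, 1, {X, y, hiddenSizes(k)});
end

nets = cell(1, numel(jobs));
for k = 1:numel(jobs)
    wait(jobs{k});                % block until job k has finished
    out = fetchOutputs(jobs{k});  % cell array with one entry per output
    nets{k} = out{1};
    delete(jobs{k});              % clean up the finished job
end

% ---- hypothetical trainOne, ideally saved as its own file trainOne.m ----
function net = trainOne(X, y, hiddenSize)
    net = feedforwardnet(hiddenSize);
    net.trainParam.showWindow = false;  % workers cannot show a GUI
    net = train(net, X', y');           % train expects one sample per column
end
```

All submissions return immediately, so the command window stays free while the jobs run; only the wait calls block.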



Source: https://stackoverflow.com/questions/12948544/how-to-run-matlab-computations-in-parallel
