Question
I have a Matlab .m script that sets up and trains a neural network ("nn") using Matlab's Neural Network Toolbox. The script launches a GUI that shows training progress etc. Training the nn usually takes a long time.
I'm doing these experiments on a computer with 64 processor cores. I want to train several networks at the same time without having to run multiple Matlab sessions. So I want to:
- Start training of neural network
- Modify script that creates network to create different one
- Start training of modified network
- Modify script to create yet another network...
- Repeat steps 1-4 several times
The problem is that when I run the script it blocks the Matlab terminal, so I cannot do anything else until the script executes its last command, and that takes a long time. How can I run all those computations in parallel? I do have the Parallel Computing Toolbox.
EDIT: Matlab bug??
Update: This problem seems to happen only on R2012a; it looks like it is fixed in R2012b.
There is a very strange error when I try the command sequence recommended in Edric's answer. Here is my code:
>> job = batch(c, @nn, 1, {A(:, 1:end -1), A(:, end)});
>> wait(job);
>> r = fetchOutputs(job)
Error using parallel.Job/fetchOutputs (line 677)
An error occurred during execution of Task with ID 1.
Caused by:
Error using nntraintool (line 35)
Java is not available.
Here are lines 27-37 of nntraintool (part of Matlab's Neural Network Toolbox), where the error originated:
if ~usejava('swing')
  if (nargin == 1) && strcmp(command,'check')
    result = false;
    result2 = false;
    return
  else
    disp('java used');
    error(message('nnet:Java:NotAvailable'));
  end
end
So it looks like the problem is that the GUI cannot be used (because Swing is not available) when the job is executed using the batch command. The strange thing is that the nn function does not launch any GUI in its current form. The error is caused by train, which launches the GUI by default, but in nn I have switched that off:
net.trainParam.showWindow = false;
net = train(net, X, y);
More interestingly, if the same nn function is launched normally (>> nn(A(:, 1:end -1), A(:, end));) it never enters the outer if statement of nntraintool on line 27 (I have checked that using the debugger). So with the same function and the same arguments, the expression ~usejava('swing') evaluates to 0 when the command is launched normally but to 1 when it is launched using batch.
What do you think about this? It looks like an ugly Matlab or Neural Network Toolbox bug :(((
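The difference in usejava('swing') can be checked directly, independent of nn. The following is only a minimal sketch, assuming the 'local' cluster profile is available; it runs the same usejava call once in the client session and once on a batch worker and prints both results. The variable names onClient and onWorker are just illustrative.

```matlab
% Sketch: compare Swing availability in the client and on a batch worker.
c = parcluster('local');                  % the 'local' cluster object

onClient = usejava('swing');              % evaluated in the interactive session

job = batch(c, @usejava, 1, {'swing'});   % same call, evaluated on a worker
wait(job);                                % block until the job has finished
out = fetchOutputs(job);                  % cell array, one entry per output
onWorker = out{1};
delete(job);                              % remove the finished job's data

fprintf('usejava(''swing''): client = %d, worker = %d\n', onClient, onWorker);
```

If the worker value differs from the client value, that would be consistent with the check in nntraintool failing only under batch.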
Answer 1:
With Parallel Computing Toolbox, you can run up to 12 'local workers' to execute your scripts (to run more than that, you'd need to purchase additional MATLAB Distributed Computing Server licences). Given your workflow, the best thing might be to use the BATCH command to submit a series of non-interactive jobs. Note that you will not be able to see any GUI from the workers. You might do something like this (using R2012a+ syntax):
c = parcluster('local'); % get the 'local' cluster object
job = batch(c, 'myNNscript'); % submit script for execution
% now edit 'myNNscript'
job2 = batch(c, 'myNNscript'); % submit script for execution
...
wait(job); load(job) % get the results
Note that the BATCH command automatically attaches a copy of the script to run to the job, so that you are free to make changes to it after submission.
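As an alternative to editing the script between submissions, the variations could be passed as arguments to a function, using the function form of batch that appears in the question. The following is a minimal sketch under several assumptions: trainOneNet(hiddenSize, X, y) is a hypothetical function, visible to the workers (for example in the current folder), that builds a network, sets net.trainParam.showWindow = false, trains it and returns it; X, y and the list of hidden-layer sizes are placeholders that would already exist in the workspace.

```matlab
% Sketch only: trainOneNet, X, y and hiddenSizes are hypothetical placeholders.
c = parcluster('local');            % the 'local' cluster object
hiddenSizes = [5 10 20 40];         % one network variant per job (assumed values)

jobs = cell(1, numel(hiddenSizes));
for k = 1:numel(hiddenSizes)
    % batch(cluster, function handle, number of outputs, cell array of inputs)
    jobs{k} = batch(c, @trainOneNet, 1, {hiddenSizes(k), X, y});
end

% The client session stays free while the jobs run; collect results later.
nets = cell(1, numel(jobs));
for k = 1:numel(jobs)
    wait(jobs{k});                  % block until job k has finished
    out = fetchOutputs(jobs{k});    % cell array with one entry per output
    nets{k} = out{1};
    delete(jobs{k});                % remove the finished job's data
end
```

Each batch call returns immediately, so all jobs are queued up front and the number that actually run at once is bounded by the number of local workers.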
Source: https://stackoverflow.com/questions/12948544/how-to-run-matlab-computations-in-parallel