问题
I am using Sungrid6.2u5 ,I am trying to submit some jobs on 4 hosts, I need to run 50 jobs using all the 4 hosts but I want to inform the SGE that I want only 5 jobs to be run on the 4th host at any given time,how do I do that?
I am new to SunGrid.Could any one please point me to the SGE basics,I mean where do I get started? I found this online,
Beginner's Guide to Sun Grid Engine 6.2 by Daniel Templeton
but apparently this is intended for system administrators ,I am just a normal user who is trying to understand the SGE features.
Thanks,
回答1:
If you should not run more than 5 jobs on 4th node (let's call it computer04
), probably, it is not capable of running something more. In general, you are encouraged to specify amount of resources for you job properly to prevent cores overload and out-of-memory situation.
If you have totally 20 Gb on computer04
and your job uses 5 Gb, you can limit all your jobs to 5Gb
memory usage:
qsub -l vmem=5G my_work
The similar holds for disk amount:
qsub -l fsize=10G my_work
I found it is possible to run job on specific host with -l -h=
option.
qsub -l -h=computer04 -l vmem=5G my_work
for 5 jobs. Then use
qsub -l vmem=5G my_work
for other 45 jobs.
(More dirty way) You could do it without memory/disk restrictions:
qsub -l -h=computer04 my_work # 5 jobs
qsub -l -h="!computer04" my_work # for 45 jobs
If you have different queues or resources, and you could use them for different jobs. E.g., you have queue_4
that runs everything on computer04
, and queue_main
that is linked with other computers, then, you do
qsub -q queue_4 my_work
for 5 jobs, and
qsub -q queue_main my_work
for other jobs.
UPD on comment:
It is possible to force SGE denial of more than X
jobs for user/host. It should be done by queue administrator.
qconf -arqs
{
name max_jobs_per_computer04
description "maximal number of jobs for user1 on computer04 restricted to 5!"
enabled TRUE
limit users user1 hosts computer04 to slots=5
}
If you want to restrict your user only in submitting jobs of some kind for computer04
, you need to define complex parameter
as shown here.
来源:https://stackoverflow.com/questions/35827149/how-to-limit-the-number-of-jobs-on-a-host-using-sungrid