Running slurm script with multiple nodes, launch job steps with 1 task

Asked by 孤城傲影 on 2021-02-05 20:20

I am trying to launch a large number of job steps from a single batch script. The steps can be completely different programs, and each needs exactly one CPU. First I tried

1 Answer

    夕颜 (original poster), 2021-02-05 21:11

    Found it! The nomenclature and the many command-line options to Slurm confused me. The solution is given by:

    #!/bin/bash
    # Note: Slurm does not expand shell variables such as $HOME inside
    # #SBATCH directives, so these paths may need to be written out literally.
    #SBATCH -o $HOME/slurm/slurm_out/%j.%N.out
    #SBATCH --error=$HOME/slurm/slurm_out/%j.%N.err_out
    #SBATCH --get-user-env
    #SBATCH -J test
    #SBATCH -D $HOME/slurm
    #SBATCH --export=NONE
    #SBATCH --ntasks=48
    
    NR_PROCS=$SLURM_NTASKS
    for PROC in $(seq 0 $((NR_PROCS - 1)))
    do
        # My call looks like this:
        # srun --exclusive -N1 -n1 bash $PROJECT/call_shells/call_"$PROC".sh &
        srun --exclusive -N1 -n1 hostname &
        pids[$PROC]=$!    # Save the PID of this backgrounded job step
    done
    for pid in "${pids[@]}"
    do
        wait "$pid"    # wait returns the exit status of that step, so a failing step is detected here
    done

    The --exclusive -N1 -n1 options make each srun launch its job step on exactly one node with a single task only.
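    As a quick usage sketch (the file name launch_steps.sh and the job ID below are placeholders I am assuming, not part of the answer), the script is submitted once with sbatch and the individual job steps can then be inspected:

    # Submit the batch script; Slurm allocates the 48 tasks requested above.
    sbatch launch_steps.sh        # prints: Submitted batch job <jobid>
    
    # While it runs, list the job steps created by the backgrounded srun calls.
    squeue -s -j <jobid>
    
    # After completion, check the exit status of every step.
    sacct -j <jobid> --format=JobID,JobName,State,ExitCode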
