How to set Slurm/salloc for 1 GPU per task but let the job use multiple GPUs?
Question

We are looking for advice on Slurm salloc GPU allocations. Currently, given:

    % salloc -n 4 -c 2 --gres=gpu:1
    % srun env | grep CUDA
    CUDA_VISIBLE_DEVICES=0
    CUDA_VISIBLE_DEVICES=0
    CUDA_VISIBLE_DEVICES=0
    CUDA_VISIBLE_DEVICES=0

However, we want more than just device 0 to be used. Is there a way to specify an salloc with srun/mpirun to get the following?

    CUDA_VISIBLE_DEVICES=0
    CUDA_VISIBLE_DEVICES=1
    CUDA_VISIBLE_DEVICES=2
    CUDA_VISIBLE_DEVICES=3

This is desired such that each task gets 1
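One possible workaround, sketched here under assumptions rather than as a confirmed solution, is to request all the GPUs for the job (for example `--gres=gpu:4`) and have a small wrapper narrow `CUDA_VISIBLE_DEVICES` per task using `SLURM_LOCALID`, the node-local task index that srun sets for each task. The helper `pick_gpu` and the `NUM_GPUS` variable are hypothetical names introduced for illustration, and the sketch assumes all four tasks run on one node with four GPUs visible to the job.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: give each task one GPU based on its node-local rank.
# pick_gpu is an illustrative helper, not part of Slurm.
pick_gpu() {
    local local_id="$1"   # node-local task index, e.g. $SLURM_LOCALID
    local num_gpus="$2"   # GPUs allocated on the node (assumed known)
    # Round-robin so the index stays valid if tasks outnumber GPUs.
    echo $(( local_id % num_gpus ))
}

# Only act inside a Slurm job step, where srun sets SLURM_LOCALID.
# NUM_GPUS is an assumed user-supplied variable; 4 matches the question.
if [ -n "${SLURM_LOCALID:-}" ]; then
    export CUDA_VISIBLE_DEVICES="$(pick_gpu "$SLURM_LOCALID" "${NUM_GPUS:-4}")"
    # Replace the wrapper with the real command passed as arguments.
    exec "$@"
fi
```

Saved as a hypothetical `gpu_wrap.sh`, it could be launched as `srun ./gpu_wrap.sh env | grep CUDA` after `salloc -n 4 -c 2 --gres=gpu:4`. Whether each task can actually see all four devices depends on the site's Slurm and cgroup configuration. Newer Slurm releases also offer a `--gpus-per-task` option, which may make a wrapper unnecessary.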