I have access to a computing cluster, specifically one node with two 12-core CPUs, which runs the Slurm Workload Manager.
I would like to run TensorFlow
You can simply pass a batch script to Slurm with the sbatch command, like so:
sbatch --partition=part start.sh
Available partitions can be listed with sinfo.
start.sh (possible configuration):
#!/bin/sh
#SBATCH -N 1 # nodes requested
#SBATCH -n 1 # tasks requested
#SBATCH -c 10 # cores requested
#SBATCH --mem=32000 # memory in MB
#SBATCH -o outfile # send stdout to outfile
#SBATCH -e errfile # send stderr to errfile
python run.py
Here run.py contains the script you want Slurm to execute, i.e. your TensorFlow code.
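Inside run.py it can be handy to know how many cores the job was actually granted. The following is only a sketch: it assumes Slurm exports SLURM_CPUS_PER_TASK (which it does when -c is given, as in start.sh above), and the helper name cores_for_this_task and the fallback to the local CPU count are my own choices for illustration.

```python
import os


def cores_for_this_task(environ=os.environ):
    """Return the number of cores available to this task.

    Reads SLURM_CPUS_PER_TASK, which Slurm sets when -c/--cpus-per-task
    is requested. Outside of Slurm the variable is missing, so we fall
    back to the machine's CPU count (an assumption for local testing).
    """
    value = environ.get("SLURM_CPUS_PER_TASK")
    if value is not None:
        return int(value)
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("cores available to this task:", cores_for_this_task())
```

You could then use that number to size TensorFlow's thread pools or your input pipeline, rather than hard-coding 10 in two places.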
You can look up the details here: https://slurm.schedmd.com/sbatch.html
It's relatively simple.
Under the simplifying assumption that you request one process per host, Slurm provides all the information you need in environment variables, specifically SLURM_PROCID, SLURM_NPROCS and SLURM_NODELIST.
For example, you can initialize your task index, the number of tasks and the nodelist as follows:
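import os
from hostlist import expand_hostlist

# Rank of this process and total number of processes in the job
task_index = int(os.environ['SLURM_PROCID'])
n_tasks = int(os.environ['SLURM_NPROCS'])
# One "host:port" entry per node in the allocation
tf_hostlist = [("%s:22222" % host) for host in
               expand_hostlist(os.environ['SLURM_NODELIST'])]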
Note that Slurm gives you the host list in its compressed format (e.g., "myhost[11-99]"), which you need to expand. I do that with the hostlist module by Kent Engström, available here: https://pypi.python.org/pypi/python-hostlist
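To make the compressed format concrete, here is a rough pure-Python sketch of what the expansion does. It is not the python-hostlist implementation: the function name expand_simple_hostlist is made up, and it only handles a single bracket group per name, so use the real module (or scontrol show hostnames) for anything serious.

```python
import re


def expand_simple_hostlist(hostlist):
    """Expand e.g. 'myhost[11-13]' into ['myhost11', 'myhost12', 'myhost13'].

    Simplified illustration: supports comma-separated names, each with at
    most one [..] group containing numbers and lo-hi ranges.
    """
    hosts = []
    # Split on commas, but keep commas inside [...] with their name.
    for part in re.findall(r"[^,\[]+(?:\[[^\]]*\])?[^,\[]*", hostlist):
        match = re.fullmatch(r"([^\[]*)\[([^\]]*)\](.*)", part)
        if match is None:
            hosts.append(part)  # plain host name, nothing to expand
            continue
        prefix, ranges, suffix = match.groups()
        for item in ranges.split(","):
            lo, _, hi = item.partition("-")
            hi = hi or lo          # a single number is a range of one
            width = len(lo)        # preserve zero padding such as '01'
            for i in range(int(lo), int(hi) + 1):
                hosts.append("%s%0*d%s" % (prefix, width, i, suffix))
    return hosts


if __name__ == "__main__":
    print(expand_simple_hostlist("myhost[11-13]"))
```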
At that point, you can go right ahead and create your TensorFlow cluster specification and server with the information you have available, e.g.:
import tensorflow as tf  # TensorFlow 1.x API; in 2.x the server class is tf.distribute.Server

cluster = tf.train.ClusterSpec({"your_taskname": tf_hostlist})
server = tf.train.Server(cluster.as_cluster_def(),
                         job_name="your_taskname",
                         task_index=task_index)
And you're set! You can now perform TensorFlow node placement on a specific host of your allocation with the usual syntax:
for idx in range(n_tasks):
    with tf.device("/job:your_taskname/task:%d" % idx):
        ...
A flaw in the code above is that all your jobs will instruct TensorFlow to start servers listening on the fixed port 22222. If multiple such jobs happen to be scheduled on the same node, the second one will fail to bind to port 22222.
A better solution is to let Slurm reserve ports for each job. You need to bring your Slurm administrator on board and ask them to configure Slurm so that you can request ports with the --resv-ports option. In practice, this requires asking them to add a line like the following to slurm.conf:
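If you want to see that failure mode in isolation, the following sketch reproduces it with plain sockets on the loopback interface. It has nothing to do with TensorFlow itself; the function name second_bind_fails is made up, and it lets the OS pick a free port instead of using 22222 so it can run anywhere.

```python
import socket


def second_bind_fails(host="127.0.0.1"):
    """Return True if a second socket cannot bind to an already-used port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))   # port 0: let the OS choose a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same port, as a second job would try
        except OSError:
            return True         # "address already in use"
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second bind fails:", second_bind_fails())
```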
MpiParams=ports=15000-19999
Before you bug your slurm admin, check what options are already configured, e.g., with:
scontrol show config | grep MpiParams
If your site already uses an old version of OpenMPI, there's a chance an option like this is already in place.
Then, amend my first snippet of code as follows:
import os
from hostlist import expand_hostlist

task_index = int(os.environ['SLURM_PROCID'])
n_tasks = int(os.environ['SLURM_NPROCS'])
# Use the first port of the range Slurm reserved for this job step
port = int(os.environ['SLURM_STEP_RESV_PORTS'].split('-')[0])
tf_hostlist = [("%s:%s" % (host, port)) for host in
               expand_hostlist(os.environ['SLURM_NODELIST'])]
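If the same script should also run where no ports were reserved (e.g. on your workstation), a small helper with a fallback may be convenient. This is a sketch under assumptions: the name pick_port and the default of 22222 are mine, and I am assuming the variable looks like "15000-15003" (the extra handling of a single port or a comma-separated list is just defensive guessing).

```python
import os


def pick_port(environ=os.environ, default=22222):
    """Return the first port reserved by Slurm, or `default` if none is set.

    Assumes SLURM_STEP_RESV_PORTS looks like '15000-15003'; a bare port
    or a comma-separated list is tolerated as a precaution.
    """
    resv = environ.get("SLURM_STEP_RESV_PORTS")
    if not resv:
        return default
    first_chunk = resv.split(",")[0]      # take the first range, if several
    return int(first_chunk.split("-")[0])  # lower bound of that range


if __name__ == "__main__":
    print("using port", pick_port())
```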
Good luck!