It seems that tf.train.replica_device_setter doesn't allow specifying which GPU the worker should use. What I want to do is something like below:

with tf.device(...):
I didn't check previous versions, but in TensorFlow 1.4/1.5 you can specify devices in replica_device_setter:

replica_device_setter(
    worker_device='/job:worker/task:%d/gpu:%d' % (FLAGS.task_index, i),
    cluster=self.cluster)
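For context, a minimal sketch of how that looks in a between-graph replication setup (the ClusterSpec addresses, task index, and GPU index below are placeholder assumptions):

import tensorflow as tf

# Hypothetical cluster: one parameter server and two workers.
cluster = tf.train.ClusterSpec({
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

task_index = 0  # index of this worker task (assumed)
gpu_index = 1   # GPU on that worker to pin ops to (assumed)

with tf.device(tf.train.replica_device_setter(
        worker_device="/job:worker/task:%d/gpu:%d" % (task_index, gpu_index),
        cluster=cluster)):
    w = tf.get_variable("w", shape=[10, 10])  # variables still go to /job:ps/task:0
    y = tf.matmul(w, w)                       # this op goes to /job:worker/task:0/gpu:1

Variables are still routed to the ps job; only the non-variable ops follow worker_device.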
See tensorflow/python/training/device_setter.py, lines 199-202:

if ps_ops is None:
  # TODO(sherrym): Variables in the LOCAL_VARIABLES collection should not be
  # placed in the parameter server.
  ps_ops = ["Variable", "VariableV2", "VarHandleOp"]
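Only ops whose type appears in ps_ops are routed to the parameter server; everything else follows worker_device. The list can also be passed explicitly — a rough sketch below, with a made-up single-machine cluster:

import tensorflow as tf

cluster = tf.train.ClusterSpec({"ps": ["localhost:2222"],
                                "worker": ["localhost:2223"]})

# Pass the default op types explicitly: variable ops land on the ps device,
# everything else goes to the worker GPU.
setter = tf.train.replica_device_setter(
    worker_device="/job:worker/task:0/gpu:0",
    cluster=cluster,
    ps_ops=["Variable", "VariableV2", "VarHandleOp"])

with tf.device(setter):
    v = tf.get_variable("v", shape=[])  # -> /job:ps/task:0
    out = v * 2.0                       # -> /job:worker/task:0/gpu:0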
Thanks to @Yaroslav Bulatov for the code he provided, but his approach differs from replica_device_setter and may fail in some cases.
If your parameters are not sharded, you could do it with a simplified version of replica_device_setter, like below:
import tensorflow as tf

def assign_to_device(worker=0, gpu=0, ps_device="/job:ps/task:0/cpu:0"):
    """Device function: variables go to ps_device, everything else to the given worker GPU."""
    def _assign(op):
        node_def = op if isinstance(op, tf.NodeDef) else op.node_def
        # Same op types that replica_device_setter pins to the ps by default.
        if node_def.op in ("Variable", "VariableV2", "VarHandleOp"):
            return ps_device
        return "/job:worker/task:%d/gpu:%d" % (worker, gpu)
    return _assign

with tf.device(assign_to_device(worker=1, gpu=2)):
    # This op goes on worker 1, GPU 2; any variables created here go to the ps device.
    my_op = tf.ones(())
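To double-check where things actually land, you can run with device placement logging enabled; the session target below is an assumption and depends on your cluster:

config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
# "grpc://localhost:2223" is a placeholder for one of your worker addresses.
with tf.Session("grpc://localhost:2223", config=config) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(my_op))  # the placement log shows the device assigned to each op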