How to run a TensorFlow Estimator on multiple GPUs with data parallelism

青春惊慌失措 2021-01-31 06:17

I have a standard TensorFlow Estimator with a model and want to run it on multiple GPUs instead of just one. How can this be done using data parallelism?

I searched

5 answers
  • 2021-01-31 06:49

    I think this is all you need.

    Link: https://www.youtube.com/watch?v=bRMGoPqsn20

    More Details: https://www.tensorflow.org/api_docs/python/tf/distribute/Strategy

    Explained: https://medium.com/tensorflow/multi-gpu-training-with-estimators-tf-keras-and-tf-data-ba584c3134db

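    import tensorflow as tf  # TF 1.x only: tf.contrib was removed in TF 2.x
    NUM_GPUS = 8
    # Mirror the model on each GPU and aggregate gradients across the replicas.
    dist_strategy = tf.contrib.distribute.MirroredStrategy(num_gpus=NUM_GPUS)
    config = tf.estimator.RunConfig(train_distribute=dist_strategy)
    # model_fn and model_dir are your own model function and checkpoint directory.
    estimator = tf.estimator.Estimator(model_fn, model_dir, config=config)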
    

    UPDATED

    With TF 2.0 and Keras, you can follow this tutorial: https://www.tensorflow.org/tutorials/distribute/keras

  • 2021-01-31 06:55

    You can find an example using tf.distribute.MirroredStrategy and tf.estimator.train_and_evaluate here.

  • 2021-01-31 06:57

    You can use scope and device for that:

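    # FLAGS and cifar10 are defined in the linked CIFAR-10 example.
    with tf.variable_scope(tf.get_variable_scope()):
      for i in range(FLAGS.num_gpus):  # range replaces Python 2's xrange
        with tf.device('/gpu:%d' % i):
          with tf.name_scope('%s_%d' % (cifar10.TOWER_NAME, i)) as scope:
            ...  # build one model tower (replica) on GPU i here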
    

    Full example there: https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_multi_gpu_train.py

  • 2021-01-31 06:58

    I think tf.contrib.estimator.replicate_model_fn is a cleaner solution. The following is from the tf.contrib.estimator.replicate_model_fn documentation:

    ...
    def model_fn(...):  # See `model_fn` in `Estimator`.
      loss = ...
      optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
      optimizer = tf.contrib.estimator.TowerOptimizer(optimizer)
      if mode == tf.estimator.ModeKeys.TRAIN:
        #  See the section below on `EstimatorSpec.train_op`.
        return EstimatorSpec(mode=mode, loss=loss,
                             train_op=optimizer.minimize(loss))
    
      #  No change for `ModeKeys.EVAL` or `ModeKeys.PREDICT`.
      return EstimatorSpec(...)
    ...
    classifier = tf.estimator.Estimator(
      model_fn=tf.contrib.estimator.replicate_model_fn(model_fn))
    

    What you need to do is wrap the optimizer with tf.contrib.estimator.TowerOptimizer and model_fn() with tf.contrib.estimator.replicate_model_fn(). I followed this description and got a TPU SqueezeNet model working on a machine with 4 GPUs. My modifications here.

  • 2021-01-31 07:14

    The standard example is: https://github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/contrib/learn/python/learn/estimators/estimator.py

    One way to run it data-parallel would be to loop over the available GPU devices, send chunks of your batch to copies of your model (all within your model_fn), and then merge the results.
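    The split-and-merge idea can be sketched without TensorFlow at all. The snippet below is only an illustration in plain Python, not the Estimator API: the names split_batch, shard_gradient and data_parallel_step are hypothetical, the "model" is a single weight w fitted to y = 3x, and the replicas run one after another in a loop, whereas real GPUs would process their shards concurrently. It shows that averaging per-shard gradients, weighted by shard size, gives the same update as a single device working on the whole batch.

```python
# Framework-free sketch of synchronous data parallelism.
# All function names here are illustrative, not part of TensorFlow.

def split_batch(batch, num_replicas):
    """Split a list of examples into num_replicas nearly equal shards."""
    shard_size, remainder = divmod(len(batch), num_replicas)
    shards, start = [], 0
    for i in range(num_replicas):
        end = start + shard_size + (1 if i < remainder else 0)
        shards.append(batch[start:end])
        start = end
    return shards


def shard_gradient(w, shard):
    """Return the summed gradient of (w*x - y)^2 w.r.t. w and the shard size."""
    grad_sum = sum(2.0 * (w * x - y) * x for x, y in shard)
    return grad_sum, len(shard)


def data_parallel_step(w, batch, num_replicas, learning_rate):
    """One SGD step: each replica handles one shard, then results are merged."""
    results = [shard_gradient(w, s) for s in split_batch(batch, num_replicas)]
    total_grad = sum(g for g, _ in results)
    total_count = sum(n for _, n in results)
    # Dividing by the total example count yields the full-batch mean gradient.
    return w - learning_rate * total_grad / total_count


batch = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0, 5.0)]  # targets follow y = 3x
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, batch, num_replicas=2, learning_rate=0.01)
```

    In an actual model_fn, each shard would be placed on its own device (for example with tf.device, as in the tower example in another answer) and the merge step would average the per-tower gradients before applying them.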
