Why does TensorFlow always use GPU 0?

花落未央 2021-02-19 04:07

I ran into a problem when running TensorFlow inference on a multi-GPU setup.

Environment: Python 3.6.4; TensorFlow 1.8.0; CentOS 7.3; 2× NVIDIA Tesla P4

Her

2 Answers
  • 2021-02-19 04:38

    You can use the GPUtil package to find unused GPUs and set the CUDA_VISIBLE_DEVICES environment variable accordingly.

    This lets you run experiments in parallel across all your GPUs, one process per GPU.

    # Import os to set the environment variable CUDA_VISIBLE_DEVICES
    import os
    import tensorflow as tf
    import GPUtil
    
    # Set CUDA_DEVICE_ORDER so the IDs assigned by CUDA match those from nvidia-smi
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    
    # Get the first available GPU
    DEVICE_ID_LIST = GPUtil.getFirstAvailable()
    DEVICE_ID = DEVICE_ID_LIST[0] # grab first element from list
    
    # Set CUDA_VISIBLE_DEVICES to mask out all other GPUs than the first available device id
    os.environ["CUDA_VISIBLE_DEVICES"] = str(DEVICE_ID)
    
    # Since all other GPUs are masked out, the first available GPU will now be identified as GPU:0
    device = '/gpu:0'
    print('Device ID (unmasked): ' + str(DEVICE_ID))
    print('Device ID (masked): ' + str(0))
    
    # Run a minimum working example on the selected GPU
    # Start a session
    with tf.Session() as sess:
        # Select the device
        with tf.device(device):
            # Declare two numbers and add them together in TensorFlow
            a = tf.constant(12)
            b = tf.constant(30)
            result = sess.run(a+b)
            print('a+b=' + str(result))
    

    Reference: https://github.com/anderskm/gputil
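
To illustrate the "parallel experiments" point, here is a minimal sketch that starts one child process per GPU, each with its own CUDA_VISIBLE_DEVICES mask. It uses only the standard library and assumes the GPU indices are already known (0 and 1 for the two cards in the question). The helper name `launch_on_gpu` and the stand-in worker command are made up for illustration; a real worker would be your own inference script.

```python
import os
import subprocess
import sys


def launch_on_gpu(gpu_id, argv):
    """Start argv as a child process that can only see the GPU with index gpu_id."""
    env = dict(os.environ)  # copy, so the parent's environment is left untouched
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"    # keep the numbering in line with nvidia-smi
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # mask out every other GPU
    return subprocess.Popen(argv, env=env, stdout=subprocess.PIPE,
                            universal_newlines=True)


# Stand-in worker (hypothetical): it only reports the mask it received.
# A real worker would be something like [sys.executable, "your_inference_script.py"].
worker = [sys.executable, "-c",
          "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]

# One process per GPU; the indices 0 and 1 are assumed from the question's two-card machine.
procs = [launch_on_gpu(gpu_id, worker) for gpu_id in (0, 1)]
masks = [p.communicate()[0].strip() for p in procs]
print(masks)
```

Inside each child, the single visible card should show up as '/gpu:0', so the TensorFlow code itself does not need to change between workers.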

  • 2021-02-19 04:52

    The device names might be different depending on your setup.

    Execute:

    from tensorflow.python.client import device_lib
    print(device_lib.list_local_devices())
    

    Then try using the device name of your second GPU exactly as it is listed there.
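
As a rough sketch of how the listed names could be picked out programmatically: the entries returned by `device_lib.list_local_devices()` carry a `name` and a `device_type` attribute, so filtering on the latter gives the GPU names in listing order. The `FakeDevice` tuple and its values below are stand-ins invented so the snippet runs without TensorFlow; the actual names on your machine may differ.

```python
from collections import namedtuple


def gpu_device_names(devices):
    """Return the names of all entries whose device_type is "GPU", in listing order."""
    return [d.name for d in devices if d.device_type == "GPU"]


# Stand-in (hypothetical) for the objects returned by device_lib.list_local_devices();
# only the two attributes used above are modelled, and the values are illustrative.
FakeDevice = namedtuple("FakeDevice", ["name", "device_type"])
listed = [
    FakeDevice("/device:CPU:0", "CPU"),
    FakeDevice("/device:GPU:0", "GPU"),
    FakeDevice("/device:GPU:1", "GPU"),
]

names = gpu_device_names(listed)
# Take the second GPU if there is one; otherwise leave it unset.
second_gpu = names[1] if len(names) > 1 else None
print(second_gpu)
```

With TensorFlow installed, passing `device_lib.list_local_devices()` in place of `listed` should yield the real names, and the chosen one can then be handed to `tf.device(...)`.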
