Is there a way of determining how much GPU memory is in use by TensorFlow?

醉梦人生 2020-11-30 04:59

TensorFlow tends to preallocate all of the available memory on its GPUs. For debugging, is there a way of telling how much of that memory is actually in use?

4 Answers
  • 2020-11-30 05:16

    (1) There is some limited support in Timeline for logging memory allocations. Here is an example of its usage:

        from tensorflow.python.client import timeline

        # Request a full trace for this step so memory allocations are recorded.
        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
        run_metadata = tf.RunMetadata()
        summary, _ = sess.run([merged, train_step],
                              feed_dict=feed_dict(True),
                              options=run_options,
                              run_metadata=run_metadata)
        train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
        train_writer.add_summary(summary, i)
        print('Adding run metadata for', i)
        # Build the Chrome trace once, then print it and write it to a file.
        tl = timeline.Timeline(run_metadata.step_stats)
        trace = tl.generate_chrome_trace_format(show_memory=True)
        print(trace)
        with tf.gfile.Open('timeline', 'w') as trace_file:
            trace_file.write(trace)
    

    You can try this code with the MNIST example (mnist with summaries).

    This will generate a trace file named timeline, which you can open with chrome://tracing. Note that this gives only approximate GPU memory usage statistics. It basically simulates a GPU execution but doesn't have access to the full graph metadata. It also can't know how many variables have been assigned to the GPU.
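If you would rather read the memory figures from the trace file in a script than in chrome://tracing, the following is a minimal sketch. It relies only on the general Chrome trace format, where counter events carry "ph": "C" and an "args" dict of series values. Which counters TensorFlow actually emits, and their names, are not confirmed here; the "gpu_bfc" name and the sample trace below are hypothetical, and peak_counters is a helper made up for this example.

```python
import json
from collections import defaultdict


def peak_counters(trace_json):
    """Return the peak value of every counter series in a Chrome trace.

    Counter events are the ones with "ph" == "C"; their "args" dict maps
    a series name to a numeric value at that timestamp.
    """
    trace = json.loads(trace_json)
    # A trace is either {"traceEvents": [...]} or a bare list of events.
    events = trace.get("traceEvents", []) if isinstance(trace, dict) else trace
    peaks = defaultdict(int)
    for event in events:
        if event.get("ph") != "C":
            continue  # skip op slices and metadata events
        for series, value in event.get("args", {}).items():
            key = (event.get("name", ""), series)
            peaks[key] = max(peaks[key], value)
    return dict(peaks)


# Hypothetical trace fragment, hand-written for illustration only.
sample = json.dumps({
    "traceEvents": [
        {"ph": "X", "name": "MatMul", "ts": 1, "dur": 5, "pid": 1, "tid": 0},
        {"ph": "C", "name": "gpu_bfc", "ts": 1, "pid": 2, "args": {"gpu_bfc": 1024}},
        {"ph": "C", "name": "gpu_bfc", "ts": 6, "pid": 2, "args": {"gpu_bfc": 4096}},
        {"ph": "C", "name": "gpu_bfc", "ts": 9, "pid": 2, "args": {"gpu_bfc": 2048}},
    ]
})
print(peak_counters(sample))
```

To use it on a real trace, pass the contents of the timeline file, for example peak_counters(open('timeline').read()), and check what counter names show up before relying on them.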

    (2) For a very coarse measure of GPU memory usage, nvidia-smi will show the total device memory usage at the time you run the command.

    nvprof can show the on-chip shared memory usage and register usage at the CUDA kernel level, but doesn't show the global/device memory usage.

    Here is an example command: nvprof --print-gpu-trace matrixMul

    More details are available here: http://docs.nvidia.com/cuda/profiler-users-guide/#abstract
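The nvidia-smi measurement in (2) can also be taken from a script. This is a rough sketch that uses nvidia-smi's --query-gpu option with CSV output; the function names parse_memory_csv and query_gpu_memory are made up for this example, and the parser assumes every field is a plain integer (some devices may report a field as unavailable, which this sketch does not handle).

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' rows (MiB, one row per GPU) into tuples of ints."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for the used and total memory of every GPU, in MiB."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_memory_csv(out)


# Hand-written sample in the shape of the CSV output: two GPUs.
print(parse_memory_csv("1234, 11178\n0, 11178\n"))
```

On a machine with the NVIDIA driver installed, query_gpu_memory() returns one (used, total) pair per GPU. As with running nvidia-smi by hand, this is the total device memory in use at that moment, including whatever TensorFlow has preallocated.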

  • 2020-11-30 05:20

    There's some code in tensorflow.contrib.memory_stats that will help with this:

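    import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.0
    from tensorflow.contrib.memory_stats.python.ops.memory_stats_ops import BytesInUse

    with tf.device('/device:GPU:0'):  # Replace with the device you are interested in
      bytes_in_use = BytesInUse()  # op that reports the bytes in use on that device
    with tf.Session() as sess:
      print(sess.run(bytes_in_use))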
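The value printed above is a raw byte count, while tools such as nvidia-smi report MiB. A small sketch for putting the two side by side follows; bytes_to_mib and describe_usage are helpers made up for this example, and the numbers are hypothetical.

```python
def bytes_to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024.0 * 1024.0)


def describe_usage(bytes_in_use, reserved_mib):
    """Format the bytes in use next to the memory the process has reserved."""
    used_mib = bytes_to_mib(bytes_in_use)
    share = 100.0 * used_mib / reserved_mib if reserved_mib else 0.0
    return "%.1f MiB in use of %.0f MiB reserved (%.1f%%)" % (
        used_mib, reserved_mib, share)


# Hypothetical numbers: 512 MiB in use inside a 10240 MiB reservation.
print(describe_usage(536870912, 10240))
```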
    
  • 2020-11-30 05:25

    Here's a practical solution that worked well for me:

    Disable GPU memory pre-allocation using TF session configuration:

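    import tensorflow as tf

    # Let the process grow its GPU allocation on demand instead of grabbing it all.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)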
    

    Run nvidia-smi -l (or some other utility) to monitor GPU memory consumption.

    Step through your code with the debugger until you see the unexpected GPU memory consumption.
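If stepping through with a debugger is inconvenient, the same idea can be approximated by sampling memory at named points in the code and looking for large jumps. This is only a sketch: MemoryWatcher is a made-up helper, the 256 MiB threshold is arbitrary, and the reader passed to it is an assumption (in practice it could wrap nvidia-smi or any other probe that returns used MiB). Fake readings are used here so the example runs without a GPU.

```python
class MemoryWatcher:
    """Record GPU memory at named checkpoints and flag large jumps."""

    def __init__(self, read_used_mib, jump_threshold_mib=256):
        self.read_used_mib = read_used_mib  # callable returning used MiB
        self.jump_threshold_mib = jump_threshold_mib
        self.history = []  # list of (label, used_mib)

    def checkpoint(self, label):
        """Sample memory now; return the increase since the last checkpoint."""
        used = self.read_used_mib()
        previous = self.history[-1][1] if self.history else used
        self.history.append((label, used))
        return used - previous

    def jumps(self):
        """Return (label, increase) for checkpoints that exceed the threshold."""
        found = []
        for (_, before), (label, after) in zip(self.history, self.history[1:]):
            if after - before >= self.jump_threshold_mib:
                found.append((label, after - before))
        return found


# Fake readings stand in for a real probe such as a wrapper around nvidia-smi.
readings = iter([300, 320, 1400, 1410])
watcher = MemoryWatcher(lambda: next(readings))
for step in ["session created", "graph built",
             "first train step", "second train step"]:
    watcher.checkpoint(step)
print(watcher.jumps())
```

With allow_growth enabled, a jump reported at a given checkpoint points at the code that ran since the previous one, which narrows down where to set breakpoints.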

  • 2020-11-30 05:41

    The TensorFlow profiler has an improved memory timeline that is based on real GPU memory allocator information: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/profiler#visualize-time-and-memory
