TensorFlow: how to log GPU memory (VRAM) utilization?

闹比i 2020-12-14 02:55

TensorFlow always (pre-)allocates all free memory (VRAM) on my graphics card, which is ok since I want my simulations to run as fast as possible on my workstation.

H

1 answer
  •  醉梦人生
    2020-12-14 03:05

    Update: in TensorFlow 1.x you can query the allocator directly with TensorFlow ops (tf.contrib, where these ops live, was removed in TensorFlow 2.x):

    # maximum across all sessions and .run calls so far
    sess.run(tf.contrib.memory_stats.MaxBytesInUse())
    # current usage
    sess.run(tf.contrib.memory_stats.BytesInUse())
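
    Both ops return plain byte counts, so writing them to a log is mostly a formatting problem. A minimal sketch follows, assuming the two counts have already been fetched with sess.run; the names format_bytes, log_memory and the "gpu_memory" logger are hypothetical and not part of TensorFlow:

```python
import logging

# Hypothetical logger name; configure handlers and levels as needed.
logger = logging.getLogger("gpu_memory")


def format_bytes(num_bytes):
    """Render a byte count in binary units (B, KiB, MiB, GiB)."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit that keeps the value below 1024,
        # or at GiB, the largest unit handled here.
        if value < 1024.0 or unit == "GiB":
            return f"{value:.2f} {unit}"
        value /= 1024.0


def log_memory(current_bytes, peak_bytes, step):
    """Log current and peak allocator usage for one training step."""
    message = (
        f"step {step}: in use {format_bytes(current_bytes)}, "
        f"peak {format_bytes(peak_bytes)}"
    )
    logger.info(message)
    return message
```

    In a training loop this would be called with the results of the two sess.run calls above. It is probably worth creating the BytesInUse and MaxBytesInUse ops once, outside the loop, since each call to the constructor adds a new node to the graph.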
    

    You can also get detailed information about a single session.run call, including every memory allocation made during that call, by inspecting RunMetadata. For example:

    run_metadata = tf.RunMetadata()
    sess.run(c, options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE, output_partition_graphs=True), run_metadata=run_metadata)
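
    Instead of reading the metadata as text, it can also be walked in Python. The sketch below sums requested bytes per node. It assumes the field layout step_stats -> dev_stats -> node_stats -> output -> tensor_description -> allocation_description, which is how I understand the TensorFlow 1.x StepStats proto; the function name requested_bytes_by_node is hypothetical. It only uses attribute access, so it works on any object with that shape:

```python
from collections import defaultdict


def requested_bytes_by_node(run_metadata):
    """Sum the requested bytes of every output tensor, keyed by node name.

    Assumes run_metadata follows the TF 1.x RunMetadata/StepStats layout.
    """
    totals = defaultdict(int)
    for device in run_metadata.step_stats.dev_stats:
        for node in device.node_stats:
            for output in node.output:
                description = output.tensor_description.allocation_description
                totals[node.node_name] += description.requested_bytes
    return dict(totals)
```

    Sorting the returned dictionary by value gives a quick view of which nodes request the most memory in a given run call.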
    

    Here's an end-to-end example: take a column vector and a row vector and add them, which broadcasts to a 13x13 matrix of sums:

    import tensorflow as tf
    
    no_opt = tf.OptimizerOptions(opt_level=tf.OptimizerOptions.L0,
                                 do_common_subexpression_elimination=False,
                                 do_function_inlining=False,
                                 do_constant_folding=False)
    config = tf.ConfigProto(graph_options=tf.GraphOptions(optimizer_options=no_opt),
                            log_device_placement=True, allow_soft_placement=False,
                            device_count={"CPU": 3},
                            inter_op_parallelism_threads=3,
                            intra_op_parallelism_threads=1)
    
    with tf.device("cpu:0"):
        a = tf.ones((13, 1))
    with tf.device("cpu:1"):
        b = tf.ones((1, 13))
    with tf.device("cpu:2"):
        c = a+b
    
    sess = tf.Session(config=config)
    run_metadata = tf.RunMetadata()
    # FULL_TRACE records per-node statistics, including memory allocations
    sess.run(c, options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE, output_partition_graphs=True), run_metadata=run_metadata)
    # dump the collected metadata as text
    with open("/tmp/run2.txt", "w") as out:
        out.write(str(run_metadata))
    

    If you open /tmp/run2.txt you'll see messages like this:

      node_name: "ones"
    
          allocation_description {
            requested_bytes: 52
            allocator_name: "cpu"
            ptr: 4322108320
          }
      ....
    
      node_name: "ones_1"
    
          allocation_description {
            requested_bytes: 52
            allocator_name: "cpu"
            ptr: 4322092992
          }
      ...
      node_name: "add"
          allocation_description {
            requested_bytes: 676
            allocator_name: "cpu"
            ptr: 4492163840
    

    So here you can see that a and b each allocated 52 bytes (13 float32 values * 4 bytes), and the result allocated 676 bytes (13*13*4).
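
    These numbers can be checked by hand. The sketch below reproduces them in plain Python, assuming float32 elements (the default dtype of tf.ones) and equal-rank shapes; broadcast_shape and tensor_bytes are hypothetical helper names, not TensorFlow functions:

```python
from math import prod

FLOAT32_BYTES = 4  # assumption: tf.ones defaults to float32


def broadcast_shape(a, b):
    """Broadcast two equal-rank shapes the way an elementwise a + b would."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles shapes of equal rank")
    result = []
    for x, y in zip(a, b):
        if x != y and 1 not in (x, y):
            raise ValueError(f"incompatible dimensions {x} and {y}")
        # A dimension of size 1 stretches to match the other operand.
        result.append(y if x == 1 else x)
    return tuple(result)


def tensor_bytes(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed for a dense tensor of the given shape."""
    return prod(shape) * itemsize


column = (13, 1)  # shape of a
row = (1, 13)     # shape of b
print(tensor_bytes(column), tensor_bytes(row),
      tensor_bytes(broadcast_shape(column, row)))
```

    The three printed values correspond to the requested_bytes entries for "ones", "ones_1" and "add" in the dump.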
