Why is the value of a `tf.constant()` stored multiple times in memory in TensorFlow?

日久生厌 2020-12-21 01:02

I read that (in TensorFlow):

the value of a tf.constant() is stored multiple times in memory.

Why is the value of a tf.constant() stored multiple times in memory?

2 Answers
  • 2020-12-21 01:18

    Because the data for a constant tensor is embedded into the graph definition. This means the data is stored both in the client, which maintains the graph definition, and in the runtime, which allocates its own memory for all tensors.

    For example, try:

    import tensorflow as tf  # TensorFlow 1.x API
    a = tf.constant([1, 2])
    print(tf.get_default_graph().as_graph_def())
    

    You'll see output that includes:

        dtype: DT_INT32
        tensor_shape {
          dim {
            size: 2
          }
        }
        tensor_content: "\001\000\000\000\002\000\000\000"
      }
    

    The tensor_content field is the raw content, same as np.array([1,2], dtype=np.int32).tobytes().
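
    As a quick check of that claim without NumPy, here is a minimal sketch using Python's standard struct module. It assumes the bytes are little-endian 32-bit integers, which is what the output above suggests; the variable names are only illustrative.

```python
import struct

# Pack the two values from tf.constant([1, 2]) as little-endian 32-bit ints.
values = [1, 2]
raw = struct.pack("<2i", *values)

# The graph dump prints bytes with octal escapes; Python accepts the same escapes.
expected = b"\001\000\000\000\002\000\000\000"

print(raw == expected)  # True
print(len(raw))         # 8 (two 4-byte integers)
```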

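    Now, to see the runtime allocation, you can run with export TF_CPP_MIN_VLOG_LEVEL=1 (note VLOG: setting TF_CPP_MIN_LOG_LEVEL=1 would instead filter out INFO messages such as the one below).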

    If you evaluate anything that uses the tensor a, you'll see something like this:

    2017-02-24 16:13:58: I tensorflow/core/framework/log_memory.cc:35] __LOG_MEMORY__ MemoryLogTensorOutput { step_id: 1 kernel_name: "Const_1/_1" tensor { dtype: DT_INT32 shape { dim { size: 2 } } allocation_description { requested_bytes: 8 allocated_bytes: 256 allocator_name: "cuda_host_bfc" allocation_id: 1 ptr: 8605532160 } } }
    

    This means the runtime requested 8 bytes and TF actually allocated 256 bytes. (The choice of how much memory to actually allocate is somewhat arbitrary at the moment; see bfc_allocator.cc.)
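
    One plausible reading of the 8 vs. 256 figures is that the allocator rounds each request up to a multiple of a fixed minimum chunk size. The sketch below only illustrates that arithmetic: the 256-byte minimum is an assumption taken from the log above, and rounded_bytes is a hypothetical helper, not a TensorFlow function.

```python
# Assumption: requests are rounded up to a multiple of 256 bytes,
# chosen here only because it matches the log line above.
MIN_ALLOCATION_BYTES = 256


def rounded_bytes(requested: int) -> int:
    """Round `requested` up to the next multiple of MIN_ALLOCATION_BYTES."""
    blocks = -(-requested // MIN_ALLOCATION_BYTES)  # ceiling division
    return blocks * MIN_ALLOCATION_BYTES


print(rounded_bytes(8))    # 256
print(rounded_bytes(257))  # 512
```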

    Having constants embedded in the graph makes it easier to do some graph-based optimizations, such as constant folding. But it also means that large constants are inefficient, and using large constants is a common cause of exceeding the 2 GB limit on the size of a graph.
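
    To get a feel for how quickly a constant can reach that limit, here is a rough back-of-the-envelope sketch. It assumes the limit is 2**31 bytes and counts only the raw tensor data; constant_payload_bytes is a hypothetical helper, not part of TensorFlow.

```python
from math import prod

GRAPH_LIMIT_BYTES = 2**31  # assumption: "2 GB" taken as 2 GiB


def constant_payload_bytes(shape, itemsize):
    """Raw bytes a dense constant of this shape would embed in the graph definition."""
    return prod(shape) * itemsize


# A 30000 x 20000 float32 matrix: 600 million elements at 4 bytes each.
size = constant_payload_bytes((30000, 20000), 4)
print(size)                      # 2400000000
print(size > GRAPH_LIMIT_BYTES)  # True
```

    A common workaround in TensorFlow 1.x is to keep such data out of the graph definition, for example by feeding it through a placeholder at run time.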

  • 2020-12-21 01:28

    They are referring to the fact that, when the constant is initialized, one copy of its value is stored as a NumPy array and another copy is stored in TensorFlow. Both copies exist while the constant is being initialized.
