Expected TensorFlow model size from learned variables

余生分开走 2021-01-05 09:54

When training convolutional neural networks for image classification tasks we generally want our algorithm to learn the filters (and biases) that transform a given image to

1 answer
  • 2021-01-05 10:24

    "Adding up all those variables we would expect to get a model.ckpt.data file of size 12.45Mb"

    Typically, most of a model's parameters are in the first fully connected layer, in this case wd1. Computing only its size yields:

    7*7*128 * 1024 * 4 = 25690112
    

    ... or about 25.7 MB (24.5 MiB). Note the factor of 4: the variable has dtype=tf.float32, i.e. 4 bytes per parameter. Other layers also contribute to the model size, but not as drastically.
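
    As a quick check of that arithmetic, here is a minimal sketch in plain Python. The helper name param_bytes is made up for illustration; it only multiplies the shape dimensions by the bytes per parameter (4 for tf.float32).

```python
from math import prod

def param_bytes(shape, bytes_per_param=4):
    """Size in bytes of one variable: product of its dimensions times element size."""
    return prod(shape) * bytes_per_param

# Shape of wd1 as assumed in the calculation: [7*7*128, 1024]
wd1_bytes = param_bytes([7 * 7 * 128, 1024])
print(wd1_bytes)               # 25690112
print(wd1_bytes / 1024 ** 2)   # 24.5 (size in MiB)
```

    Dividing by 1024 ** 2 gives mebibytes, while dividing by 10 ** 6 gives the slightly larger decimal-megabyte figure.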

    As you can see, your estimate of 12.45 MB is a bit off (did you assume 16 bits per parameter?). The checkpoint also stores some general information, hence an overhead of around 25%, which is still large, but not 300%.

    [Update]

    As was clarified, the model in question actually has an FC1 layer of shape [7*7*64, 1024], so the size calculated above should indeed be about 12.8 MB (12.25 MiB), in line with your estimate. That made me look into the saved checkpoint more carefully.

    After inspecting it, I noticed other big variables that I missed originally:

    ...
    Variable_2 (DT_FLOAT) [3136,1024]
    Variable_2/Adam (DT_FLOAT) [3136,1024]
    Variable_2/Adam_1 (DT_FLOAT) [3136,1024]
    ...
    

    Variable_2 is exactly wd1, but there are two more copies of the same shape. These are created by the Adam optimizer: they are called slots and hold the m and v accumulators for every trainable variable. Now the total size makes sense.
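
    To make the effect of the slots concrete, the sketch below splits the three entries from the listing into model weights and optimizer slots. The function name split_bytes and the "/Adam" name test are assumptions for illustration: they match the names shown in this checkpoint, but they ignore Adam's small scalar variables and any checkpoint metadata.

```python
from math import prod

# (name, shape) pairs copied from the checkpoint listing; all are DT_FLOAT (4 bytes)
entries = [
    ("Variable_2", [3136, 1024]),
    ("Variable_2/Adam", [3136, 1024]),
    ("Variable_2/Adam_1", [3136, 1024]),
]

def split_bytes(entries, bytes_per_param=4):
    """Return (model_bytes, optimizer_bytes), treating '/Adam' names as slots."""
    model = optimizer = 0
    for name, shape in entries:
        size = prod(shape) * bytes_per_param
        if "/Adam" in name:
            optimizer += size   # m or v accumulator
        else:
            model += size       # the trainable variable itself
    return model, optimizer

model, optimizer = split_bytes(entries)
print(model / 1024 ** 2, optimizer / 1024 ** 2)   # 12.25 24.5
```

    For this one layer the slots alone take twice as much space as the weights, so the layer occupies about three times its nominal size in the checkpoint.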

    You can run the following code to compute the total size of the graph variables, which comes to 37.47 MB:

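    import numpy as np
    import tensorflow as tf  # TF 1.x graph-mode API
    # Bytes per variable: number of elements times the size of its dtype
    var_sizes = [np.prod(list(map(int, v.shape))) * v.dtype.size
                 for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)]
    print(sum(var_sizes) / (1024 ** 2), 'MB')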
    

    So the overhead is actually pretty small. The extra size is due to the optimizer.
