This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG Caffe model to TensorFlow using the GitHub code from https://github.com
I suggest you take a detailed look at network.py in https://github.com/ethereon/caffe-tensorflow, especially the load() function. It will help you understand what happens when you call net.load(weight_path, session).
FYI, a TensorFlow variable can be given the value of a NumPy array with var.assign(np_array); the resulting op has to be run in the session. Here is the solution to your question:
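# weight_variable, bias_variable, conv2d and im_batch are assumed to be defined as in your code
with tf.Session() as sess:
    # Create the variables, then overwrite them with the converted Caffe parameters
    W_conv1_b = weight_variable([3,3,3,64])
    sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
    b_conv1_b = bias_variable([64])
    sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
    # Build the first conv layer on top of the loaded parameters
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)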
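For a version that runs without the converted model, here is a minimal sketch of the same assignment pattern. It assumes TensorFlow is installed and provides the tensorflow.compat.v1 module (roughly TF 1.14+ or 2.x). The random arrays are hypothetical stand-ins for net.layers['conv1_1'].weights and .biases, and the variable names are made up for illustration.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # TF1-style graphs and sessions, as used above

# Hypothetical stand-ins for the converted Caffe parameters
pretrained_w = np.random.rand(3, 3, 3, 64).astype(np.float32)
pretrained_b = np.random.rand(64).astype(np.float32)

graph = tf.Graph()
with graph.as_default():
    # Variables are created with the same shapes as the arrays they will hold
    w = tf.Variable(tf.zeros(pretrained_w.shape), name="W_conv1")
    b = tf.Variable(tf.zeros(pretrained_b.shape), name="b_conv1")
    # assign() only builds an op; nothing is copied until the op is run
    assign_w = w.assign(pretrained_w)
    assign_b = b.assign(pretrained_b)

with tf.Session(graph=graph) as sess:
    # Run the assign ops in the session that will also run the network
    sess.run([assign_w, assign_b])
    loaded_w, loaded_b = sess.run([w, b])
```

Reading the variables back in the same session shows that they now hold the NumPy values. A new session would start with uninitialized variables again.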
I would like to kindly remind you of the following points:
First, var.assign(data), where 'data' is a NumPy array and 'var' is a TensorFlow variable, should be executed in the same session in which you go on to run your network, whether for inference or for training.
Second, by default 'var' has to have the same shape as 'data', so when the data is available first you can create the variable with that shape (schematically, var=tf.Variable(shape=data.shape)). Otherwise, you need to create 'var' with var=tf.Variable(validate_shape=False), which means the variable's shape is not fixed when it is created. Detailed explanations can be found in TensorFlow's API docs.
I have extended the same repo, caffe-tensorflow, to support Theano so that I can load the converted Caffe model in Theano, so I am reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.
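As a rough sketch of the validate_shape=False case mentioned above: it assumes TensorFlow with the tensorflow.compat.v1 module and TF1-style (non-resource) variables, which tf.disable_v2_behavior() selects. The array 'data' and the placeholder initial value are made up for illustration.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # TF1-style graph mode with classic (ref) variables

# Hypothetical data whose shape is not known when the variable is created
data = np.arange(6, dtype=np.float32).reshape(2, 3)

graph = tf.Graph()
with graph.as_default():
    # validate_shape=False leaves the variable's static shape unknown,
    # so the placeholder initial value does not pin it to shape [1]
    var = tf.Variable(tf.zeros([1]), validate_shape=False)
    # The assign op must skip shape validation as well
    assign_op = tf.assign(var, data, validate_shape=False)

with tf.Session(graph=graph) as sess:
    sess.run(assign_op)
    result = sess.run(var)
```

When the shape is known up front, creating the variable with the matching shape is the simpler option, because later ops can then rely on its static shape.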