Net surgery: How to reshape a convolution layer of a caffemodel file in caffe?

☆樱花仙子☆ 提交于 2019-12-22 09:48:08

问题


I'm trying to reshape the size of a convolution layer of a caffemodel (This is a follow-up question to this question). Although there is a tutorial on how to do net surgery, it only shows how to copy weight parameters from one caffemodel to another of the same size.
Instead I need to add a new channel (all 0) to my convolution filter such that it changes its size from currently (64x3x3x3) to (64x4x3x3).

Say the convolution layer is called 'conv1'. This is what I tried so far:

# Load the original network and extract the fully connected layers' parameters.
net = caffe.Net('../models/train.prototxt', 
                '../models/train.caffemodel', 
                caffe.TRAIN)

Now I can perform this:

net.blobs['conv1'].reshape(64,4,3,3);
net.save('myNewTrainModel.caffemodel');

But the saved model seems not to have changed. I've read that the actual weights of the convolution are stored rather in net.params['conv1'][0].data than in net.blobs but I can't figure out how to reshape the net.params object. Does anyone have an idea?


回答1:


As you well noted, net.blobs does not store the learned parameters/weights, but rather stores the result of applying the filters/activations on the net's input. The learned weights are stored in net.params. (see this for more details).

AFAIK, you cannot directly reshape net.params and add a channel.
What you can do, is have two nets deploy_trained_net_with_3ch.prototxt and deploy_empty_net_with_4ch.prototxt. The two files can be almost identical apart from the input shape definition and the first layer's name.
Then you can load both nets to python and copy the relevant part:

net3ch = caffe.Net('deploy_trained_net_with_3ch.prototxt', 'train.caffemodel', caffe.TEST) 
net4ch = caffe.Net('deploy_empty_net_with_4ch.prototxt', 'train.caffemodel', caffe.TEST) 

since all layer names are identical (apart from conv1) net4ch.params will have the weights of train.caffemodel. As for the first layer, you can now manually copy the relevant part:

net4ch.params['conv1_4ch'][0].data[:,:3,:,:] = net3ch.params['conv1'][0].data[...]

and finally:

net4ch.save('myNewTrainModel.caffemodel')


来源:https://stackoverflow.com/questions/39898291/net-surgery-how-to-reshape-a-convolution-layer-of-a-caffemodel-file-in-caffe

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!