I am trying to understand the basics of caffe, in particular to use with python.
My understanding is that the model definition (say a given neural net architecture) must be defined in a '.prototxt' file.
Let's take a look at one of the examples provided with BVLC/caffe: bvlc_reference_caffenet. You'll notice that there are in fact three '.prototxt' files: train_val.prototxt, deploy.prototxt and solver.prototxt.
The net architecture represented by train_val.prototxt and deploy.prototxt should be mostly similar. There are a few main differences between the two:
Input data: during training one usually uses a predefined set of inputs for training/validation. Therefore, train_val usually contains an explicit input layer, e.g., an "HDF5Data" layer or a "Data" layer. On the other hand, deploy usually does not know in advance what inputs it will get; it only contains a statement:
input: "data"
input_shape {
dim: 10
dim: 3
dim: 227
dim: 227
}
that declares what input the net expects and what its dimensions should be.
Alternatively, one can use an "Input" layer:
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param { shape { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
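For instance, the declared input shape can be inspected and overridden at run time from Python. Here is a minimal pycaffe sketch, assuming the deploy file is saved as deploy.prototxt and its input blob is named "data" as above:

import caffe

# Load the deploy definition only (no weights yet), in TEST mode.
net = caffe.Net('deploy.prototxt', caffe.TEST)

# The four dims above are (batch size, channels, height, width).
# They can be changed at run time, e.g. to feed a single image:
net.blobs['data'].reshape(1, 3, 227, 227)
net.reshape()  # propagate the new shape through the net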
Loss: during training the net must define a loss, which tells the solver how to tune the parameters; in deploy there is no loss and no back-propagation.

In caffe, you supply a train_val.prototxt describing the net, the train/val datasets and the loss. In addition, you supply a solver.prototxt describing the meta parameters for training. The output of the training process is a .caffemodel binary file containing the trained parameters of the net.
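Training can be launched either from the command line (caffe train --solver=solver.prototxt) or from Python. A minimal pycaffe sketch, assuming the solver file is named solver.prototxt:

import caffe

caffe.set_mode_cpu()  # or caffe.set_mode_gpu()

# solver.prototxt points to train_val.prototxt and holds the meta parameters
# (learning rate, number of iterations, snapshot interval, ...).
solver = caffe.get_solver('solver.prototxt')
solver.solve()  # run the optimization; snapshots are written as .caffemodel files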
Once the net is trained, you can use deploy.prototxt together with the .caffemodel parameters to predict outputs for new and unseen inputs.
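With pycaffe, deployment boils down to loading deploy.prototxt together with the trained weights and calling forward(). A minimal sketch, with placeholder file names:

import numpy as np
import caffe

# File names here are placeholders; use your own deploy/weights paths.
net = caffe.Net('deploy.prototxt', 'bvlc_reference_caffenet.caffemodel', caffe.TEST)

# Prepare one input in the (batch, channels, height, width) layout declared
# in deploy.prototxt; a random array stands in for a real, preprocessed image.
net.blobs['data'].reshape(1, 3, 227, 227)
net.blobs['data'].data[...] = np.random.rand(1, 3, 227, 227).astype(np.float32)

# Forward pass only: no loss, no back-propagation.
out = net.forward()
# 'out' maps output blob names (e.g. 'prob') to their computed values.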