How can I use caffe convnet to detect facial expressions?
I have an image dataset, Cohn Kanade, and I want to train a caffe convnet with this dataset. Caffe has a document
Caffe supports multiple formats for the input data (HDF5/lmdb/leveldb). It's just a matter of picking the one you feel most comfortable with. Here are a couple of options:
convert_imageset
convert_imageset is one of the command line tools you get from building caffe. Usage is along the lines of: you write up a text file where each line starts with the filename of an image followed by a scalar label (e.g. 0, 1, 2, ...), then run the tool on that list to produce the db.
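As a minimal sketch, generating such a listing file in python might look like this; the directory layout, label scheme, and file names below are assumptions you would adapt to your own copy of CK+:

import os

# Assumed layout: one sub-folder per expression class under image_root.
image_root = "data/ck_plus/images/"
label_of = {"neutral": 0, "anger": 1, "disgust": 2, "fear": 3,
            "happy": 4, "sadness": 5, "surprise": 6, "contempt": 7}

with open("ck_train.txt", "w") as listing:
    for emotion, label in sorted(label_of.items()):
        folder = os.path.join(image_root, emotion)
        if not os.path.isdir(folder):
            continue
        for fname in sorted(os.listdir(folder)):
            # Each line: <path relative to image_root> <scalar label>
            listing.write("%s/%s %d\n" % (emotion, fname, label))

Afterwards you point convert_imageset at image_root and ck_train.txt to build the db; check convert_imageset --help for the exact flags your caffe build supports (there are options for grayscale loading, shuffling and resizing).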
Caffe's python Datum class
This requires building caffe's python interface. Here you write some python code that:
- reads an image into a numpy array,
- converts the array into a caffe Datum object,
- sets the label of the Datum object (the Datum class has a member called label; you can set it to the AU class from your CK dataset, if that is what you want your network to classify),
- writes the Datum object to the db and moves on to the next image.
A code snippet of converting images to an lmdb this way can be found in a blog post by Gustav Larsson; in his example he constructs an lmdb of images and label pairs for image classification.
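A rough sketch of that flow, using caffe's python interface and the lmdb package, could look like the following (array shapes, file names, and the db path are placeholders to adapt to your data):

import lmdb
import numpy as np
import caffe

# Placeholder data: N grayscale images as an (N, channels, height, width) uint8
# array and one integer label per image; replace with your actual CK+ data.
X = np.zeros((10, 1, 48, 48), dtype=np.uint8)
y = np.zeros(10, dtype=np.int64)

# map_size must be larger than the total number of bytes written to the db.
env = lmdb.open("ck_train_lmdb", map_size=X.nbytes * 10)

with env.begin(write=True) as txn:
    for i in range(X.shape[0]):
        datum = caffe.proto.caffe_pb2.Datum()
        datum.channels, datum.height, datum.width = X[i].shape
        datum.data = X[i].tobytes()   # raw pixel bytes of one image
        datum.label = int(y[i])       # the class you want the network to predict
        txn.put("{:08}".format(i).encode("ascii"), datum.SerializeToString())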
Loading the lmdb into your network:
This is done exactly like in the LeNet example. The Data layer at the beginning of the network prototxt that describes the LeNet model looks like this:
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
The source field is where you point caffe to the location of the lmdb you just created.
Something more related to performance, and not critical to getting this to work, is specifying how to normalize the input features. This is done through the transform_param field. CK+ has fixed size images, so there is no need for resizing. One thing you do need, though, is to normalize the grayscale values. You can do this through mean subtraction. A simple way of doing this is to replace the value of transform_param: scale with the mean value of the gray scale intensities in your CK+ dataset.
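For concreteness, here is a small sketch of computing that mean grayscale intensity over the dataset, assuming the images are stored as .png files under a single root folder (the path is a placeholder):

import os
import numpy as np
from PIL import Image

image_root = "data/ck_plus/images/"   # assumed location of the CK+ frames

total, count = 0.0, 0
for dirpath, _, fnames in os.walk(image_root):
    for fname in fnames:
        if not fname.lower().endswith(".png"):
            continue
        # Load as 8-bit grayscale and accumulate pixel sums.
        img = np.asarray(Image.open(os.path.join(dirpath, fname)).convert("L"),
                         dtype=np.float64)
        total += img.sum()
        count += img.size

print("mean grayscale intensity: %.2f" % (total / count))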