How to use caffe convnet library to detect facial expressions?

How can I use caffe convnet to detect facial expressions?

I have an image dataset, Cohn-Kanade (CK+), and I want to train a Caffe convnet on it. Caffe has documentation, but I am not sure how to use it with this dataset.

1 Answer

    Caffe supports multiple formats for the input data (HDF5/lmdb/leveldb). It's just a matter of picking the one you feel most comfortable with. Here are a couple of options:

    1. caffe/build/tools/convert_imageset:

    convert_imageset is one of the command line tools you get from building caffe.

    Usage is along the lines of:

    • specifying a list of image and label pairs in a text file, one row per pair,
    • specifying where the images are located, and
    • choosing a backend db (i.e. the format); the default is lmdb, which should be fine.

    You need to write up a text file where each line starts with the filename of the image followed by a scalar label (e.g. 0, 1, 2,...)
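
    For illustration, a listing file might look like the following (the filenames, directory layout, and labels here are made up; adapt them to how you organize CK+ and to whatever label scheme you choose):

    neutral/S005_001_00000001.png 0
    happy/S010_006_00000015.png 3
    surprise/S011_002_00000020.png 5

    With such a file, the tool is invoked roughly as:

    ./build/tools/convert_imageset --gray --shuffle --backend=lmdb \
        /path/to/ck_images/ ck_train_list.txt ck_train_lmdb

    --gray stores the images as single-channel grayscale and --shuffle randomizes the order, which is usually what you want for training.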

    2. Construct your lmdb in Python using Caffe's Datum class:

    This requires building caffe's python interface. Here you write some python code that:

    • iterates through a list of images,
    • loads each image into a numpy array,
    • constructs a caffe Datum object,
    • assigns the image data to the Datum object,
    • sets the Datum's label member, e.g. to the AU (Action Unit) class from your CK+ dataset, if that is what you want your network to classify, and
    • writes the Datum object to the db and moves on to the next image.

    A code snippet for converting images to an lmdb this way appears in a blog post by Gustav Larsson; in his example he constructs an lmdb of image and label pairs for image classification.
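
    Below is a minimal sketch in that spirit, assuming Caffe's Python interface and the lmdb package are available. The file list, labels, and db name are placeholders to adapt to your CK+ layout:

    import lmdb
    import numpy as np
    import caffe

    # Placeholder: (image path, integer label) pairs; build this list however
    # you enumerate your CK+ images and their labels.
    image_label_pairs = [('S005_001_00000001.png', 0),
                         ('S010_006_00000015.png', 3)]

    env = lmdb.open('ck_train_lmdb', map_size=int(1e12))  # generous upper bound on db size
    with env.begin(write=True) as txn:
        for idx, (fname, label) in enumerate(image_label_pairs):
            # Load as grayscale; caffe.io returns an HxWx1 float array in [0, 1].
            img = caffe.io.load_image(fname, color=False)
            img = (255 * img).astype(np.uint8)   # back to 8-bit intensities
            img = img.transpose(2, 0, 1)         # to channels x height x width
            # array_to_datum fills channels/height/width/data/label for us.
            datum = caffe.io.array_to_datum(img, label)
            # Zero-padded keys keep the entries in insertion order.
            txn.put('{:08d}'.format(idx).encode('ascii'), datum.SerializeToString())
    env.close()

    Either this script or convert_imageset above gives you an lmdb the Data layer can read; pick whichever makes it easier to attach the CK+ labels.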

    Loading the lmdb into your network:

    This is done exactly as in the LeNet example: the Data layer below sits at the beginning of the network prototxt that describes the LeNet model.

    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }
    

    The source field is where you point caffe to the location of the lmdb you just created.

    Something more related to performance, and not critical to getting this to work, is specifying how to normalize the input features. This is done through the transform_param field. CK+ has fixed-size images, so there is no need for resizing. One thing you do need, though, is to normalize the grayscale values, and a simple way of doing this is mean subtraction. In Caffe that is expressed with the mean_value (or mean_file) field of transform_param rather than with scale: set mean_value to the mean grayscale intensity of your CK+ dataset, and keep scale for mapping the 8-bit range down to unit scale.
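
    As a sketch, the data layer's transform_param could then look something like this (the value 128 is a placeholder; compute the actual mean over your CK+ training images, or build a mean image with caffe/build/tools/compute_image_mean and point mean_file at it):

    transform_param {
      mean_value: 128      # placeholder: mean grayscale intensity of the training set
      scale: 0.00390625    # 1/256: maps the mean-subtracted 8-bit values to roughly [-0.5, 0.5]
    }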
