I want to use an SVM classifier for facial expression detection. I know OpenCV has an SVM API, but I have no clue what the input to train the classifier should be.
The machine learning algorithms in OpenCV all come with a similar interface. To train one, you pass an NxM Mat of features (N rows, one feature vector per row, each of length M) and an Nx1 Mat with the class labels, like this:
// traindata (one feature per row)    // trainlabels
[ feature ]                              1
[ feature ]                             -1
[ feature ]                              1
[ feature ]                              1
[ feature ]                             -1
For the prediction, you fill a Mat with a single row in the same way, and predict() will return the predicted label.
So, let's say your 16 facial points per face are stored in a vector<Point>; you would do something like:
Mat trainData; // start empty
Mat labels;
for all facial_point_vecs: // pseudocode: one vector<Point> (16 points) per training face
{
    for( size_t i=0; i<16; i++ )
    {
        trainData.push_back(point[i]); // appends one Point (x,y) as a new row
    }
    labels.push_back(label); // 1 or -1 for this face
}
// now here comes the magic:
// reshape it, so it has N rows, each a flat float array: x,y,x,y,x,y... (numpoints*2 = 32 elements)
trainData = trainData.reshape(1, labels.rows); // 1 channel, 32 cols per row
// we have to convert to float:
trainData.convertTo(trainData, CV_32F);

SVM svm; // params omitted for simplicity (but that's where the *real* work starts..)
svm.train( trainData, labels );

// later, predict:
vector<Point> points; // the 16 points of the face to classify
Mat testData = Mat(points).reshape(1, 1); // flattened to a single row of 32 elements
testData.convertTo(testData, CV_32F);
float p = svm.predict( testData );
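For reference, a minimal parameter setup with the 2.4.x C++ API might look like this. This is only a sketch: SVM and SVMParams are the cv:: typedefs of CvSVM and CvSVMParams, and the linear kernel with C=1 is a placeholder; tuning these is where the real work mentioned above comes in.

#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>
using namespace cv;

SVMParams params;                     // cv::SVMParams, a typedef of CvSVMParams in 2.4.x
params.svm_type    = SVM::C_SVC;      // classification with penalty parameter C
params.kernel_type = SVM::LINEAR;     // start simple; RBF is a common next step
params.C           = 1.0;             // placeholder; tune via cross-validation (or train_auto)
params.term_crit   = cvTermCriteria( CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 1000, 1e-6 );

SVM svm;
svm.train( trainData, labels, Mat(), Mat(), params );
float p = svm.predict( testData );    // -1 or 1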
Facial gesture recognition is a widely researched problem, and the appropriate features to use are best found through a thorough study of the existing literature. Once you have a feature descriptor you believe to be good, you train the SVM with it. Once the SVM is trained with optimal parameters (found through cross-validation), you test the model on unseen data and report the accuracy. That, in general, is the pipeline.
Now the part about SVMs:
SVM is a binary classifier: it can differentiate between two classes (though it can be extended to multiple classes as well). OpenCV has a built-in SVM module in its ML library. The SVM class has two functions to begin with: train(..) and predict(..). To train the classifier, you give as input a large number of sample feature descriptors along with their class labels (usually -1 and +1). Remember the format OpenCV expects: every training sample has to be a row vector, and each row has one corresponding class label in the labels vector. So if you have a descriptor of length n and you have m such sample descriptors, your training matrix would be m x n (m rows, each of length n), and the labels vector would be of length m. There is also an SVMParams object that holds properties like the SVM type and values for parameters like C that you'll have to specify.
Once trained, you extract features from an image, convert them into a single-row Mat, and give that to predict(), which will tell you which class the sample belongs to (+1 or -1).
There's also a train_auto() that takes similar arguments in a similar format and finds the optimum values of the SVM parameters for you via cross-validation.
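For example, with the 2.4.x C++ API (CvSVM) the call could look roughly like this. This is a sketch: trainingMat and labelsMat stand for the m x n training matrix and the labels vector described above, and the RBF kernel is just one reasonable choice.

// same includes as above: opencv2/core/core.hpp and opencv2/ml/ml.hpp
CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;      // C and gamma will be searched over a grid

CvSVM svm;
// 10-fold cross-validation over the default parameter grids for C, gamma, etc.
svm.train_auto( trainingMat, labelsMat, Mat(), Mat(), params, 10 );
CvSVMParams best = svm.get_params();  // the parameters train_auto settled on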
Also check this detailed SO answer to see an example.
EDIT: Assuming you have a Feature Descriptor that returns a vector of features, the algorithm would be something like:
Mat trainingMat, labelsMat;
for each image in the training database:
{
    feature = extractFeatures( image[i] );
    Mat feature_row = alignAsRow( feature ); // a 1 x n row vector
    trainingMat.push_back( feature_row );
    labelsMat.push_back( -1 or +1 );         // depending upon the class
}
mySvmObject.train( trainingMat, labelsMat, Mat(), Mat(), mySvmParams );
I don't presume that extractFeatures() and alignAsRow() are existing functions; you will need to write them yourself.
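For instance, if your extractFeatures() returns a std::vector<float>, alignAsRow() can be little more than a reshape. A sketch under that assumption (still the 2.4.x API, same includes as above; testImage stands in for whatever image you are classifying):

// hypothetical helper: turn a std::vector<float> descriptor into a 1 x n CV_32F row
Mat alignAsRow( const std::vector<float>& feature )
{
    // Mat(feature) wraps the vector as an n x 1 column; clone() copies the data so the
    // row outlives the local vector, and reshape(1, 1) flattens it into a single row
    return Mat(feature).clone().reshape(1, 1);
}

// prediction, mirroring the training loop above:
// Mat query = alignAsRow( extractFeatures( testImage ) );
// float response = mySvmObject.predict( query );   // returns -1 or +1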