I want to perform multi-class classification using the svm function of the e1071 package. But from what I came to know from the documentation of svm
The iris dataset contains three class labels: "Iris setosa", "Iris virginica" and "Iris versicolor". To employ a balanced one-against-one classification strategy with svm, you could train three binary classifiers:
The first classifier's training set only contains the "Iris setosa" and "Iris virginica" instances. The second classifier's training set only contains the "Iris setosa" and the "Iris versicolor" instances. The third classifier's training set--I guess by now you'll know already--contains only the "Iris virginica" and the "Iris versicolor" instances.
To classify an unknown instance, you apply all three classifiers. A simple voting strategy then selects the most frequently assigned class label; a more sophisticated one may also consider the SVM confidence scores for each assigned class label.
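The manual procedure described above can be sketched roughly as follows. This is only an illustration, not how e1071 implements it internally: the names class_pairs, binary_models, votes and ovo_predict are made up for this example, the default svm settings are assumed, and the tie-break (first label in alphabetical order) is an arbitrary choice.

```r
library(e1071)
data(iris)

features <- iris[, 1:4]

# All unordered pairs of class labels: 3 pairs for 3 classes.
class_pairs <- combn(levels(iris$Species), 2, simplify = FALSE)

# Train one binary SVM per pair, using only the rows of those two classes.
binary_models <- lapply(class_pairs, function(pair) {
  pair_rows <- droplevels(iris[iris$Species %in% pair, ])
  svm(Species ~ ., data = pair_rows)
})

# Each binary model casts one vote (a class label) per instance.
votes <- sapply(binary_models, function(m) {
  as.character(predict(m, newdata = features))
})

# Majority vote per row. which.max() returns the first maximum, so a
# three-way tie goes to the alphabetically first label (arbitrary choice).
ovo_predict <- apply(votes, 1, function(v) names(which.max(table(v))))

# Compare the voted labels with the true labels.
table(predicted = ovo_predict, actual = iris$Species)
```

A vote count ignores how confident each binary classifier is; using the decision values instead of hard labels would be the more sophisticated variant mentioned above.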
Edit (this principle works out of the box with svm):
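# install.packages( 'e1071' )
library( 'e1071' )
data( iris )
# Use the bare column name: with iris$Species on the left, "." would also pull Species in as a predictor
model <- svm( Species ~ ., data=iris )
# Predicted class labels, here for the training data itself
res <- predict( model, newdata=iris )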
The R documentation for svm says: "For multiclass-classification with k levels, k>2, libsvm uses the ‘one-against-one’-approach, in which k(k-1)/2 binary classifiers are trained; the appropriate class is found by a voting scheme."
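One way to see the k(k-1)/2 pairwise classifiers is to ask predict for the decision values. A minimal sketch, assuming the default svm settings; the variable names k, pred and dv are only for illustration:

```r
library(e1071)
data(iris)

k <- nlevels(iris$Species)  # number of classes

model <- svm(Species ~ ., data = iris)

# decision.values = TRUE attaches the raw output of every pairwise
# classifier to the prediction as an attribute.
pred <- predict(model, newdata = iris[, 1:4], decision.values = TRUE)
dv <- attr(pred, "decision.values")

# One column per pair of classes, so k * (k - 1) / 2 columns in total.
ncol(dv)
colnames(dv)
```

For iris, k is 3, so the matrix has one column for each of the three class pairs and one row per instance.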