Stanford classifier cross validation averaged or aggregate metrics

Submitted by 天大地大妈咪最大 on 2020-01-23 17:21:12

Question


With Stanford Classifier, cross validation can be enabled by setting options in the properties file. For example, for 10-fold cross validation:

crossValidationFolds=10
printCrossValidationDecisions=true
shuffleTrainingData=true
shuffleSeed=1

Running this prints, for each fold, metrics such as precision, recall, accuracy/micro-averaged F1, and macro-averaged F1.
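To make these per-fold metrics concrete, here is a minimal Python sketch of how they follow from per-class confusion counts. It is not Stanford Classifier code. The function names `per_class_metrics` and `fold_summary` are hypothetical, and the sketch assumes single-label classification, where accuracy and micro-averaged F1 coincide. The example counts are the TP/FN/FP values from the fold output quoted in the answer.

```python
def per_class_metrics(tp, fn, fp):
    """Return (precision, recall, F1) for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def fold_summary(counts):
    """Summarize one fold; counts maps class label -> (TP, FN, FP).

    Assumes each test example has exactly one gold label, so the number
    of test examples is the sum of TP + FN over all classes.
    """
    total_correct = sum(tp for tp, _, _ in counts.values())
    total_examples = sum(tp + fn for tp, fn, _ in counts.values())
    accuracy = total_correct / total_examples  # equals micro-averaged F1 here
    # Macro-averaged F1 is the unweighted mean of the per-class F1 scores.
    macro_f1 = sum(per_class_metrics(*c)[2] for c in counts.values()) / len(counts)
    return accuracy, macro_f1


# Counts for "Cls 2" and "Cls 1" from the quoted fold output.
accuracy, macro_f1 = fold_summary({"2": (109, 6, 7), "1": (59, 7, 6)})
print(f"{accuracy:.5f} {macro_f1:.5f}")  # prints "0.92818 0.92224"
```

These values agree with the Accuracy/micro-averaged F1 and Macro-averaged F1 lines reported for that fold.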

Is there an option to also output an average, or some other aggregate, of all 10 accuracy/micro-averaged F1 scores or all 10 macro-averaged F1 scores?

In Weka, the default output after 10-fold cross validation includes metrics averaged over all folds. Is such an option also available in Stanford Classifier? Having a final precision, recall, or F1 score and tuning the parameters against it, as in Weka, is very useful, and I would like to do the same with Stanford Classifier. How can I do this?


Answer 1:


When I run with 10 folds, the averaged metrics are already part of the output. When I run this command:

java -cp "*" edu.stanford.nlp.classify.ColumnDataClassifier -prop examples/cheese2007.prop -crossValidationFolds 10

I see this at the end of the output (after ### Fold 9):

[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - 181 examples in test set
[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - Cls 2: TP=109 FN=6 FP=7 TN=59; Acc 0.928 P 0.940 R 0.948 F1 0.944
[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - Cls 1: TP=59 FN=7 FP=6 TN=109; Acc 0.928 P 0.908 R 0.894 F1 0.901
[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - Accuracy/micro-averaged F1: 0.92818
[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - Macro-averaged F1: 0.92224 
[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - Average accuracy/micro-averaged F1: 0.93429
[main] INFO edu.stanford.nlp.classify.ColumnDataClassifier - Average macro-averaged F1: 0.92247
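If you want to aggregate the per-fold scores yourself, for example to compare parameter settings in a script, the following is a minimal sketch that averages them from a captured log. It assumes the log has been saved as text and that per-fold lines have exactly the format shown above. The function name `average_fold_metrics` is hypothetical and not part of Stanford Classifier.

```python
import re
from statistics import mean

# Matches only the per-fold summary lines. The final "Average ..." lines
# begin with a different word after the "- " separator, so they are skipped.
FOLD_METRIC = re.compile(
    r"- (Accuracy/micro-averaged F1|Macro-averaged F1): ([0-9.]+)"
)


def average_fold_metrics(log_lines):
    """Return {metric name: mean of its per-fold values} for the given log lines."""
    scores = {}
    for line in log_lines:
        match = FOLD_METRIC.search(line)
        if match:
            scores.setdefault(match.group(1), []).append(float(match.group(2)))
    return {name: mean(values) for name, values in scores.items()}
```

Called as `average_fold_metrics(open(path))` on a saved log, where `path` is whatever file you redirected the output to, this returns the unweighted mean of each metric over the folds found in the log. Because it reads the rounded values printed per fold, the result can differ slightly in the last digit from the "Average" lines printed by the classifier.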


Source: https://stackoverflow.com/questions/36361348/stanford-classifier-cross-validation-averaged-or-aggregate-metrics
