SciKit One-class SVM classifier training time increases exponentially with size of training data

野趣味 2021-01-14 12:52

I am using the Python SciKit OneClass SVM classifier to detect outliers in lines of text. The text is converted to numerical features first using bag of words and TF-IDF.
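For reference, the setup described above might look roughly like the following sketch. The toy `lines` list and the `nu` and `gamma` values are assumptions for illustration; they are not taken from the question.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import OneClassSVM

# Hypothetical training lines; the real data set would be far larger.
lines = [
    "user logged in from office",
    "user logged out from office",
    "user logged in from home",
    "user logged out from home",
]

# TfidfVectorizer combines bag-of-words counting with TF-IDF weighting.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(lines)  # sparse matrix, one row per line

# nu and gamma are illustrative values, not tuned settings.
model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale")
model.fit(X)

# predict returns +1 for inliers and -1 for outliers.
new_lines = ["user logged in from office", "kernel panic stack trace dump"]
predictions = model.predict(vectorizer.transform(new_lines))
print(predictions)
```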

2 Answers
  •  孤城傲影
    2021-01-14 13:20

    scikit-learn's SVM is a high-level wrapper around libsvm, so there is only so much you can tune. On speed, their website notes: "SVMs do not directly provide probability estimates, these are calculated using an expensive five-fold cross-validation."

    You can increase the kernel cache size (the cache_size parameter, in MB) based on your available RAM, but the increase does not help much.
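A minimal sketch of raising the kernel cache, assuming a synthetic feature matrix in place of the real TF-IDF data. `cache_size` is an existing `OneClassSVM` parameter (default 200 MB); the value 1000 is only an example and should be chosen to fit the available RAM.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM

# Synthetic stand-in for the real feature matrix (an assumption).
X, _ = make_blobs(n_samples=500, n_features=20, centers=1, random_state=0)

# cache_size is the kernel cache in MB; 1000 is an illustrative value.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1, cache_size=1000)
model.fit(X)

print(model.get_params()["cache_size"])
```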

    You can try changing your kernel, though a different kernel may fit your data less well.

    Here is some advice from http://scikit-learn.org/stable/modules/svm.html#tips-on-practical-use: Scale your data.
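One way to apply the scaling advice to sparse text features is sketched below. Note that `TfidfVectorizer` already L2-normalizes each row by default; `MaxAbsScaler` is used here as an assumption because it rescales each column without destroying sparsity. The toy `lines` and the `nu` value are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MaxAbsScaler
from sklearn.svm import OneClassSVM

# Hypothetical training lines.
lines = [
    "user logged in from office",
    "user logged out from office",
    "user logged in from home",
    "user logged out from home",
]

# Vectorize, scale each feature to a maximum absolute value of 1, then fit.
pipeline = make_pipeline(
    TfidfVectorizer(),
    MaxAbsScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.1),
)
pipeline.fit(lines)

# Inspect the scaled features by running every step except the SVM.
scaled = pipeline[:-1].transform(lines)
predictions = pipeline.predict(lines)
```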

    Otherwise, skip scikit-learn and implement the outlier detector yourself, for example with a neural network.
