SciKit One-class SVM classifier training time increases exponentially with size of training data

野趣味 2021-01-14 12:52

I am using the Python scikit-learn OneClassSVM classifier to detect outliers in lines of text. The text is first converted to numerical features using bag of words and TF-IDF.
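For reference, a minimal sketch of the setup described, not the actual code from the post. The sample lines and the `nu` and `gamma` values are assumptions chosen only for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import OneClassSVM

# Hypothetical sample lines; the real data is not shown in the post.
lines = [
    "user login succeeded",
    "user logout succeeded",
    "user password changed",
    "user login failed",
    "kernel panic at address 0xdeadbeef",
]

# TfidfVectorizer does the bag-of-words counting and the TF-IDF weighting in one step.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(lines)  # sparse matrix, one row per line of text

# nu and gamma are illustrative values, not tuned for any real dataset.
model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale")
model.fit(X)

# predict returns +1 for inliers and -1 for outliers.
labels = model.predict(X)
```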

2 answers

南笙 (original poster)

2021-01-14 13:37

Hope I'm not too late. One-class SVM, like SVMs in general, is resource-hungry, and training time grows quadratically with the number of training samples rather than exponentially (the numbers you show follow this). If you can, see whether Isolation Forest or Local Outlier Factor works for you. If you are considering a larger dataset, though, I would suggest building your own anomaly detection (AD) model along the lines of these off-the-shelf solutions. That way you should be able to run it either in parallel or with threads.
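A minimal sketch of trying the two alternatives on TF-IDF features, assuming scikit-learn is the library in use. The sample lines and the parameter values (`n_estimators`, `n_neighbors`) are made up for illustration. `n_jobs=-1` asks scikit-learn to use all available cores, which is one way to get the parallelism mentioned above without writing a custom model.

```python
from sklearn.ensemble import IsolationForest
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import LocalOutlierFactor

# Hypothetical sample lines standing in for the real text data.
lines = [
    "user login succeeded",
    "user logout succeeded",
    "user password changed",
    "user login failed",
    "user session expired",
    "kernel panic at address 0xdeadbeef",
]

# Same feature extraction as in the question: bag of words plus TF-IDF.
X = TfidfVectorizer().fit_transform(lines)

# Isolation Forest: trees are built independently, so n_jobs can spread the work.
iso = IsolationForest(n_estimators=100, n_jobs=-1, random_state=0)
iso_labels = iso.fit_predict(X)  # +1 for inliers, -1 for outliers

# Local Outlier Factor: n_neighbors must be smaller than the number of samples.
lof = LocalOutlierFactor(n_neighbors=3, n_jobs=-1)
lof_labels = lof.fit_predict(X)  # +1 for inliers, -1 for outliers
```

Whether either detector flags the same lines as OneClassSVM depends on the data, so it is worth comparing the labels on a sample you have already inspected by hand.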
