Can I use XGBoost to boost other models (eg. Naive Bayes, Random Forest)?

前端 未结 1 814
终归单人心
终归单人心 2020-12-22 12:11

I am working on a fraud analytics project and I need some help with boosting. Previously, I used SAS Enterprise Miner to learn more about boosting/ensemble techniques and I

相关标签:
1条回答
  • 2020-12-22 12:41

    In theory, boosting any (base) classifier is easy and straightforward with scikit-learn's AdaBoostClassifier. E.g. for a Naive Bayes classifier, it should be:

    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.naive_bayes import GaussianNB
    
    nb = GaussianNB()
    model = AdaBoostClassifier(base_estimator=nb, n_estimators=10)
    model.fit(X_train, y_train)
    

    and so on.

    In practice, we never use Naive Bayes or Neural Nets as base classifiers for boosting (let alone Random Forests, which are themselves an ensemble method).

    Adaboost (and similar boosting methods that have been derived afterwards, like GBM and XGBoost) was conceived using decision trees (DTs) as base classifiers (more specifically, decision stumps, i.e. DTs with a depth of only 1); there is good reason why still today, if you don't specify explicitly the base_classifier argument in scikit-learn's AdaBoostClassifier above, it assumes a value of DecisionTreeClassifier(max_depth=1), i.e. a decision stump.

    DTs are suitable for such ensembling because they are essentially unstable classifiers, which is not the case with the other algorithms mentioned, hence the latter are not expected to offer anything when used as base classifiers for boosting algorithms.

    0 讨论(0)
提交回复
热议问题