Multiclass classification with xgboost classifier?

别跟我提以往 2021-02-14 17:55

I am trying out multi-class classification with xgboost and I've built it using this code:

clf = xgb.XGBClassifier(max_depth=7, n_estimators=1000)
clf.fit(byte_train, y_train)
3 Answers
  • 2021-02-14 18:23

    By default, XGBClassifier (like many classifiers) uses a binary objective, but what it does internally is one-vs-rest classification, i.e. if you have 3 classes it gives a result like (0 vs 1&2). If you're dealing with more than 2 classes you should always use softmax. Softmax turns logits into probabilities that sum to 1, and the prediction is the class with the highest probability. As you can see, the complexity increases, as Saurabh mentioned in his answer, so it will take more time.
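
    To illustrate that last point, here is a minimal softmax in plain NumPy (not tied to any xgboost version); it maps a vector of logits to probabilities that sum to 1:

    import numpy as np

    def softmax(logits):
        # Subtract the max logit for numerical stability; the result sums to 1.
        z = logits - np.max(logits)
        e = np.exp(z)
        return e / e.sum()

    probs = softmax(np.array([2.0, 1.0, 0.1]))
    print(probs)        # roughly [0.66 0.24 0.10]
    print(probs.sum())  # 1.0 -> the class with the highest probability wins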

  • 2021-02-14 18:24

    In fact, even though the default objective parameter of XGBClassifier is binary:logistic, it internally checks the number of classes in the label y. When the number of classes is greater than 2, it changes the objective to multi:softprob.

    https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/sklearn.py

    class XGBClassifier(XGBModel, XGBClassifierBase):
        # pylint: disable=missing-docstring,invalid-name,too-many-instance-attributes
        def __init__(self, objective="binary:logistic", **kwargs):
            super().__init__(objective=objective, **kwargs)
    
        def fit(self, X, y, sample_weight=None, base_margin=None,
                eval_set=None, eval_metric=None,
                early_stopping_rounds=None, verbose=True, xgb_model=None,
                sample_weight_eval_set=None, callbacks=None):
            # pylint: disable = attribute-defined-outside-init,arguments-differ
    
            evals_result = {}
            self.classes_ = np.unique(y)
            self.n_classes_ = len(self.classes_)
    
            xgb_options = self.get_xgb_params()
    
            if callable(self.objective):
                obj = _objective_decorator(self.objective)
                # Use default value. Is it really not used ?
                xgb_options["objective"] = "binary:logistic"
            else:
                obj = None
    
            if self.n_classes_ > 2:
                # Switch to using a multiclass objective in the underlying
                # XGB instance
                xgb_options['objective'] = 'multi:softprob'
                xgb_options['num_class'] = self.n_classes_
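
    You can observe this switch from the outside: with more than two classes, predict_proba returns one probability per class. A quick sketch on made-up toy data (the variable names here are just placeholders):

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(120, 5)
    y = np.random.randint(0, 3, 120)   # 3 classes, labelled 0..2

    clf = xgb.XGBClassifier(n_estimators=10)
    clf.fit(X, y)                      # objective is switched internally

    proba = clf.predict_proba(X)
    print(proba.shape)                 # (120, 3): one column per class
    print(proba[0].sum())              # ~1.0, i.e. multi:softprob output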
    
  • 2021-02-14 18:33

    By default, XGBClassifier uses objective='binary:logistic'. With this objective, it employs one of these strategies: one-vs-rest (also known as one-vs-all) or one-vs-one. That may not be the right choice for the problem at hand.

    When you use objective='multi:softprob', the output is a matrix of size (number of data points) × (number of classes), holding the predicted probability of each class for each data point. As a result, the time complexity of your code increases.

    Try setting objective='multi:softmax' in your code. It is more apt for a multi-class classification task.
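
    A short sketch contrasting the two objectives with the native API (the data here is a made-up toy example):

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(100, 5)
    y = np.random.randint(0, 3, 100)
    dtrain = xgb.DMatrix(X, label=y)

    params = {"num_class": 3}
    sm = xgb.train({**params, "objective": "multi:softmax"}, dtrain, num_boost_round=10)
    sp = xgb.train({**params, "objective": "multi:softprob"}, dtrain, num_boost_round=10)

    print(sm.predict(dtrain).shape)   # (100,)   -> one predicted class label per row
    print(sp.predict(dtrain).shape)   # (100, 3) -> one probability per class per row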
