Understanding num_classes for xgboost in R

没有蜡笔的小新 2021-01-17 10:11

I'm having a lot of trouble figuring out how to correctly set num_class for xgboost.

I've got an example using the iris data:

df <- iris

         


        
4 Answers
  • 2021-01-17 10:27

    I ran into this rather weird problem as well. In my case it seemed to be the result of not encoding the labels properly.

    First, using a string vector with N classes as the labels, I could only get the algorithm to run by setting num_class = N + 1. However, this result was useless, because I only had N actual classes and N+1 buckets of predicted probabilities.

    I re-encoded the labels as integers and then num_class worked fine when set to N.

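    # Convert classes to integers for xgboost (labels must be 0-based)
    library(data.table)
    class <- data.table(interest_level = c("low", "medium", "high"), class = c(0, 1, 2))
    # t1 is assumed to be the training data.table; all.x = TRUE keeps every row of t1
    t1    <- merge(t1, class, by = "interest_level", all.x = TRUE, sort = FALSE)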
    

    and

    param <- list(booster="gbtree",
                  objective="multi:softprob",
                  eval_metric="mlogloss",
                  #nthread=13,
                  num_class=3,
                  eta_decay = .99,
                  eta = .005,
                  gamma = 1,
                  max_depth = 4,
                  min_child_weight = .9,#1,
                  subsample = .7,
                  colsample_bytree = .5
    )
    

    For example.
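
The re-encoding described in this answer can also be sketched in base R without a merge. This is only an illustration: the `interest_level` vector below is made-up stand-in data, and `lvls` is a hypothetical name for the chosen level order, not something from the original post.

```r
# Made-up labels standing in for t1$interest_level from the answer above.
interest_level <- c("low", "high", "medium", "low", "high")

# Fix the level order explicitly so that low = 0, medium = 1, high = 2.
lvls <- c("low", "medium", "high")

# Factor codes start at 1, so subtract 1 to get 0-based integer labels.
label <- as.integer(factor(interest_level, levels = lvls)) - 1L

# With N distinct classes encoded as 0..N-1, num_class is simply N.
num_class <- length(lvls)

# Sanity checks: no unmapped values, and every label lies in [0, num_class).
stopifnot(!anyNA(label), all(label >= 0L), all(label < num_class))

print(label)
```

Fixing the level order by hand avoids the default alphabetical ordering of `factor()`, which would otherwise map "high" to 0.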

  • 2021-01-17 10:38

    I was seeing the same error. My issue was that I was using an eval_metric meant only for multiclass labels when my data had binary labels. See eval_metric in the Learning Task Parameters section of the XGBoost docs for a list of all the options.
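
One way to keep the objective and metric consistent with the labels is to derive them from the label vector. The sketch below is hedged: `make_params` is a hypothetical helper, not part of xgboost, and it assumes the labels are already 0-based integers with every class present. The objective and metric names themselves ("binary:logistic", "logloss", "multi:softprob", "mlogloss") are standard XGBoost values.

```r
# Hypothetical helper: choose objective and eval_metric from the label vector.
# Assumes labels are already encoded as 0..N-1 and that every class occurs.
make_params <- function(label) {
  n <- length(unique(label))
  if (n <= 2L) {
    # Binary labels (0/1): binary objective and metric, and no num_class.
    list(objective = "binary:logistic", eval_metric = "logloss")
  } else {
    # More than two classes: multiclass objective and metric, plus num_class.
    list(objective = "multi:softprob", eval_metric = "mlogloss", num_class = n)
  }
}

binary_params <- make_params(c(0, 1, 1, 0))
multi_params  <- make_params(as.integer(iris$Species) - 1L)

str(binary_params)
str(multi_params)
```

The resulting list can be extended with the tuning parameters (eta, max_depth, and so on) before being passed to training.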

  • 2021-01-17 10:39

    I had this problem, and it turned out that I was subtracting 1 from my response variable, which was already coded as 0 and 1. Probably a novice mistake, but worth noting for anyone else who runs into this with a binary response that is already 0 and 1.

    Tutorial said:

    label = as.integer(iris$Species)-1
    

    What worked for me (response is high_end):

    label = as.integer(high_end)
    
  • 2021-01-17 10:46

    The label must be in [0, num_class). In your script, add y <- y - 1 before model <- ...
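
The range rule can be checked before training. The sketch below uses a hypothetical helper, `check_labels`, which is not part of xgboost. It shows both failure modes from this thread: 1-based factor codes that need the subtraction, and 0/1 labels that must not be shifted.

```r
# Hypothetical helper: TRUE if every label is a whole number in [0, num_class).
check_labels <- function(y, num_class) {
  !anyNA(y) && all(y == round(y)) && min(y) >= 0 && max(y) < num_class
}

# Factor codes from iris are 1, 2, 3, so they fall outside [0, 3) until shifted.
y <- as.integer(iris$Species)
ok_raw     <- check_labels(y, 3)      # FALSE: the maximum is 3
ok_shifted <- check_labels(y - 1, 3)  # TRUE: labels are now 0, 1, 2

# A binary response already coded 0/1 is valid as-is; subtracting 1 breaks it.
b <- c(0, 1, 1, 0)
ok_binary     <- check_labels(b, 2)      # TRUE
ok_binary_off <- check_labels(b - 1, 2)  # FALSE: contains -1
```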
