Error when I try to predict class probabilities in R - caret

隐瞒了意图╮ 2020-12-09 08:38

I've built a model using caret. When training completed, I got the following warning:

Warning message: In train.default(x, y, weights = w, ...) : A

5 Answers
  • 2020-12-09 09:05

    I read through the answers above while facing a similar problem. A more general solution is to apply the following to both the train and test datasets. Make sure the response variable is included in feature.names too.

    feature.names <- names(train)

    for (f in feature.names) {
      if (is.factor(train[[f]])) {
        # Rename the existing levels in place, so each new label
        # stays matched to the level it replaces
        levels(train[[f]]) <- make.names(levels(train[[f]]))
      }
    }
    

    This creates syntactically correct labels for all factors.
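
    The same renaming can be wrapped in a small helper so that train and test are treated identically. The sketch below is only an illustration: `sanitize_factor_levels` is a hypothetical helper name, and both data frames are made-up toy data.

    ```r
    # Hypothetical helper: rename the levels of every factor column
    # so that they are syntactically valid R names.
    sanitize_factor_levels <- function(df) {
      for (f in names(df)) {
        if (is.factor(df[[f]])) {
          levels(df[[f]]) <- make.names(levels(df[[f]]))
        }
      }
      df
    }

    # Toy data, assumed for illustration only
    train <- data.frame(x = c(1.5, 2.0, 3.1),
                        outcome = factor(c("class 1", "class 2", "class 1")))
    test  <- data.frame(x = c(0.7, 2.4),
                        outcome = factor(c("class 2", "class 1")))

    train <- sanitize_factor_levels(train)
    test  <- sanitize_factor_levels(test)

    levels(train$outcome)   # "class.1" "class.2"
    levels(test$outcome)    # "class.1" "class.2"
    ```

    Because the levels are renamed rather than rebuilt from the data, both data sets end up with the same labels as long as they started with the same levels.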

  • 2020-12-09 09:08

    As stated above, the class values must be factors and their levels must be valid names. Another way to ensure this is:

    levels(all.dat$target) <- make.names(levels(factor(all.dat$target)))
    
  • 2020-12-09 09:26

    As @Sam Firke already pointed out in the comments (I had overlooked it), the levels TRUE/FALSE don't work either, so I converted them to yes/no.
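
    A short check of why TRUE/FALSE fail, followed by one possible way to do the conversion. This is a sketch on a made-up vector; the name `outcome` is an assumption, not something from the question.

    ```r
    # TRUE and FALSE are reserved words in R, so they are not valid names as-is;
    # make.names() appends a dot to each of them.
    make.names(c("TRUE", "FALSE"))   # "TRUE." "FALSE."

    # Toy logical outcome (assumed data)
    outcome <- c(TRUE, FALSE, TRUE, TRUE)

    # Map the logical values to descriptive factor levels
    outcome <- factor(outcome, levels = c(FALSE, TRUE), labels = c("no", "yes"))
    levels(outcome)                  # "no" "yes"
    ```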

  • 2020-12-09 09:28

    As in the example above, re-creating the outcome factor with valid level names usually fixes the problem. It is better to make the change in the original dataset, before partitioning it into training and test datasets:

    data$outcome <- factor(data$outcome)
    levels(data$outcome) <- make.names(levels(data$outcome))

    As others pointed out earlier, this problem only occurs when classProbs=TRUE, which causes the train function to generate additional statistics related to the outcome classes.
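
    Before calling train with classProbs=TRUE, the outcome levels can be checked up front. A minimal sketch in base R, where `has_valid_levels` is a hypothetical helper and the factor is toy data:

    ```r
    # Hypothetical helper: TRUE when every level is already a valid R name
    has_valid_levels <- function(y) {
      lv <- levels(y)
      all(lv == make.names(lv))
    }

    outcome <- factor(c("0", "1", "1", "0"))   # toy outcome with numeric-looking levels
    has_valid_levels(outcome)                  # FALSE: "0" and "1" are not valid names

    levels(outcome) <- make.names(levels(outcome))
    levels(outcome)                            # "X0" "X1"
    has_valid_levels(outcome)                  # TRUE
    ```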

  • 2020-12-09 09:30

    The answer is in bold at the top of your post =]

    What are you modeling? Is it alchemy_category? The code only says formula and we can't see it.

    When you ask for class probabilities, model predictions are a data frame with separate columns for each class/level. If alchemy_category doesn't have levels that are valid column names, data.frame converts them to valid names. That creates a problem because the code is looking for a specific name but the data frame has a different (but valid) name.

    For example, if I had

    > test <- factor(c("level1", "level 2")) 
    > levels(test)
    [1] "level 2" "level1" 
    > make.names(levels(test))
    [1] "level.2" "level1"
    

    the code would be looking for "level 2" but there is only "level.2".
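
    The renaming step can be reproduced in base R. This sketch only illustrates the mechanism with a made-up probability matrix; it is not caret's internal code.

    ```r
    lv <- c("level 2", "level1")

    # One row of made-up class probabilities, one column per level
    m <- matrix(c(0.3, 0.7), nrow = 1, dimnames = list(NULL, lv))

    # data.frame() repairs column names by default (check.names = TRUE)
    probs <- data.frame(m)
    names(probs)                  # "level.2" "level1"

    # Looking the column up by the original level name now fails
    "level 2" %in% names(probs)   # FALSE
    ```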
