R caret naïve bayes accuracy is null

Question

I have one dataset that I want to train with both SVM and Naïve Bayes. SVM works, but Naïve Bayes does not. The source code is below:

library(tools)
library(caret)
library(doMC)
library(mlbench)
library(magrittr)

CORES <- 5 #Optional
registerDoMC(CORES) #Optional

load("chat/rdas/2gram-entidades-erro.Rda")

set.seed(10)
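split <- 0.60

# Assumption: the post never defines trainIndex; a stratified caret split is the likely intent
trainIndex <- createDataPartition(maFinal$resposta, p = split, list = FALSE)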

maFinal$resposta <- as.factor(maFinal$resposta)
data_train <- as.data.frame(unclass(maFinal[ trainIndex,]))
data_test <- maFinal[-trainIndex,]

treegram25NotNull <- train(
  x = subset(data_train, select = -c(resposta)),
  y = data_train$resposta,
  method = "nb",
  trControl = trainControl(method = "cv", number = 5,
                           savePredictions = TRUE, sampling = "up"))

treegram25NotNull

The final accuracy is null, and I get these warnings:

Warning messages:
1: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
  There were missing values in resampled performance measures.
2: In train.default(subset(data_train, select = -c(resposta)), data_train$resposta, :
  missing values found in aggregated results

Any help would be greatly appreciated, thanks.

Answer 1:

The fix is really simple:

set.seed(10)
split <- 0.60
maFinal[] <- lapply(maFinal, as.factor)

Currently all of your variables except resposta are numeric. However, each has at most about 12 distinct values, which means they should really be factor variables. Many of them are also highly unbalanced. As a result, once the sample is split, some of these variables are left with only a single unique value, and the problem arises because they are treated as continuous variables when they are actually factors.
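
To see the effect on a small scale, here is a minimal base R sketch. The data frame toy and its columns g1 and g2 are hypothetical stand-ins for maFinal, whose contents are not shown in the post. It counts the distinct values per column, shows that a numeric column can have zero variance within one class, and then applies the same lapply conversion as above.

```r
# Hypothetical toy data standing in for maFinal (the real data is not shown)
toy <- data.frame(
  g1 = c(0, 1, 0, 2, 1, 0),
  g2 = c(0, 0, 0, 0, 1, 0),
  resposta = c("a", "b", "a", "b", "a", "b")
)

# Number of distinct values in each column; small counts suggest factors
n_distinct <- sapply(toy, function(col) length(unique(col)))
print(n_distinct)

# Variance of g2 within each class: class "b" only ever sees the value 0,
# so a continuous (Gaussian) treatment of g2 has zero variance there
class_var <- tapply(toy$g2, toy$resposta, var)
print(class_var)

# Same conversion as in the answer: every column becomes a factor,
# and toy[] keeps the data.frame structure intact
toy[] <- lapply(toy, as.factor)
str(toy)
```

Running the same two checks on the real training split should show which predictors collapse to a single value within a class before the conversion.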

Source: https://stackoverflow.com/questions/53977563/r-caret-na%c3%afve-bayes-accuracy-is-null

Tags

r-caret

naivebayes