R caret naïve bayes accuracy is null

Submitted by 岁酱吖の on 2019-12-11 13:28:56

Question


I have one dataset that I want to train with both SVM and Naïve Bayes. SVM works, but Naïve Bayes doesn't. The source code follows:

library(tools)
library(caret)
library(doMC)
library(mlbench)
library(magrittr)

CORES <- 5 #Optional
registerDoMC(CORES) #Optional

load("chat/rdas/2gram-entidades-erro.Rda")

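set.seed(10)
split <- 0.60

maFinal$resposta <- as.factor(maFinal$resposta)
# Assumption: the post never defines trainIndex; a stratified split with caret is one common way to build it
trainIndex <- createDataPartition(maFinal$resposta, p = split, list = FALSE)
data_train <- as.data.frame(unclass(maFinal[trainIndex, ]))
data_test <- maFinal[-trainIndex, ]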

treegram25NotNull <- train(x = subset(data_train, select = -c(resposta)),
      y = data_train$resposta,
      method = "nb",
      trControl = trainControl(method = "cv", number = 5, savePredictions = TRUE, sampling = "up"))

treegram25NotNull

The final accuracy is null

Warning messages:
1: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
  There were missing values in resampled performance measures.
2: In train.default(subset(data_train, select = -c(resposta)), data_train$resposta, :
  missing values found in aggregated results

Any help would be greatly appreciated, thanks.


Answer 1:


The fix is really simple:

set.seed(10)
split <- 0.60
maFinal[] <- lapply(maFinal, as.factor)

At the moment, all of your variables except resposta are numeric. However, each of them has at most about 12 distinct values, which suggests they should really be factor variables. Many of them are also highly unbalanced. As a result, when the sample is split, some of these variables end up with only a single unique value in a subset, and the problem arises from treating such variables (which are really categorical) as continuous.
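
To make that diagnosis concrete, here is a minimal sketch in base R. It does not use the real maFinal data (which is not shown in the post); the toy data frame, its column names f1 and f2, and the train_rows subset are all hypothetical stand-ins. It counts distinct values per column, shows that a nearly constant numeric column has zero spread within each class on the training subset, and then applies the same conversion to factors as above.

```r
# Hypothetical toy data standing in for maFinal (the real data is not shown).
toy <- data.frame(
  f1 = rep(0:2, length.out = 50),        # few distinct values
  f2 = c(rep(0, 49), 1),                 # highly unbalanced: a single 1
  resposta = factor(rep(c("a", "b"), 25))
)

# How many distinct values does each column have?
n_distinct <- sapply(toy, function(col) length(unique(col)))
print(n_distinct)

# Hypothetical training subset: the lone 1 in f2 (row 50) is left out.
train_rows <- 1:30

# Treated as continuous, f2 has zero standard deviation within every class,
# which leaves a density estimate with nothing to work with.
sd_by_class <- tapply(toy$f2[train_rows], toy$resposta[train_rows], sd)
print(sd_by_class)

# The conversion from the answer: turn every column into a factor.
toy[] <- lapply(toy, as.factor)

# Columns that show only one observed value in the training subset.
constant_in_train <- sapply(toy[train_rows, ], function(col) length(unique(col)) == 1)
print(names(toy)[constant_in_train])
```

Running the same two checks (distinct values per column, and single-valued columns in the training subset) on the real data should show which predictors are affected. Whether those predictors are worth keeping at all is a separate modelling decision.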



Source: https://stackoverflow.com/questions/53977563/r-caret-na%c3%afve-bayes-accuracy-is-null
