rweka

rWeka how to calculate ROC AUC?

坚强是说给别人听的谎言 posted on 2020-01-16 16:49:30
Question: I am using the RWeka package to compare the performance of different machine learning algorithms, such as:

    # KNN:
    (resultIBk <- IBk(postScore ~ ., data_train))
    # Naive Bayes:
    NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")
    # Default settings Weka
    (resultNB <- NB(postScore ~ ., data_train))
    # Decision Tree J48
    (resultJ48 <- J48(postScore ~ ., data_train))

Can anyone please advise on how to calculate ROC AUC for the various machine learning algorithms in Weka? I understand that for
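One possible way to get an AUC for any of the classifiers in the question is to compute it from predicted class probabilities with the rank-based (Mann-Whitney) formula. The sketch below uses base R only. The helper name `auc_from_scores` and the toy data are made up for illustration, and the commented-out RWeka lines assume a binary factor outcome and a hypothetical held-out set `data_test`.

```r
# Rank-based (Mann-Whitney) AUC: the probability that a randomly chosen
# positive case receives a higher score than a randomly chosen negative one.
# Tied scores receive average ranks.
auc_from_scores <- function(scores, labels, positive) {
  is_pos <- labels == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy scores and labels, made up for illustration only
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
labels <- c("yes", "yes", "no", "yes", "no", "no")
auc <- auc_from_scores(scores, labels, positive = "yes")
print(auc)

# With RWeka (assumption: binary factor outcome, hypothetical data_test):
# probs <- predict(resultJ48, newdata = data_test, type = "probability")
# auc_from_scores(probs[, "yes"], data_test$postScore, positive = "yes")
```

In the toy data, 8 of the 9 positive/negative pairs are ordered correctly, so the function returns 8/9. RWeka's `evaluate_Weka_classifier(..., class = TRUE)` may also be worth checking, since its per-class details are expected to include Weka's own ROC area.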

rWeka how to calculate ROC AUC?

守給你的承諾、 posted on 2020-01-16 16:49:17
Question: I am using the RWeka package to compare the performance of different machine learning algorithms, such as:

    # KNN:
    (resultIBk <- IBk(postScore ~ ., data_train))
    # Naive Bayes:
    NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")
    # Default settings Weka
    (resultNB <- NB(postScore ~ ., data_train))
    # Decision Tree J48
    (resultJ48 <- J48(postScore ~ ., data_train))

Can anyone please advise on how to calculate ROC AUC for the various machine learning algorithms in Weka? I understand that for

Java Application not running on OS X Yosemite

流过昼夜 posted on 2019-12-11 08:10:16
Question: After installing OS X Yosemite, one of my Java applications stopped running. The message was that I needed the Java SE 6 runtime, exactly as in "Eclipse Kepler for OS X Mavericks request Java SE 6". I did as user Nikolas suggested and apparently had the same problem that user Sage commented on: Initially, this gave me the Eclipse error "Failed to create the Java Virtual Machine", but that was because my /usr/bin/java was symlinked to another 1.7 (the /Library/Internet/... plugins one instead of the

R and tm package: create a term-document matrix with a dictionary of one or two words?

谁说我不能喝 posted on 2019-12-09 07:01:59
Question:
Purpose: I want to create a term-document matrix using a dictionary which has compound words, or bigrams, as some of the keywords.

Web Search: Being new to text mining and the tm package in R, I went to the web to figure out how to do this. Below are some relevant links that I found:
- FAQs on the tm package website
- finding 2 & 3 word phrases using r tm package
- counter ngram with tm package in r
- findassocs for multiple terms in r

Background: Of these, I preferred the solution that uses

Save/load a M5 RWeka caret model fails

﹥>﹥吖頭↗ posted on 2019-12-07 18:31:00
Question: I'm getting an error after loading a saved M5 model from the RWeka package that was trained via caret:

    Error in .jcall(o, "Ljava/lang/Class;", "getClass") : RcallMethod: attempt to call a method of a NULL object.

To reproduce the error:

    library(caret); library(RWeka)
    data(GermanCredit)
    myModel <- train(Duration~Amount, data=GermanCredit, method="M5")
    predict(myModel, GermanCredit[1,]) # Works.
    save(myModel, file="myModel.rda")
    load("myModel.rda")
    predict(myModel, GermanCredit[1,]) # Produces the

2-gram and 3-gram instead of 1-gram using RWeka

杀马特。学长 韩版系。学妹 posted on 2019-12-07 13:48:17
Question: I am trying to extract 1-grams, 2-grams and 3-grams from the train corpus using the RWeka NGramTokenizer function. Unfortunately, I am getting only 1-grams. Here is my code:

    train_corpus
    # clean-up
    cleanset1 <- tm_map(train_corpus, tolower)
    cleanset2 <- tm_map(cleanset1, removeNumbers)
    cleanset3 <- tm_map(cleanset2, removeWords, stopwords("english"))
    cleanset4 <- tm_map(cleanset3, removePunctuation)
    cleanset5 <- tm_map(cleanset4, stemDocument, language="english")
    cleanset6 <- tm_map(cleanset5,

2-gram and 3-gram instead of 1-gram using RWeka

梦想的初衷 posted on 2019-12-05 18:58:48
I am trying to extract 1-grams, 2-grams and 3-grams from the train corpus using the RWeka NGramTokenizer function. Unfortunately, I am getting only 1-grams. Here is my code:

    train_corpus
    # clean-up
    cleanset1 <- tm_map(train_corpus, tolower)
    cleanset2 <- tm_map(cleanset1, removeNumbers)
    cleanset3 <- tm_map(cleanset2, removeWords, stopwords("english"))
    cleanset4 <- tm_map(cleanset3, removePunctuation)
    cleanset5 <- tm_map(cleanset4, stemDocument, language="english")
    cleanset6 <- tm_map(cleanset5, stripWhitespace)
    # 1-gram
    NgramTokenizer1 <- function(x) NGramTokenizer(x, Weka_control(min = 1, max = 1))
    train_dtm

R and tm package: create a term-document matrix with a dictionary of one or two words?

本秂侑毒 posted on 2019-12-03 08:55:08
Purpose: I want to create a term-document matrix using a dictionary which has compound words, or bigrams, as some of the keywords.

Web Search: Being new to text mining and the tm package in R, I went to the web to figure out how to do this. Below are some relevant links that I found:
- FAQs on the tm package website
- finding 2 & 3 word phrases using r tm package
- counter ngram with tm package in r
- findassocs for multiple terms in r

Background: Of these, I preferred the solution that uses NGramTokenizer in the RWeka package in R, but I ran into a problem. In the example code below, I create

Creating N-Grams with tm & RWeka - works with VCorpus but not Corpus

允我心安 posted on 2019-11-30 07:31:15
Following the many guides to creating bigrams using the 'tm' and 'RWeka' packages, I was getting frustrated that only 1-grams were being returned in the tdm. Through much trial and error I discovered that it works correctly with 'VCorpus' but not with 'Corpus'. BTW, I'm pretty sure this was working with 'Corpus' ~1 month ago, but it is not now. R (3.3.3), RTools (3.4), RStudio (1.0.136) and all packages (tm 0.7-1, RWeka 0.4-31) have been updated to the latest. I would appreciate any insight on why this won't work with Corpus and whether others have this same problem. #A
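The VCorpus behavior described above can be sketched as follows. This is a minimal illustration, not the poster's code: it assumes tm and RWeka are installed with a working Java setup, and the sample texts and the name `BigramTokenizer` are made up. In tm 0.7 and later, `Corpus()` on a `VectorSource` may return a SimpleCorpus, for which a custom `tokenize` function appears not to be used, which would be consistent with the 1-gram output reported.

```r
library(tm)
library(RWeka)

# Made-up sample documents, for illustration only
texts <- c("the quick brown fox", "jumps over the lazy dog")

# Build a VCorpus explicitly so the custom tokenizer is applied
corp <- VCorpus(VectorSource(texts))

# Tokenizer that returns only 2-grams
BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))

tdm <- TermDocumentMatrix(corp, control = list(tokenize = BigramTokenizer))
inspect(tdm)
```

Each term in `tdm` should then be a two-word phrase. Setting `min = 1, max = 3` in `Weka_control` would request 1-grams through 3-grams in a single matrix.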

Creating N-Grams with tm & RWeka - works with VCorpus but not Corpus

孤人 posted on 2019-11-29 09:40:12
Question: Following the many guides to creating bigrams using the 'tm' and 'RWeka' packages, I was getting frustrated that only 1-grams were being returned in the tdm. Through much trial and error I discovered that it works correctly with 'VCorpus' but not with 'Corpus'. BTW, I'm pretty sure this was working with 'Corpus' ~1 month ago, but it is not now. R (3.3.3), RTools (3.4), RStudio (1.0.136) and all packages (tm 0.7-1, RWeka 0.4-31) have been updated to the latest. I would