rweka | 易学教程

rWeka how to calculate ROC AUC?

Question: I am using the RWeka package to compare the performance of different machine learning algorithms such as:

    # KNN:
    (resultIBk <- IBk(postScore~., data_train))
    # Naive Bayes:
    NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")
    # Default settings Weka
    (resultNB <- NB(postScore~., data_train))
    # Decision Tree J48
    (resultJ48 <- J48(postScore~., data_train))

Can anyone please advise on how to calculate ROC AUC for the various machine learning algorithms in Weka? I understand that for
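One way to approach the AUC part of this question, sketched in base R only: if a classifier can produce a score (for example a predicted probability) for the positive class, the area under the ROC curve equals the rank-based Mann-Whitney statistic. The function name `auc_from_scores` and the toy data are hypothetical, and the sketch assumes a binary outcome, which the excerpt does not confirm for `postScore`. With RWeka, such scores could plausibly be taken from `predict(model, newdata, type = "probability")`, but that step is not shown here.

```r
# Rank-based (Mann-Whitney) AUC.
# scores: numeric scores for the positive class (higher = more positive)
# labels: logical vector, TRUE for positive cases
auc_from_scores <- function(scores, labels) {
  stopifnot(length(scores) == length(labels), is.logical(labels))
  n_pos <- sum(labels)
  n_neg <- sum(!labels)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # average ranks, so tied scores count as half a win
  (sum(r[labels]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data, made up for illustration only
scores <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
labels <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

auc <- auc_from_scores(scores, labels)
print(auc)
```

The same function can be applied to each fitted model in turn, so the classifiers are compared on one held-out data set with an identical metric.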

Java Application not running on OS X Yosemite

Question: After installing OS X Yosemite, one of my Java applications stopped running. The message was that I needed the Java SE 6 runtime, exactly as in "Eclipse Kepler for OS X Mavericks request Java SE 6". I did as user Nikolas suggested and apparently had the same problem that user Sage commented on: Initially, this gave me the Eclipse error "Failed to create the Java Virtual Machine", but that was because my /usr/bin/java was symlinked to another 1.7 (the /Library/Internet/... plugins one instead of the

Save/load a M5 RWeka caret model fails

Question: I get an error after loading a saved M5 model from the RWeka package that was trained via caret:

    Error in .jcall(o, "Ljava/lang/Class;", "getClass") : RcallMethod: attempt to call a method of a NULL object.

To reproduce the error:

    library(caret); library(RWeka)
    data(GermanCredit)
    myModel <- train(Duration~Amount, data=GermanCredit, method="M5")
    predict(myModel, GermanCredit[1,]) # Works.
    save(myModel, file="myModel.rda")
    load("myModel.rda")
    predict(myModel, GermanCredit[1,]) # Produces the
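A possible workaround, offered as a sketch and not as a confirmed fix for this exact setup: RWeka models hold references to Java objects, which `save()` does not serialize on its own. rJava provides `.jcache()`, which stores a serialized copy of the Java object inside the R reference so that it can be restored after loading. The snippet assumes that caret keeps the underlying RWeka fit in `myModel$finalModel` and that this fit exposes the Java reference as `$classifier`.

```r
library(caret)
library(RWeka)
library(rJava)

data(GermanCredit)
myModel <- train(Duration ~ Amount, data = GermanCredit, method = "M5")

# Cache the serialized Java object in the R reference before saving.
# Assumption: the RWeka fit lives in myModel$finalModel.
.jcache(myModel$finalModel$classifier)

save(myModel, file = "myModel.rda")
rm(myModel)
load("myModel.rda")

# If the cache was written, the Java object can be rebuilt on first use.
reloaded_prediction <- predict(myModel, GermanCredit[1, ])
print(reloaded_prediction)
```

When the model is loaded in a fresh R session, RWeka has to be attached first so that a JVM is available for restoring the object.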

2-gram and 3-gram instead of 1-gram using RWeka

Question: I am trying to extract 1-grams, 2-grams and 3-grams from the train corpus using the RWeka NGramTokenizer function. Unfortunately, I am getting only 1-grams. Here is my code:

    train_corpus
    # clean-up
    cleanset1 <- tm_map(train_corpus, tolower)
    cleanset2 <- tm_map(cleanset1, removeNumbers)
    cleanset3 <- tm_map(cleanset2, removeWords, stopwords("english"))
    cleanset4 <- tm_map(cleanset3, removePunctuation)
    cleanset5 <- tm_map(cleanset4, stemDocument, language="english")
    cleanset6 <- tm_map(cleanset5, stripWhitespace)
    # 1-gram
    NgramTokenizer1 <- function(x) NGramTokenizer(x, Weka_control(min = 1, max = 1))
    train_dtm

R and tm package: create a term-document matrix with a dictionary of one or two words?

Question:

Purpose: I want to create a term-document matrix using a dictionary which has compound words, or bigrams, as some of the keywords.

Web Search: Being new to text mining and the tm package in R, I went to the web to figure out how to do this. Below are some relevant links that I found: "FAQS on the tm-package website"; "finding 2 & 3 word phrases using r tm package"; "counter ngram with tm package in r"; "findassocs for multiple terms in r".

Background: Of these, I preferred the solution that uses NGramTokenizer in the RWeka package in R, but I ran into a problem. In the example code below, I create

Creating N-Grams with tm & RWeka - works with VCorpus but not Corpus

Question: Following the many guides to creating bigrams using the 'tm' and 'RWeka' packages, I was getting frustrated that only 1-grams were being returned in the tdm. Through much trial and error I discovered that it works properly with 'VCorpus' but not with 'Corpus'. BTW, I'm pretty sure this was working with 'Corpus' about a month ago, but it is not now. R (3.3.3), RTools (3.4), RStudio (1.0.136) and all packages (tm 0.7-1, RWeka 0.4-31) have been updated to the latest. I would appreciate any insight on why this won't work with Corpus and whether others have the same problem. #A
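The behaviour described here can be tried out with a minimal sketch along these lines. It builds the corpus with `VCorpus()`, as the question reports is necessary, and passes an RWeka-based tokenizer through the `tokenize` control option of `TermDocumentMatrix()`. The documents and the name `BigramTokenizer` are made up for illustration.

```r
library(tm)
library(RWeka)

# Toy documents, made up for illustration only
docs <- c("the quick brown fox", "the lazy brown dog")

# VCorpus instead of Corpus, following the observation in the question
corpus <- VCorpus(VectorSource(docs))

# Tokenizer that emits only 2-grams; min and max set the n-gram range
BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))

tdm <- TermDocumentMatrix(corpus, control = list(tokenize = BigramTokenizer))
inspect(tdm)
```

Setting `min = 1, max = 3` in `Weka_control()` would request 1-grams, 2-grams and 3-grams in a single matrix.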
