Question
I am trying to fetch Twitter data and build a word cloud, but my code fails with an error when creating the TermDocumentMatrix. My code is below:
twitter_search_data <- searchTwitter(searchString = text_to_search, n = 500)
twitter_search_text <- sapply(twitter_search_data, function(x) x$getText())
twitter_search_corpus <- Corpus(VectorSource(twitter_search_text))
twitter_search_corpus <- tm_map(twitter_search_corpus, stripWhitespace, lazy = TRUE)
twitter_search_corpus <- tm_map(twitter_search_corpus, content_transformer(tolower), lazy = TRUE)
twitter_search_corpus <- tm_map(twitter_search_corpus, PlainTextDocument,lazy = TRUE)
twitter_search_corpus <- tm_map(twitter_search_corpus, removePunctuation, lazy = TRUE)
twitter_search_corpus <- tm_map(twitter_search_corpus, removeNumbers, lazy = TRUE)
twitter_search_corpus <- tm_map(twitter_search_corpus, removeWords, c("the", "this", "The", "This", stopwords('english')), lazy = TRUE)
twitter_search_corpus <- tm_map(twitter_search_corpus, stemDocument, lazy = TRUE)
# Create Document Term Matrix
tdm <- as.matrix(TermDocumentMatrix(twitter_search_corpus,
                                    control = list(wordLengths = c(3, Inf))))
No errors occur before the TermDocumentMatrix is created. The error I get is:
Warning in mclapply(x$content[i], function(d) tm_reduce(d, x$lazy$maps)) :
  scheduled core 1 encountered error in user code, all values of the job will be affected
Warning in mclapply(unname(content(x)), termFreq, control) :
  scheduled core 1 encountered error in user code, all values of the job will be affected
Warning: Error in UseMethod: no applicable method for 'meta' applied to an object of class "try-error"
Stack trace (innermost first):
74: FUN
73: lapply
72: setNames
71: as.list.VCorpus
70: as.list
69: lapply
68: meta.VCorpus
67: meta
66: TermDocumentMatrix.VCorpus
65: TermDocumentMatrix
64: as.matrix
63: observeEventHandler
1: runApp
I have already added lazy = TRUE and content_transformer(tolower), but the error still occurs.
Answer 1:
The issue seems to be the placement of this line:
twitter_search_corpus <- tm_map(twitter_search_corpus, stripWhitespace, lazy = TRUE)
Removing punctuation, numbers, and words leaves extra whitespace in the text. The stripWhitespace call therefore needs to be the last statement before creating the TermDocumentMatrix.
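The reordering described above could look roughly like the sketch below. It is only an illustration under a few assumptions that are not part of the original post: the two sample strings are hypothetical stand-ins for the tweets returned by searchTwitter(), the corpus is built with VCorpus(), and lazy = TRUE and the PlainTextDocument step are left out to keep the example small. stemDocument additionally needs the SnowballC package to be installed.

```r
library(tm)  # stemDocument also requires the SnowballC package

# Hypothetical sample text standing in for the tweets from searchTwitter()
twitter_search_text <- c(
  "This is the 1st tweet, about R and text mining!",
  "Another tweet: 42 words were removed from this one..."
)

# Assumption: VCorpus is used here; the question calls Corpus()
twitter_search_corpus <- VCorpus(VectorSource(twitter_search_text))

# Lowercase first so the stop word list matches regardless of case
twitter_search_corpus <- tm_map(twitter_search_corpus, content_transformer(tolower))
twitter_search_corpus <- tm_map(twitter_search_corpus, removePunctuation)
twitter_search_corpus <- tm_map(twitter_search_corpus, removeNumbers)
twitter_search_corpus <- tm_map(twitter_search_corpus, removeWords, stopwords("english"))
twitter_search_corpus <- tm_map(twitter_search_corpus, stemDocument)

# Whitespace cleanup moved to the end, as the answer suggests
twitter_search_corpus <- tm_map(twitter_search_corpus, stripWhitespace)

# Create the term-document matrix, keeping terms of three or more characters
tdm <- as.matrix(TermDocumentMatrix(twitter_search_corpus,
                                    control = list(wordLengths = c(3, Inf))))
```

If the error persists with real tweets, running the same steps without lazy = TRUE may help locate the failing transformation, since lazy mapping defers the work until the matrix is built.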
Source: https://stackoverflow.com/questions/37088965/r-termdocumentmatrix-error-while-creating