Question
I would like to use the WordNet lemmatizer to lemmatize the words in the vector a:
> a<-c("He saw a see-saw on a sea shore", "she is feeling cold")
> a
[1] "He saw a see-saw on a sea shore" "she is feeling cold"
I convert the vector a into a corpus and then run pre-processing steps (stopword removal, lemmatization, etc.):
> a <- Corpus(VectorSource(a))
I wanted to do the lemmatization as follows:
> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
> terms <- getIndexTerms("NOUN", 1, filter)
> sapply(terms, getLemma)
But I get this error:
> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
Error in .jnew(paste("com.nexagis.jawbone.filter", type, sep = "."), word, :
java.lang.NoSuchMethodError: <init>
My goal is to lemmatize the whole corpus rather than a single word. How can this be accomplished?
Answer 1:
Put your code in a loop so that the filter is built for one element at a time. You can try something like this:
lapply(a, function(x) {
  # Build the filter for one element at a time, not for the whole corpus
  x.filter <- getTermFilter("ExactMatchFilter", x, TRUE)
  terms <- getIndexTerms("NOUN", 1, x.filter)
  sapply(terms, getLemma)
})
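If the lookup has to run per word rather than per document, one option is to split each sentence into tokens first and call a one-word lookup on each token. The following is only a rough sketch in base R: tokenize, lemmatize_sentences, toy_lemmas and toy_lookup are hypothetical names that are not part of the wordnet package, and the toy lookup table merely stands in for a real dictionary call so the example runs without Java or a WordNet installation.

```r
# Split one sentence into lowercase word tokens. Hyphens are kept, so
# "see-saw" stays a single token.
tokenize <- function(sentence) {
  tokens <- unlist(strsplit(tolower(sentence), "[^a-z-]+"))
  tokens[nzchar(tokens)]
}

# Apply a one-word lookup to every token of every sentence.
# `lookup` is a hypothetical helper: it takes a single word and
# returns a single lemma as a character string.
lemmatize_sentences <- function(sentences, lookup) {
  lapply(sentences, function(s) {
    vapply(tokenize(s), lookup, character(1), USE.NAMES = FALSE)
  })
}

# Toy lookup table standing in for a real dictionary query (assumption).
toy_lemmas <- c(saw = "see", is = "be", feeling = "feel")
toy_lookup <- function(word) {
  if (word %in% names(toy_lemmas)) toy_lemmas[[word]] else word
}

a <- c("He saw a see-saw on a sea shore", "she is feeling cold")
result <- lemmatize_sentences(a, toy_lookup)
```

With the wordnet package, toy_lookup could presumably be swapped for a function that wraps the getTermFilter, getIndexTerms and getLemma calls from the question for a single word, falling back to the word itself when nothing is found.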
Source: https://stackoverflow.com/questions/14952215/wordnet-lemmatizer-for-r