Question
I'm doing sentiment analysis on Twitter data, but my tweets are in Spanish, so I can't use tidytext to classify the words. Does anyone know if there is a similar package for Spanish?
Answer 1:
Unfortunately, there are not many good open source sentiment lexicons for non-English languages right now. You can request the NRC lexicon in other languages from its authors. It is translated with Google Translate, which of course adds uncertainty but has been shown to be mostly OK overall. The authors say they give it away for research purposes but charge for commercial use.
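Once you have a Spanish lexicon in hand, the tidytext workflow itself should carry over, since tokenizing and joining do not depend on the language. Here is a minimal sketch of that idea. The tweets and the four-word `lexicon_es` table are made up for illustration, and the `word`/`sentiment` column names are an assumption about how you would load whatever lexicon you obtain.

```r
library(dplyr)
library(tidytext)

# Made-up example tweets
tweets <- tibble(
  id = 1:2,
  text = c("Me encanta este lugar, es excelente",
           "El servicio fue terrible y lento")
)

# Hypothetical toy lexicon; replace with the real Spanish lexicon you obtain
lexicon_es <- tibble(
  word = c("encanta", "excelente", "terrible", "lento"),
  sentiment = c("positive", "positive", "negative", "negative")
)

scored <- tweets %>%
  unnest_tokens(word, text) %>%          # one lowercased word per row
  inner_join(lexicon_es, by = "word") %>% # keep only words found in the lexicon
  count(id, sentiment)                    # matches per tweet and sentiment

scored
```

Words that are missing from the lexicon are simply dropped by the join, so coverage depends entirely on the lexicon you plug in.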
Answer 2:
I ran into the same issue with non-English text mining. I found udpipe, an R package developed by BNOSAC. It is a Natural Language Processing toolkit that provides language-agnostic tokenization, part-of-speech tagging, lemmatization, morphological feature tagging and dependency parsing of raw text. Beware that there are no sentiment tags in the package; you will need to find those elsewhere.
It supports a diverse range of non-English languages.
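A rough sketch of annotating Spanish text with udpipe could look like the following. The two tweets and the `t1`/`t2` document ids are made up, and the download step assumes internet access and that the default model for `language = "spanish"` suits your data.

```r
library(udpipe)

# Download the Spanish model once (needs internet access) and load it
model_info <- udpipe_download_model(language = "spanish")
model <- udpipe_load_model(file = model_info$file_model)

# Made-up example tweets
tweets <- c("Me encanta este lugar", "El servicio fue terrible")

# Tokenize, tag and parse; the result converts to one row per token
annotation <- udpipe_annotate(model, x = tweets, doc_id = c("t1", "t2"))
tokens <- as.data.frame(annotation)

# Token, lemma and universal part-of-speech tag for each word
tokens[, c("doc_id", "token", "lemma", "upos")]
```

The `lemma` column can be handy for matching against a sentiment lexicon, since inflected forms are reduced to their dictionary form first.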
You can find out more on their blog, on the udpipe webpage, or on GitHub.
P.S. I have no affiliation with them.
Source: https://stackoverflow.com/questions/47075188/tidytext-r-in-spanish-any-alternative