tidytext in R for Spanish text: any alternative?

Submitted by 家住魔仙堡 on 2020-01-11 09:32:06

Question


I'm doing sentiment analysis on Twitter data, but my tweets are in Spanish, so I can't use the sentiment lexicons that ship with tidytext to classify the words. Does anyone know of a similar package for Spanish?


Answer 1:


Unfortunately, there are not many good open source sentiment lexicons for non-English languages right now. You can request the NRC lexicon in other languages from its authors. Those versions are produced with Google Translate, which of course adds uncertainty but has been shown to be mostly OK overall. The authors say they give the lexicon away for research purposes but charge for commercial use.
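Once you have a Spanish lexicon, the usual tidytext workflow should carry over, since tokenizing and joining do not depend on the language. Here is a minimal sketch, assuming the lexicon is a data frame with `word` and `sentiment` columns. The four-word `lexicon_es` table and the two example tweets are made up for illustration and are not taken from NRC or any real lexicon.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical toy lexicon; in practice you would read the real one from a file.
lexicon_es <- tribble(
  ~word,       ~sentiment,
  "bueno",     "positive",
  "excelente", "positive",
  "malo",      "negative",
  "terrible",  "negative"
)

# Made-up example tweets.
tweets <- tibble(
  id   = 1:2,
  text = c("El servicio fue excelente y muy bueno",
           "Un día terrible, todo malo")
)

sentiment_counts <- tweets %>%
  unnest_tokens(word, text) %>%            # one lowercase token per row
  inner_join(lexicon_es, by = "word") %>%  # keep only words found in the lexicon
  count(id, sentiment)                     # matched words per tweet and sentiment

sentiment_counts
```

Words missing from the lexicon are silently dropped by the inner join, so how much of your text gets scored depends entirely on the lexicon you plug in.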




Answer 2:


I ran into the same issue with non-English text mining. I found udpipe, an R package developed by BNOSAC. It is a natural language processing toolkit that provides language-agnostic tokenization, part-of-speech tagging, lemmatization, morphological feature tagging and dependency parsing of raw text. Beware that the package contains no sentiment tags; you will need to find those elsewhere.

It supports a diverse range of non-English languages.
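For Spanish, annotating a tweet could look roughly like the sketch below. It assumes internet access, because `udpipe_download_model()` fetches a pretrained model into the working directory, and the example sentence is made up.

```r
library(udpipe)

# Download and load a pretrained Spanish model (needs internet access).
model_info <- udpipe_download_model(language = "spanish")
ud_model   <- udpipe_load_model(file = model_info$file_model)

# Made-up example text; names are used as document ids.
tweets <- c(t1 = "Me encantan las películas de este director.")

# Tokenize, tag and parse, then flatten the result to one row per token.
annotation <- udpipe_annotate(ud_model, x = tweets, doc_id = names(tweets))
tokens     <- as.data.frame(annotation)

tokens[, c("doc_id", "token", "lemma", "upos")]
```

The `lemma` column is probably the most useful one for sentiment work: matching lemmas rather than raw tokens against a lexicon you obtained elsewhere should catch more inflected Spanish forms.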

You can find out more on their blog, on the udpipe web page, or on GitHub.

P.S. I have no affiliation with them.



Source: https://stackoverflow.com/questions/47075188/tidytext-r-in-spanish-any-alternative
