Wordnet (Word Sense Annotated) Corpus

爱⌒轻易说出口 submitted on 2019-12-09 13:26:28

Question


I've been using many different corpora for natural language processing, and I'm looking for a corpus that has been annotated with WordNet word senses.

I understand that a large corpus with this information probably does not exist, since it would have to be annotated manually, but there must be something to start from.

Also, if no such corpus exists, is there at least a sense-annotated n-gram database (giving the percentage of the time a word is used in each of its definitions, or a numerical count for each WordNet definition reflecting how common that word sense is)?


Answer 1:


Three prominent corpora annotated with WordNet senses:

  • MASC
  • WordNet gloss
  • SemCor
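Once tokens from a sense-annotated corpus such as SemCor have been read into (lemma, sense) pairs, the per-sense counts and percentages asked about in the question come down to simple counting. The following is only a minimal sketch under that assumption: the function name `sense_distribution`, the sense labels, and the sample pairs are hypothetical placeholders, not real corpus data or part of any library.

```python
from collections import Counter, defaultdict


def sense_distribution(tagged_tokens):
    """Map each lemma to {sense: (count, share)} from (lemma, sense) pairs.

    Lemmas are lowercased so that "Bank" and "bank" are counted together.
    """
    counts = defaultdict(Counter)
    for lemma, sense in tagged_tokens:
        counts[lemma.lower()][sense] += 1

    result = {}
    for lemma, senses in counts.items():
        total = sum(senses.values())
        # most_common() orders senses from most to least frequent
        result[lemma] = {s: (n, n / total) for s, n in senses.most_common()}
    return result


# Hypothetical sample pairs, NOT taken from any real corpus
sample = [
    ("bank", "bank.n.01"),
    ("bank", "bank.n.02"),
    ("bank", "bank.n.02"),
    ("Bank", "bank.n.02"),
    ("run", "run.v.01"),
]

dist = sense_distribution(sample)
for lemma, senses in dist.items():
    for sense, (n, share) in senses.items():
        print(f"{lemma}\t{sense}\t{n}\t{share:.0%}")
```

With a real corpus, the sample list would be replaced by pairs extracted from the annotated files. NLTK, for instance, ships a SemCor reader as `nltk.corpus.semcor`, though turning its output into such pairs is left out here.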



Answer 2:


Some of the SENSEVAL (now SEMEVAL) data is annotated with WordNet senses.




Answer 3:


You can use senseval2 and also senseval3; for Java, there is a SemCor format and the jSemcor API. These two corpora are used for word sense disambiguation.



Source: https://stackoverflow.com/questions/8822746/wordnet-word-sense-annotated-corpus
