Some NLP stuff to do with grammar, tagging, stemming, and word sense disambiguation in Python

旧巷少年郎 2021-02-03 10:59

Background (TLDR; provided for the sake of completion)

Seeking advice on an optimal solution to an odd requirement. I'm a (literature) student in my

1 answer
  • 2021-02-03 11:09

    I think an n-gram language model, as suggested in the comment above, fits your requirements better than parsing and tagging. Unless modified, parsers and taggers will suffer from the missing right context of the target word (i.e., the rest of the sentence is not yet available at query time). Language models, on the other hand, handle the past (left context) efficiently, especially for windows of up to 5 words. The drawback of n-grams is that they do not model long-distance dependencies (those spanning more than n words).
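To make the left-context idea concrete, here is a minimal sketch of a count-based n-gram model in plain Python. It is only an illustration, not NLTK's implementation: the function names, the toy corpus, the `<s>` padding token, and the add-one smoothing are all assumptions chosen to keep the example short.

```python
from collections import Counter, defaultdict

def train_ngram_counts(sentences, n=3):
    """Count how often each word follows each (n-1)-word left context."""
    counts = defaultdict(Counter)
    vocab = set()
    for sentence in sentences:
        # Pad the start so the first words also have a full-length context.
        tokens = ["<s>"] * (n - 1) + sentence.lower().split()
        vocab.update(tokens[n - 1:])
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            counts[context][tokens[i]] += 1
    return counts, vocab

def next_word_probability(counts, vocab, left_context, word, n=3):
    """Estimate P(word | last n-1 words of left_context) with add-one smoothing."""
    padded = ["<s>"] * (n - 1) + [w.lower() for w in left_context]
    # Only the last n-1 words are used; anything further left is ignored,
    # which is exactly the long-distance limitation of n-grams.
    context = tuple(padded[len(padded) - (n - 1):])
    seen = counts.get(context, Counter())
    return (seen[word] + 1) / (sum(seen.values()) + len(vocab))

# Toy corpus (assumption); a real model needs far more text.
corpus = [
    "she walks to school",
    "she walks home",
    "they walk to school",
    "he walks to work",
]
counts, vocab = train_ngram_counts(corpus)
p_walks = next_word_probability(counts, vocab, ["she"], "walks")
p_walk = next_word_probability(counts, vocab, ["she"], "walk")
print(p_walks > p_walk)
```

Note that no right context is needed: the query uses only the words typed so far.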

    NLTK has a language model: http://nltk.googlecode.com/svn/trunk/doc/api/nltk.model.ngram-pysrc.html . A tag lexicon may help you smooth the model more.

    The steps as I see them:

    1. Get a set of words from the users.
    2. Create a larger set of all possible inflections of the words.
    3. Ask the model which inflected word is most probable.
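The three steps above could be sketched roughly as follows. Everything here is hypothetical scaffolding: the hand-written `INFLECTIONS` table stands in for a real morphological generator, the bigram scorer stands in for whatever language model you end up training, and the sketch assumes the user's words arrive as base forms.

```python
from collections import Counter

# Hypothetical, hand-written inflection table; a real system would need a
# morphological generator or a lexicon instead.
INFLECTIONS = {
    "walk": ["walk", "walks", "walked", "walking"],
    "be": ["be", "am", "is", "are", "was", "were", "been", "being"],
}

def build_bigram_scorer(sentences):
    """Return score(prev, word): an add-one smoothed bigram probability."""
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            unigrams[prev] += 1

    def score(prev, word):
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

    return score

def best_inflection(left_context, lemma, score):
    """Pick the inflection of `lemma` the model finds most probable next."""
    # Step 2: expand the word into candidates (fall back to the word itself).
    candidates = INFLECTIONS.get(lemma, [lemma])
    # Step 3: ask the model which candidate best follows the left context.
    prev = left_context[-1].lower() if left_context else "<s>"
    return max(candidates, key=lambda w: score(prev, w))

# Toy training data (assumption).
corpus = ["she walks to school", "they walk home", "she is walking", "they are here"]
score = build_bigram_scorer(corpus)

# Step 1: the words supplied by the user, here as uninflected base forms.
print(best_inflection(["she"], "walk", score))
print(best_inflection(["they"], "be", score))
```

Passing the scorer in as a plain function keeps the ranking step independent of the model, so a larger n-gram model can be swapped in without touching `best_inflection`.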
