How to invert the lemmatization process given a lemma and a token?

Question

Generally, in natural language processing, we want to get the lemma of a token.

For example, we can map 'eaten' to 'eat' using WordNet lemmatization.

Are there any tools in Python that can do the inverse, turning a lemma into a particular inflected form?

For example, map 'go' to 'gone' given the target form 'eaten'.

PS: Someone mentioned that we have to store such mappings; see "How to un-stem a word in Python?"

Answer 1:

Turning a base form such as a lemma into a situation-appropriate form is called realization (or "surface realization"). Example from Wikipedia:

NPPhraseSpec subject = nlgFactory.createNounPhrase("the", "woman");
subject.setPlural(true);
SPhraseSpec sentence = nlgFactory.createClause(subject, "smoke");
sentence.setFeature(Feature.NEGATED, true);
System.out.println(realiser.realiseSentence(sentence));
// output: "The women do not smoke."

Realization libraries are used less often than lemmatizers, so you have fewer options and are less likely to find a well-developed one. The Wikipedia example is in Java because the most popular library supporting this, SimpleNLG, is a Java library.

A quick search found pynlg, though it doesn't seem to be actively maintained. Alternatively, you can use SimpleNLG via an HTTP JSON interface provided by the Python library nlgserv.
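
If a full realization library is more than you need, the stored-mapping approach mentioned in the question's PS can be sketched in plain Python. This is only an illustration under stated assumptions: the table is tiny and hand-written, the tags follow the Penn Treebank convention, and the names FORMS, TAGS_BY_FORM, and inflect_like are hypothetical, not part of any library. A usable table would have to be built from a tagged corpus or a lexicon.

```python
from typing import Optional

# Hypothetical hand-built table: (lemma, Penn Treebank tag) -> inflected form.
# A real table would be harvested from a tagged corpus or a lexicon.
FORMS = {
    ("eat", "VBD"): "ate",
    ("eat", "VBN"): "eaten",
    ("eat", "VBG"): "eating",
    ("go", "VBD"): "went",
    ("go", "VBN"): "gone",
    ("go", "VBG"): "going",
}

# Reverse index: inflected form -> set of tags that form can carry.
TAGS_BY_FORM = {}
for (_lemma, _tag), _form in FORMS.items():
    TAGS_BY_FORM.setdefault(_form, set()).add(_tag)


def inflect_like(lemma: str, target_token: str) -> Optional[str]:
    """Return the form of `lemma` that shares a tag with `target_token`.

    Returns None when the target token or the lemma is not in the table.
    """
    for tag in sorted(TAGS_BY_FORM.get(target_token, ())):
        form = FORMS.get((lemma, tag))
        if form is not None:
            return form
    return None


print(inflect_like("go", "eaten"))  # gone
print(inflect_like("go", "ate"))    # went
```

The lookup returns None for anything outside the table, so coverage depends entirely on how the mappings were collected. Ambiguous forms (a token that carries more than one tag) are resolved here by simply taking the first tag in sorted order, which a real system would replace with a tagger's decision.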

Source: https://stackoverflow.com/questions/45590278/how-to-inverse-lemmatization-process-given-a-lemma-and-a-token

Tags

python

nlp

nltk

lemmatization
