lemmatization

How to invert the lemmatization process given a lemma and a token?

人走茶凉 · Submitted on 2020-08-06 12:45:45
Question: Generally, in natural language processing, we want to get the lemma of a token. For example, we can map 'eaten' to 'eat' using WordNet lemmatization. Are there any tools in Python that can invert a lemma back into a given form? For example, mapping 'go' to 'gone' given the target form 'eaten'.

PS: Someone mentioned we have to store such mappings — see "How to un-stem a word in Python?"

Answer 1: Turning a base form such as a lemma into a situation-appropriate form is called realization (or "surface realization").
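Absent a full realizer, the stored-mappings idea from the PS can be sketched as a reverse lookup keyed by (lemma, morphological tag): find the tag carried by the example token, then inflect the new lemma with that same tag. The tiny FORMS table below is purely illustrative, not a real resource.

```python
# Map (lemma, Penn-style tag) -> surface form. A real system would build
# this table from a morphological lexicon instead of hardcoding it.
FORMS = {
    ("eat", "VBN"): "eaten",
    ("eat", "VBD"): "ate",
    ("go", "VBN"): "gone",
    ("go", "VBD"): "went",
}

# Reverse index: which tag does a given surface form carry?
TAG_OF = {form: tag for (lemma, tag), form in FORMS.items()}

def realize(lemma, example_token):
    """Inflect `lemma` into the same morphological form as `example_token`."""
    tag = TAG_OF[example_token]   # e.g. 'eaten' -> 'VBN'
    return FORMS[(lemma, tag)]    # e.g. ('go', 'VBN') -> 'gone'

print(realize("go", "eaten"))  # -> gone
```

For broader coverage, a dedicated inflection library would replace the hand-built table, but the lookup structure stays the same.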

Finding the POS of the root of a noun_chunk with spaCy

人盡茶涼 · Submitted on 2020-06-27 06:06:29
Question: When using spaCy you can easily loop over the noun_chunks of a text as follows:

    S = 'This is an example sentence that should include several parts and also make clear that studying Natural language Processing is not difficult'
    nlp = spacy.load('en_core_web_sm')
    doc = nlp(S)
    [chunk.text for chunk in doc.noun_chunks]
    # = ['an example sentence', 'several parts', 'Natural language Processing']

You can also get the "root" of the noun chunk:

    [chunk.root.text for chunk in doc.noun_chunks]
    # = [

English lemmatizer databases?

二次信任 · Submitted on 2020-01-31 18:07:10
Question: Do you know any lemmatizer database big enough to return the correct results for the following sample words?

    geese: goose
    plantes: // not found

WordNet's morphological analyzer is not sufficient, since it gives the following incorrect results:

    geese: // not found
    plantes: plant

Answer 1: MorphAdorner seems to be better at this, but it still finds an incorrect result for "plantes":

    plantes: plante
    geese: goose

Maybe you'd like to use MorphAdorner to do the lemmatization, and then check its results against
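The answer's suggestion of checking a lemmatizer's output against a second resource can be sketched as a validation step: accept a candidate lemma only if it appears in a trusted word list, and report "not found" otherwise. Here KNOWN_LEMMAS stands in for a real dictionary (e.g. WordNet's index files), and naive_lemmatize is a hypothetical stand-in for MorphAdorner.

```python
# Trusted word list (stand-in for a real lexicon such as WordNet's index).
KNOWN_LEMMAS = {"goose", "plant", "run"}

# Irregular forms handled by an explicit table, like MorphAdorner's exception lists.
IRREGULAR = {"geese": "goose"}

def naive_lemmatize(word):
    """Toy lemmatizer: irregulars first, then strip a plural 's'."""
    if word in IRREGULAR:
        return IRREGULAR[word]
    return word[:-1] if word.endswith("s") else word

def checked_lemma(word):
    """Accept the candidate lemma only if the word list vouches for it."""
    cand = naive_lemmatize(word)
    return cand if cand in KNOWN_LEMMAS else None  # None means "not found"

print(checked_lemma("geese"))    # -> goose
print(checked_lemma("plantes"))  # -> None
```

With this filter, the "plante" output MorphAdorner produces for "plantes" would be rejected rather than silently returned.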

Python NLTK Lemmatization of the word 'further' with wordnet

旧时模样 · Submitted on 2020-01-09 05:33:50
Question: I'm working on a lemmatizer using Python, NLTK and the WordNetLemmatizer. Here is a random test that outputs what I was expecting:

    from nltk.stem import WordNetLemmatizer
    from nltk.corpus import wordnet
    lem = WordNetLemmatizer()
    lem.lemmatize('worse', pos=wordnet.ADJ)  # here, we are specifying that 'worse' is an adjective
    # Output: 'bad'
    lem.lemmatize('worse', pos=wordnet.ADV)  # here, we are specifying that 'worse' is an adverb
    # Output: 'worse'

Well, everything here is fine. The behaviour is the
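The asymmetry in the snippet above comes from how WordNet handles irregular forms: it keeps a separate exception list per part of speech, so a word can have an irregular entry under one POS and not another. A self-contained sketch of that mechanism (the tables here are tiny illustrative excerpts, not WordNet's actual data files):

```python
# Per-POS exception tables, mimicking WordNet's adj.exc / adv.exc lists.
# Illustrative entries only; real WordNet data is much larger.
ADJ_EXC = {"worse": "bad", "better": "good"}
ADV_EXC = {}  # no adverb entry for 'worse' in this sketch

def lemmatize(word, pos):
    """Look the word up in the exception list for its POS; fall back to itself."""
    exc = ADJ_EXC if pos == "a" else ADV_EXC
    return exc.get(word, word)

print(lemmatize("worse", "a"))  # -> bad
print(lemmatize("worse", "r"))  # -> worse
```

This explains why the same string lemmatizes differently depending on the pos argument: the lookup consults a different table for each part of speech.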