I am new to spaCy and I want to use its lemmatizer, but I don't know how to use it. I would like to pass in a string of words and get back the string with the base form of the words.
If you want to use just the Lemmatizer, you can do that in the following way:
from spacy.lemmatizer import Lemmatizer
from spacy.lang.en import LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES
lemmatizer = Lemmatizer(LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES)
lemmas = lemmatizer(u'ducks', u'NOUN')
print(lemmas)
Output
['duck']
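To give a rough idea of how the three tables passed to the Lemmatizer can work together, here is a simplified toy sketch. It is not spaCy's actual implementation: the function name `toy_lemmatize` and the tiny `NOUN_INDEX`, `NOUN_EXC` and `NOUN_RULES` tables are made up for illustration. The assumption is that the index holds known base forms, the exceptions map irregular forms to their lemmas, and the rules are suffix rewrites.

```python
def toy_lemmatize(word, index, exceptions, rules):
    """Return candidate lemmas for `word` using toy index/exception/rule tables."""
    word = word.lower()
    # 1. Irregular forms are looked up directly in the exceptions table.
    if word in exceptions:
        return list(exceptions[word])
    # 2. Otherwise apply each suffix rule and keep results found in the index.
    candidates = []
    for old_suffix, new_suffix in rules:
        if word.endswith(old_suffix):
            candidate = word[: len(word) - len(old_suffix)] + new_suffix
            if candidate in index and candidate not in candidates:
                candidates.append(candidate)
    # 3. If nothing matched, fall back to the word itself.
    return candidates or [word]


# Hypothetical toy tables for nouns only (not spaCy's real data).
NOUN_INDEX = {"duck", "box"}
NOUN_EXC = {"geese": ["goose"]}
NOUN_RULES = [("s", ""), ("es", "")]

print(toy_lemmatize("ducks", NOUN_INDEX, NOUN_EXC, NOUN_RULES))  # ['duck']
```

The real tables are much larger and are keyed by part of speech, which is why the lemmatizer call above takes a POS tag as its second argument.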
Update
Since spaCy version 2.2, LEMMA_INDEX, LEMMA_EXC, and LEMMA_RULES have been bundled into a Lookups object:
import spacy
nlp = spacy.load('en')
nlp.vocab.lookups
>>>
nlp.vocab.lookups.tables
>>> ['lemma_lookup', 'lemma_rules', 'lemma_index', 'lemma_exc']
You can still use the lemmatizer directly with a word and a POS (part of speech) tag:
from spacy.lemmatizer import Lemmatizer, ADJ, NOUN, VERB
lemmatizer = nlp.vocab.morphology.lemmatizer
lemmatizer('ducks', NOUN)
>>> ['duck']
You can pass the POS tag as the imported constant, as above, or as a string:
lemmatizer('ducks', 'NOUN')
>>> ['duck']
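If the goal is to turn a whole string into a string of base forms, as the question asks, one option is to run the text through the pipeline and read each token's `lemma_` attribute instead of calling the lemmatizer word by word. The sketch below assumes an `nlp` object loaded as above; `lemmatize_text` and `demo` are hypothetical helper names, not part of spaCy.

```python
def lemmatize_text(nlp, text):
    """Return `text` with every token replaced by its lemma, joined by spaces."""
    doc = nlp(text)  # the pipeline tokenizes, tags and lemmatizes the text
    return " ".join(token.lemma_ for token in doc)


def demo():
    # Assumes spaCy and an English model are installed and that the 'en'
    # shortcut used above resolves to that model.
    import spacy

    nlp = spacy.load('en')
    print(lemmatize_text(nlp, 'The ducks were swimming in the pond'))
```

Calling `demo()` prints the lemmatized sentence. The exact lemmas depend on the spaCy version and model, since the pipeline picks each word's POS tag from context rather than taking it as an argument.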