Question
I am trying to determine whether a word is singular or plural using nltk pos_tag, but the results are not accurate.
So I need another way to find out whether a word is in singular or plural form. Moreover, I need to do it without using any Python package.
Answer 1:
In English, every noun has a root form (lemma), and the lemma is singular by default.
Assuming that you have only nouns in your list, you can try this:
from nltk.stem import WordNetLemmatizer

wnl = WordNetLemmatizer()

def isplural(word):
    # Lemmatize as a noun; a plural form differs from its lemma.
    lemma = wnl.lemmatize(word, 'n')
    # Compare string values with !=, not identity with "is not".
    plural = word != lemma
    return plural, lemma

nounls = ['geese', 'mice', 'bars', 'foos', 'foo',
          'families', 'family', 'dog', 'dogs']

for nn in nounls:
    isp, lemma = isplural(nn)
    print(nn, lemma, isp)
You will have a problem when the word is not in WordNet: the lemmatizer returns it unchanged, so it is reported as singular. In that case you have to use a more sophisticated classifier or a finite state machine outside of NLTK.
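Since the question asks for something that works without any package, here is a minimal rule-based sketch in plain Python. It is not what either answer uses: the names IRREGULAR, SINGULAR_S, singularize and is_plural are made up for illustration, and the suffix rules are a rough heuristic that will misjudge many English nouns (for example "buses" or "movies").

```python
# Hypothetical, hand-written tables; extend them for your own vocabulary.
IRREGULAR = {"geese": "goose", "mice": "mouse", "men": "man",
             "women": "woman", "children": "child",
             "feet": "foot", "teeth": "tooth"}
# Words ending in "s" that are singular or have the same singular and plural.
SINGULAR_S = {"bus", "news", "series", "species"}

def singularize(word):
    """Return a guessed singular form using simple suffix rules."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    # Endings such as "glass", "virus", "analysis" are treated as singular.
    if w in SINGULAR_S or w.endswith(("ss", "us", "is")):
        return w
    if w.endswith("ies") and len(w) > 4:        # families -> family
        return w[:-3] + "y"
    if w.endswith(("ches", "shes", "xes", "sses")):  # boxes -> box
        return w[:-2]
    if w.endswith("s") and len(w) > 2:          # dogs -> dog
        return w[:-1]
    return w

def is_plural(word):
    """Return (is_plural, guessed_singular) for a noun."""
    singular = singularize(word)
    return singular != word.lower(), singular

for noun in ["geese", "families", "family", "dog", "dogs", "foos"]:
    print(noun, *is_plural(noun))
```

Like the pattern-en approach, this treats a made-up word such as "foos" as plural, because it only looks at the spelling and never checks a dictionary.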
Answer 2:
Assuming you want an English solution, you can do something similar to 2er0's solution a bit more directly with pattern-en:
from pattern.en import singularize

def isplural(pluralForm):
    # A word is plural if singularizing it changes it.
    singularForm = singularize(pluralForm)
    # Compare string values with !=, not identity with "is not".
    plural = pluralForm != singularForm
    return plural, singularForm

nounls = ['geese', 'mice', 'bars', 'foos', 'foo',
          'families', 'family', 'dog', 'dogs']

for pluralForm in nounls:
    isp, singularForm = isplural(pluralForm)
    print(pluralForm, singularForm, isp)
which outputs:
geese goose True
mice mouse True
bars bar True
foos foo True
foo foo False
families family True
family family False
dog dog False
dogs dog True
The only difference in output between 2er0's solution and this one is the line
foos foo True
His solution outputs False there since, as he pointed out, foos is not in WordNet (and is not an English word at all).
Source: https://stackoverflow.com/questions/18911589/how-to-test-whether-a-word-is-in-singular-form-or-not-in-python