comparing synonyms NLTK [duplicate]

Submitted by 为君一笑 on 2019-11-28 09:30:21

Question


I've run into a strange problem and hope you can help me.

from nltk.corpus import wordnet as wn
for p in wn.synsets('change'):
    print(p)

Getting:

Synset('change.n.01')
Synset('change.n.02')
Synset('change.n.03')
Synset('change.n.04')
Synset('change.n.05')
Synset('change.n.06')
Synset('change.n.07')
Synset('change.n.08')
Synset('change.n.09')
Synset('variety.n.06')
Synset('change.v.01')
Synset('change.v.02')
Synset('change.v.03')
Synset('switch.v.03')
Synset('change.v.05')
Synset('change.v.06')
Synset('exchange.v.01')
Synset('transfer.v.06')
Synset('deepen.v.04')
Synset('change.v.10')

For example, suppose I have a string:

a = 'transfer'

I'd like to identify all the synonyms of the word 'change' and find out whether, for example, 'transfer' is one of them. How can I ask my program: "Is 'transfer' one of the synonyms of 'change'?"


Answer 1:


First, WordNet indexes concepts (aka synsets) and links each concept to the words that can express it. The following code shows the concepts linked to the word 'change':

>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('change')
[Synset('change.n.01'), Synset('change.n.02'), Synset('change.n.03'), Synset('change.n.04'), Synset('change.n.05'), Synset('change.n.06'), Synset('change.n.07'), Synset('change.n.08'), Synset('change.n.09'), Synset('variety.n.06'), Synset('change.v.01'), Synset('change.v.02'), Synset('change.v.03'), Synset('switch.v.03'), Synset('change.v.05'), Synset('change.v.06'), Synset('exchange.v.01'), Synset('transfer.v.06'), Synset('deepen.v.04'), Synset('change.v.10')]

A synset has several properties:

  • ID number
  • Part-of-Speech label
  • definition
  • lemma names, i.e. the possible words that can be used to instantiate the concept
  • links to other synsets through -nymy relations (e.g. hypernym, hyponym, meronym)

Here's how to access these properties in NLTK:

>>> wn.synsets('change')[0]
Synset('change.n.01')
>>> wn.synsets('change')[0].offset()
7296428
>>> wn.synsets('change')[0].pos()
u'n'
>>> wn.synsets('change')[0].definition()
u'an event that occurs when something passes from one state or phase to another'
>>> wn.synsets('change')[0].lemma_names()
[u'change', u'alteration', u'modification']
>>> wn.synsets('change')[0].hypernyms()
[Synset('happening.n.01')]

But a synset doesn't necessarily have synonymy relations. If we define synonyms as words that have similar meanings, it is the words (i.e. lemmas) that stand in synonymy relations. In addition, the context in which a word is used determines whether it is a synonym of another word. A single word in isolation carries limited meaning; it is the "concept" that contains the meaning, and words instantiate it. At least that's the typical theory of semantics; see chapter 2 in http://goo.gl/ZHzlNF

So when you want to ask whether 'transfer' is a synonym of 'change', you first have to:

  • decide which concept 'transfer' refers to, given the context in which it is used (search for "Word Sense Disambiguation")
  • decide which concept of 'change' you are referring to.

Then comparison of meaning is possible.
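
As a rough sketch of that comparison (not part of the original answer): two words can only be synonyms in a sense that both of them share, so one way to start is to list the synsets that appear for both words and then decide from context whether one of those senses is the one intended. The helper name shared_senses and its lookup parameter are hypothetical; with NLTK, lookup would be wn.synsets.

```python
def shared_senses(word_a, word_b, lookup):
    """Return the senses that both words can express, in word_a's order.

    `lookup` is an assumed callable mapping a word to a list of hashable
    sense objects; with NLTK installed it would be `wn.synsets`.
    """
    senses_b = set(lookup(word_b))
    # Keep only the senses of word_a that word_b also has.
    return [sense for sense in lookup(word_a) if sense in senses_b]


# Usage with NLTK (requires the WordNet corpus to be downloaded):
#   from nltk.corpus import wordnet as wn
#   shared_senses('change', 'transfer', wn.synsets)
```

Judging by the output shown in the question, the only sense the two words share should be transfer.v.06, so 'transfer' counts as a synonym of 'change' only when that particular sense is meant.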

See also:

  • All synonyms for word in python?
  • How to get synonyms from nltk WordNet Python



Answer 2:


First get the lemmas of every synset, then collect their names and test membership with the in operator (a is the string 'transfer' from the question):

>>> a in [j.name() for i in wn.synsets('change') for j in i.lemmas()]
True

>>> [j.name() for i in wn.synsets('change') for j in i.lemmas()]
[u'change', u'alteration', u'modification', u'change', u'change', u'change', u'change', u'change', u'change', u'change', u'change', u'variety', u'change', u'change', u'alter', u'modify', u'change', u'change', u'alter', u'vary', u'switch', u'shift', u'change', u'change', u'change', u'exchange', u'commute', u'convert', u'exchange', u'change', u'interchange', u'transfer', u'change', u'deepen', u'change', u'change']
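
The list above contains many repeats of 'change'. A possible variation, sketched here under the assumption that only membership matters, collects the names into a set and normalises the candidate first. The function names are made up for illustration; the only NLTK call relied on is Synset.lemma_names(). WordNet writes multi-word lemmas with underscores, which is why spaces are replaced.

```python
def lemma_vocabulary(synsets):
    """Collect the distinct lemma names of all given synsets, lower-cased."""
    return {name.lower() for synset in synsets for name in synset.lemma_names()}


def is_listed_synonym(candidate, synsets):
    """Check whether `candidate` is a lemma name of any of the synsets."""
    # WordNet stores multi-word lemmas with underscores, e.g. 'hot_dog'.
    key = candidate.strip().lower().replace(" ", "_")
    return key in lemma_vocabulary(synsets)


# Usage with NLTK (requires the WordNet corpus to be downloaded):
#   from nltk.corpus import wordnet as wn
#   is_listed_synonym('transfer', wn.synsets('change'))
```

Note that this still ignores which sense of 'change' is meant, so it answers "can 'transfer' ever stand in for 'change'?" and not "is it a synonym in this sentence?".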



Answer 3:


wn.synsets gives you a list of meanings (senses); each meaning has its own list of words.

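from nltk.corpus import wordnet as wn
for sense in wn.synsets('change'):
    # lemma_names is a method in NLTK 3, so it has to be called
    if "transfer" in sense.lemma_names():
        print("'transfer' is a synonym of 'change'")
        break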



Answer 4:


Those are different senses of the word. You can obtain the synonyms of each sense with wn.synset('xxx').lemma_names() and then check whether your word is among them.
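
To make the per-sense view concrete, here is a minimal sketch that groups the alternative words by sense. The helper synonyms_by_sense is a made-up name; it assumes each synset offers name() and lemma_names(), as NLTK 3 synsets do.

```python
def synonyms_by_sense(word, synsets):
    """Map each sense name to the lemma names it offers besides `word`."""
    grouped = {}
    for synset in synsets:
        # Drop the query word itself so only the alternatives remain.
        others = [n for n in synset.lemma_names() if n.lower() != word.lower()]
        grouped[synset.name()] = others
    return grouped


# Usage with NLTK (requires the WordNet corpus to be downloaded):
#   from nltk.corpus import wordnet as wn
#   for sense, words in synonyms_by_sense('change', wn.synsets('change')).items():
#       print(sense, words)
```

Senses whose only lemma is 'change' itself end up with an empty list, which makes it easy to see which senses actually contribute synonyms.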



Source: https://stackoverflow.com/questions/29476963/comparing-synonyms-nltk
