Methods for automated synonym detection

后端 未结 4 1607
忘了有多久
忘了有多久 2021-02-03 14:04

I am currently working on a neural network based approach to short document classification, and since the corpuses I am working with are usually around ten words, the standard s

4条回答
  •  死守一世寂寞
    2021-02-03 14:33

    The code here (http://ronan.collobert.com/senna/) implements a neural network to perform a variety on NLP tasks. The page also links to a paper that describes one of the most successful approaches so far of applying convolutional neural nets to NLP tasks.

    It is possible to modify their code to use the trained networks that they provide to classify sentences, but this may take more work than you were hoping for, and it can be tricky to correctly train neural networks.

    I had a lot of success using a similar technique to classify biological sequences, but, in contrast to English language sentences, my sequences had only 20 possible symbols per position rather than 50-100k.

    One interesting feature of their network that may be useful to you is their word embeddings. Word embeddings map individual words (each can be considered an indicator vector of length 100k) to real valued vectors of length 50. Euclidean distance between the embedded vectors should reflect semantic distance between words, so this could help you detect synonyms.

    For a simpler approach WordNet (http://wordnet.princeton.edu/) provides lists of synonyms, but I have never used this myself.

提交回复
热议问题