How to cluster similar sentences using BERT

难免孤独 2021-02-05 19:18

For ELMo, fastText and Word2Vec, I'm averaging the word embeddings within a sentence and using HDBSCAN/KMeans clustering to group similar sentences.

A good example of t

4 Answers

北恋 (original poster)

2021-02-05 20:06

You can use Sentence Transformers to generate the sentence embeddings. These embeddings are much more meaningful than the ones obtained from bert-as-service, because the models have been fine-tuned so that semantically similar sentences get a higher similarity score. If the number of sentences to cluster is in the millions or more, you can use a FAISS-based clustering algorithm, since a vanilla k-means-style implementation becomes too slow at that scale.
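For a small corpus, the approach could look roughly like the sketch below. It is only an illustration, not a tuned pipeline: the model name `all-MiniLM-L6-v2`, the helper names (`embed_sentences`, `cluster_embeddings`, `cluster_sentences`) and the use of scikit-learn's `KMeans` are my own assumptions, and you still have to pick the number of clusters yourself.

```python
import numpy as np
from sklearn.cluster import KMeans


def embed_sentences(sentences, model_name="all-MiniLM-L6-v2"):
    """Encode sentences into unit-length vectors with Sentence Transformers."""
    # Imported lazily so the clustering helper below works without the package.
    from sentence_transformers import SentenceTransformer

    # The model name is an assumption; any Sentence Transformers model should do.
    model = SentenceTransformer(model_name)
    # Unit-length vectors make Euclidean k-means behave like cosine clustering.
    return model.encode(sentences, normalize_embeddings=True)


def cluster_embeddings(embeddings, n_clusters, seed=0):
    """Assign each embedding row to one of n_clusters k-means clusters."""
    vectors = np.asarray(embeddings, dtype=np.float32)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(vectors)


def cluster_sentences(sentences, n_clusters):
    """Return {cluster_id: [sentences]} for the given sentences."""
    labels = cluster_embeddings(embed_sentences(sentences), n_clusters)
    groups = {}
    for sentence, label in zip(sentences, labels):
        groups.setdefault(int(label), []).append(sentence)
    return groups
```

Calling `cluster_sentences(my_sentences, n_clusters=5)` downloads the model on first use and returns the sentences grouped by cluster id. For millions of sentences, the same embeddings could be passed to FAISS's k-means instead of scikit-learn's, which is the swap the answer above suggests.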
