Fast keyword extraction in Elasticsearch

Submitted by 给你一囗甜甜゛ on 2019-12-08 01:01:07

Question


I have a large database of image annotations stored in Elasticsearch, and I want to use it for keyword extraction. The input is text (typically a newspaper article). My basic idea is to go through each term in the article and use Elasticsearch to find out how frequent that term is in the image annotations, then output the article's terms that are not frequent (in order to prefer names of people or places over common English words).

I don't need anything very sophisticated, since these keywords are used only as suggestions for user input. But I want something faster than sending N search queries (where N is the number of terms in the text) to Elasticsearch, which can be slow for large texts. Is there a robust and fast technique for keyword extraction in Elasticsearch?


Answer 1:


You can use the Elasticsearch terms aggregation for this. It returns terms as buckets, each with a document count that indicates how frequent the term is among the matching annotations, and it does so in a single request instead of N. Here is an example query in YAML:

query:
    match:
        annotation:
            query: text of your article
aggregations:
    term_frequencies:
        terms:
            field: annotation

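The same request can be built and post-processed from client code. The following is a minimal Python sketch, not part of the original answer. It makes several assumptions: the helper names `build_request` and `rarest_terms` are hypothetical, the tokenisation is a naive lowercase word split that only approximates whatever analyzer the `annotation` field uses, and the terms aggregation's `include` list of exact values is used to restrict the buckets to terms that occur in the article. On recent Elasticsearch versions a terms aggregation on an analyzed text field may also require fielddata to be enabled in the mapping.

```python
import re


def build_request(article_text, field="annotation"):
    """Build one search body that counts only the article's own terms.

    Assumption: a lowercase word split is close enough to the analyzer
    configured for `field`; adjust if the index uses stemming etc.
    """
    terms = sorted(set(re.findall(r"\w+", article_text.lower())))
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"match": {field: {"query": article_text}}},
        "aggregations": {
            "term_frequencies": {
                "terms": {
                    "field": field,
                    "include": terms,             # exact-value filter
                    "size": max(len(terms), 1),   # one bucket per term
                }
            }
        },
    }


def rarest_terms(response, limit=10):
    """Return the article terms with the lowest document counts."""
    buckets = response["aggregations"]["term_frequencies"]["buckets"]
    # Sort client-side: rarest first, ties broken alphabetically.
    ranked = sorted(buckets, key=lambda b: (b["doc_count"], b["key"]))
    return [b["key"] for b in ranked[:limit]]
```

The body returned by `build_request` can be sent to the index's `_search` endpoint with any HTTP client, and the parsed JSON response passed to `rarest_terms`. Terms that never occur in the annotations produce no bucket at all, so they are absent from the result and would need to be handled separately if they should count as keywords.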

Source: https://stackoverflow.com/questions/22171211/fast-keyword-extraction-in-elasticsearch
