Clustered scoring in Elasticsearch

Submitted by 放荡痞女 on 2019-12-13 03:48:14

Question


Let's say I have a complex query in Elasticsearch 6.2 that can return the following hits:

"hits" : [
  {
    ...
    "_score" : 100,
    "_source" : { ... }
    ...
  },
  {
    ...
    "_score" : 99,
    "_source" : { ... }
    ...
  },
  {
    ...
    "_score" : 50,
    "_source" : { ... }
    ...
  },
  {
    ...
    "_score" : 49,
    "_source" : { ... }
    ...
  }
]

Or the same query can return:

"hits" : [
  {
    ...
    "_score" : 10,
    "_source" : { ... }
    ...
  },
  {
    ...
    "_score" : 9.9,
    "_source" : { ... }
    ...
  },
  {
    ...
    "_score" : 2,
    "_source" : { ... }
    ...
  },
  {
    ...
    "_score" : 1,
    "_source" : { ... }
    ...
  }
]

As you can see, the score distribution is uneven, and the hits fall into groups with close scores. I need the result set to include only the items from the top group. I can't provide a reasonable min_score, because the absolute score values can differ greatly between query parameters. Is there any way to make Elasticsearch return the top-scoring group regardless of the actual absolute values? Thank you in advance.


Answer 1:


As far as I know, Elasticsearch does not provide a way to cut off hits based on relative score. To do so, you would need to know the maximum score in advance, and it can vary widely depending on the search query itself and on the current state of the index. One not very elegant way to achieve this is to get the maximum score from a first request that limits the result size to one, and then use a min_score computed relative to that maximum in a second request to filter the results. Alternatively, the same can be achieved by filtering the results of the regular query manually on the client side.
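Both options can be sketched roughly as follows. This is only an illustration, not a built-in Elasticsearch feature: the function names and the 0.9 ratio are assumptions made up for the example, and the code works on plain dictionaries rather than calling any client library. It assumes scores are positive and that the maximum score for the two-request variant is read from hits.max_score in the response to the first request (sent with "size": 1).

```python
from typing import Any, Dict, List


def relative_min_score(max_score: float, ratio: float = 0.9) -> float:
    """Convert a relative cut-off (a fraction of the best score) into an absolute min_score."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return max_score * ratio


def with_min_score(search_body: Dict[str, Any], max_score: float,
                   ratio: float = 0.9) -> Dict[str, Any]:
    """Two-request variant: build the body of the second request.

    max_score is expected to come from the first request's response.
    The original body is not modified; a shallow copy is returned.
    """
    body = dict(search_body)
    body["min_score"] = relative_min_score(max_score, ratio)
    return body


def top_group(hits: List[Dict[str, Any]], ratio: float = 0.9) -> List[Dict[str, Any]]:
    """Client-side variant: keep only hits scoring at least ratio * best score."""
    if not hits:
        return []
    best = max(hit["_score"] for hit in hits)
    threshold = best * ratio
    return [hit for hit in hits if hit["_score"] >= threshold]


if __name__ == "__main__":
    # Scores taken from the two result sets in the question.
    first = [{"_score": s} for s in (100, 99, 50, 49)]
    second = [{"_score": s} for s in (10, 9.9, 2, 1)]
    print([h["_score"] for h in top_group(first)])
    print([h["_score"] for h in top_group(second)])
    print(with_min_score({"query": {"match_all": {}}}, max_score=100))
```

A fixed ratio is the simplest choice; if the gap between groups is not predictable, the client-side filter could instead cut at the largest drop between consecutive scores, at the cost of fetching enough hits to see that drop.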



Source: https://stackoverflow.com/questions/52179259/clasterized-scoring-in-elasticsearch
