Different Elasticsearch results for the same query

太阳男子

I've set up Elasticsearch as a single cluster of 4 nodes. Number of shards per index: 1; number of replicas per index: 3.

When I call a simple query like the following one

3 Answers

不知归路

2021-01-11 12:59
We ran into a similar problem, and it turned out to be because Elasticsearch round-robins between the copies of each shard (the primary and its replicas) when searching. Each copy can return a slightly different _score, because deleted documents still count toward term statistics until their segments are merged, and each copy merges on its own schedule. In our case this meant similar results often placed slightly higher or lower in the result order. Combined with pagination (using from and size in the search query), the same results turned up on two separate "pages", or on none at all.

We found an Elasticsearch article on consistent scoring which explains this quite neatly, and we added a preference parameter so that a particular search always queries the same shard copies and therefore always gets the same scores:
```
http://localhost:9200/index_name/_search?q=term&preference=blablabla
```
We also thought about adding an explicit sort, but it was not needed: Elasticsearch orders results that have the same score by an internal Lucene document ID, so as long as the same shard copies are queried, results with the same scores are always returned in the same order.
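
One way to keep the preference value stable is to derive it from something that identifies the user or session. The following is only a minimal sketch in Python, not what we actually ran: the helper name search_url and the session_id value are hypothetical, and the host and index are the same placeholders as in the URL above. Hashing the identifier keeps the value short and guarantees it does not start with an underscore, which Elasticsearch reserves for built-in preference values such as _local.

```python
import hashlib
from urllib.parse import urlencode


def search_url(base: str, index: str, term: str, session_id: str) -> str:
    """Build a URI-search URL whose preference value is stable per session."""
    # Hex digest: deterministic for a given session and never starts with "_".
    preference = hashlib.sha1(session_id.encode("utf-8")).hexdigest()[:16]
    query = urlencode({"q": term, "preference": preference})
    return f"{base}/{index}/_search?{query}"


if __name__ == "__main__":
    # "user-42-session" is a hypothetical identifier; use whatever marks a session.
    print(search_url("http://localhost:9200", "index_name", "term", "user-42-session"))
```

Every request from the same session then carries the same preference string, so consecutive pages are scored by the same shard copies.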
甜味超标

2021-01-11 13:04
This is because you haven't specified a sort order or a size. Elasticsearch returns 10 hits by default, so without an explicit sort the first 10 records you get back can differ from one query to the next.

You can add sorting in the following way with curl:
```
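# Sort hits by price, ascending; "mode": "avg" averages multi-valued price fields.
# Recent Elasticsearch versions require the Content-Type header for request bodies.
curl -XPOST 'localhost:9200/_search' -H 'Content-Type: application/json' -d '{
  "query" : {
    ...
  },
  "sort" : [
    {"price" : {"order" : "asc", "mode" : "avg"}}
  ]
}'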
```
Check here for more info, especially on from and size, which are mostly used together with sort for pagination.
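
For pagination, from and size go in the same request body as sort. Here is a rough sketch in Python of how such a body could be built. It assumes a hypothetical unique keyword field named product_id as a tiebreaker, so that documents with the same price keep a fixed relative order across pages; the helper name page_body is made up, and match_all only stands in for the real query.

```python
import json


def page_body(query: dict, page: int, size: int = 10) -> dict:
    """Build a search body for a zero-based page with a deterministic sort."""
    return {
        "query": query,
        "from": page * size,  # number of hits to skip
        "size": size,         # hits per page
        "sort": [
            {"price": {"order": "asc"}},
            # Hypothetical unique field: breaks ties between equal prices.
            {"product_id": {"order": "asc"}},
        ],
    }


if __name__ == "__main__":
    # match_all stands in for the real query.
    body = page_body({"match_all": {}}, page=2)
    print(json.dumps(body, indent=2))
```

The resulting JSON can be sent as the -d payload of the curl command above.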

update:

Though the default sort is _score DESC, it sometimes does not work as expected when records don't have a relevant _score, as per http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_sorting.html#_sorting
渐次进展

2021-01-11 13:04

This question helped me. As the answer there says:

One possible reason is distributed IDF. By default, Elasticsearch computes IDF locally on each shard to save some performance, which leads to different IDF values across the cluster.

ES doc here
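
To see why per-shard statistics matter, here is a small illustration in Python. It uses the BM25 form of IDF (BM25 is the default similarity in recent Elasticsearch versions), ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the number of documents and n the number of documents containing the term. The document counts are made up. If per-shard IDF does turn out to be the cause, the search_type=dfs_query_then_fetch request parameter makes Elasticsearch collect global term statistics before scoring, at the cost of an extra round trip.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF as used by BM25: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


if __name__ == "__main__":
    # Made-up statistics for one term: (documents, documents containing the term).
    shard_a = (1000, 10)
    shard_b = (1200, 40)

    print("shard A idf:", round(bm25_idf(*shard_a), 3))
    print("shard B idf:", round(bm25_idf(*shard_b), 3))

    # Global statistics, as a dfs_query_then_fetch style pre-query would gather.
    total_docs = shard_a[0] + shard_b[0]
    total_freq = shard_a[1] + shard_b[1]
    print("global idf:", round(bm25_idf(total_docs, total_freq), 3))
```

The same term gets a different weight on each shard, so an otherwise identical document can score differently depending on where it lives.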
