What causes different search results for the same Elasticsearch query on two nodes

盖世英雄少女心 2021-01-14 13:22

I have a two-node Elasticsearch setup where the same search query returns different results on one node than on the other, and I would like to find out why that is th

1 Answer
  • 2021-01-14 14:19

    The mismatch in hits is most probably because the primary shards and their replicas are out of sync. This can happen if a node left the cluster (for whatever reason) while you kept making changes to documents (indexing, deleting, updating).
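One way to check for this is to compare the document count of each primary with that of its replica. The sketch below assumes the rows come from `GET /_cat/shards?format=json&h=index,shard,prirep,docs`; the index name and counts in `sample_rows` are made up for illustration, and a small difference can also be a refresh timing artifact rather than a real divergence.

```python
from collections import defaultdict


def find_out_of_sync_shards(rows):
    """Return (index, shard) pairs whose copies report different doc counts.

    `rows` is a list of dicts shaped like the JSON output of
    GET /_cat/shards?format=json&h=index,shard,prirep,docs
    (the cat API returns every value as a string).
    """
    counts = defaultdict(set)
    for row in rows:
        docs = row.get("docs")
        if docs is None:  # unassigned copies report no doc count
            continue
        counts[(row["index"], row["shard"])].add(int(docs))
    # More than one distinct count means primary and replica disagree.
    return sorted(key for key, seen in counts.items() if len(seen) > 1)


# Hypothetical sample data, not taken from a real cluster.
sample_rows = [
    {"index": "products", "shard": "0", "prirep": "p", "docs": "1200"},
    {"index": "products", "shard": "0", "prirep": "r", "docs": "1187"},
    {"index": "products", "shard": "1", "prirep": "p", "docs": "950"},
    {"index": "products", "shard": "1", "prirep": "r", "docs": "950"},
]

print(find_out_of_sync_shards(sample_rows))
```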

    The scoring part is a different story, and is explained by the "Relevancy Scoring" section of this blog post:

    Elasticsearch faces an interesting dilemma when you execute a search. Your query needs to find all the relevant documents...but these documents are scattered around any number of shards in your cluster. Each shard is basically a Lucene index, which maintains its own TF and DF statistics. A shard only knows how many times "pineapple" appears within the shard, not the entire cluster.
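To make the quoted point concrete, here is a small sketch of how the inverse document frequency of one term can differ between shards. It assumes the BM25 IDF formula that Lucene uses by default in recent versions; the shard names and counts are invented for illustration.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """BM25 inverse document frequency for a term.

    doc_count: number of documents the scorer knows about
    doc_freq:  number of those documents containing the term
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for the term "pineapple": (doc_count, doc_freq).
shards = {
    "shard_0": (1000, 10),   # term is rare here
    "shard_1": (1000, 200),  # term is common here
}

# Each shard scores with only its own statistics.
per_shard = {name: bm25_idf(n, df) for name, (n, df) in shards.items()}

# A cluster-wide view combines the statistics before scoring.
total_docs = sum(n for n, _ in shards.values())
total_df = sum(df for _, df in shards.values())
global_idf = bm25_idf(total_docs, total_df)

for name, idf in per_shard.items():
    print(f"{name}: idf={idf:.3f}")
print(f"global: idf={global_idf:.3f}")
```

The same document would therefore get a different score depending on which shard copy answers the query, which is why two nodes holding different copies can rank results differently.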

    When searching, I would try the "DFS Query Then Fetch" search type, that is, _search?search_type=dfs_query_then_fetch. It collects the term statistics from all shards first and scores with the combined values, which should help with the accuracy of scoring.
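As a minimal sketch, such a request could be built with Python's standard library as follows. The host `http://localhost:9200`, the index name `products`, and the `match` query are assumptions for illustration; the request is only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_dfs_search_request(base_url, index, query):
    """Build a POST request for _search with search_type=dfs_query_then_fetch."""
    params = urlencode({"search_type": "dfs_query_then_fetch"})
    url = f"{base_url}/{index}/_search?{params}"
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, index, and query.
req = build_dfs_search_request(
    "http://localhost:9200",
    "products",
    {"match": {"name": "pineapple"}},
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```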

    The different document counts caused by changes made while the node was disconnected also affect the score calculation, even after deleting and rebuilding the index. This might be because changes to documents were applied differently on the replica and on the primary shards; more specifically, documents have been deleted. A deleted document is only marked as deleted and is permanently removed from the index at segment merge time, so until then it still counts toward the term statistics used for scoring. Segment merging doesn't happen unless certain conditions are met in the underlying Lucene instance.

    A forced merge can be initiated with a POST to /_optimize?max_num_segments=1 (the endpoint was renamed /_forcemerge in Elasticsearch 2.1). Warning: this takes a really long time (depending on the size of the index), requires significant I/O and CPU resources, and should not be run on an index where changes are still being made. Documentation: Optimize, Segments Merging
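A hedged sketch of building that call in Python follows. It assumes a newer cluster where the endpoint is named `_forcemerge`; pass `endpoint="_optimize"` for the older name used above. The host and index name are placeholders, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_force_merge_request(base_url, index, endpoint="_forcemerge",
                              max_num_segments=1):
    """Build a POST request that merges an index down to max_num_segments."""
    params = urlencode({"max_num_segments": max_num_segments})
    return Request(f"{base_url}/{index}/{endpoint}?{params}", method="POST")


# Hypothetical host and index; run only on an index that is not being written to.
merge_req = build_force_merge_request("http://localhost:9200", "products")
legacy_req = build_force_merge_request(
    "http://localhost:9200", "products", endpoint="_optimize"
)
print(merge_req.full_url)
print(legacy_req.full_url)
```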
