elasticsearch word proximity

佐手, submitted on 2019-12-12 04:18:46

Question


In Elasticsearch, is there a way to increase the score of documents where the query words are close to each other in the document? It's not only about words that are directly adjacent, as that could be solved by using shingles, but about words that are in proximity, where there might be another unimportant word in between.

Example:

document 1:

close words in documents detection

document 2:

close words in detection documents

query:

close documents

So I'd like to get a higher score for the first document and a lower score for the second.

If those words were immediately next to each other, I'd use shingles with two- or three-word tokens. This approach, however, doesn't account for words that are merely close to each other.


Answer 1:


The following query is a modified form of the one in the Elastic documentation and should meet the requirements. It uses the proximity feature in Elasticsearch known as "match phrase" (the match_phrase query).

POST /my_index/my_type/_search
{
   "query": {
      "match_phrase": {
         "text": {
            "query": "close documents",
            "slop":  50 
         }
      }
   }
}

The slop parameter above controls how far apart the terms may be for the document to be considered a match at all. Technically, slop is the number of single-position moves needed to line the terms up as the query phrase, so it gets more complex with more words in the query, but with two terms it simplifies to the distance between them. Beyond this, among the matching documents, those with the terms closer together should rank higher, which is what we want.
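As a rough illustration of the two-term case, the following Python sketch counts the words sitting between the two query terms in each example document from the question. The function name `in_order_gap`, the whitespace tokenization, and the restriction to in-order occurrences are assumptions made for this illustration only; Elasticsearch's analyzers and Lucene's actual slop calculation are more involved.

```python
from typing import Optional


def in_order_gap(text: str, first: str, second: str) -> Optional[int]:
    """Return the smallest number of words between `first` and a later `second`.

    Hypothetical helper for illustration: it lowercases and splits on
    whitespace, handles exactly two terms, and ignores reversed order.
    Returns None when no in-order occurrence exists.
    """
    tokens = text.lower().split()
    best = None
    for i, token in enumerate(tokens):
        if token != first:
            continue
        # Look for the nearest occurrence of `second` after position i.
        for j in range(i + 1, len(tokens)):
            if tokens[j] == second:
                gap = j - i - 1  # words strictly between the two terms
                if best is None or gap < best:
                    best = gap
                break
    return best


doc1 = "close words in documents detection"
doc2 = "close words in detection documents"

print(in_order_gap(doc1, "close", "documents"))  # 2
print(in_order_gap(doc2, "close", "documents"))  # 3
```

Both gaps are far below the slop of 50, so both documents would match the query, and the first document has the smaller gap, which is the one expected to score higher.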



Source: https://stackoverflow.com/questions/34323428/elasticsearch-word-proximity
