Question
In Elasticsearch, is there a way to increase the score of documents where the query words are close to each other in the document? It's not only about words that are adjacent, as this could be solved by using shingles, but about words that are in proximity, where there might be another unimportant word in between.
Example:
document 1:
close words in documents detection
document 2:
close words in detection documents
query:
close documents
So I'd like to get a higher score for the first document and a lower one for the second.
If those words were immediately next to each other, I'd use shingles with two- or three-word tokens. This approach, however, doesn't account for words that are merely close to each other.
Answer 1:
The following query is a modified form of the one in the Elastic docs and should meet the requirements. It uses the proximity feature of Elasticsearch's "match phrase" query (match_phrase).
POST /my_index/my_type/_search
{
  "query": {
    "match_phrase": {
      "text": {
        "query": "close documents",
        "slop": 50
      }
    }
  }
}
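The slop parameter above controls how far apart the terms may be for the document to be considered a match at all. Technically, it is the number of position moves needed to line the terms up as the exact phrase, so it gets more complex with more words in the query, but with two terms it simplifies to distance. In the example from the question, document 1 needs a slop of 2 and document 2 a slop of 3, assuming every word is indexed as its own token (no stopword removal). Beyond this, documents in which the terms are closer together should rank higher, which is what we want.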
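One caveat with the query above: a document that contains both words farther apart than the slop allows is dropped from the results entirely. If the goal is only to boost close matches rather than filter on them, a common pattern is to put a plain match query in the must clause and the sloppy phrase in a should clause, so proximity only adds to the score. A minimal sketch, assuming the same `text` field and sent as the body of a `_search` request; the `"operator": "and"` setting is my own assumption (it requires both words to be present) and can be removed:

```json
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "text": {
            "query": "close documents",
            "operator": "and"
          }
        }
      },
      "should": {
        "match_phrase": {
          "text": {
            "query": "close documents",
            "slop": 50
          }
        }
      }
    }
  }
}
```

With this shape, both example documents should still match through the must clause, and the match_phrase clause contributes extra score that grows as the words get closer.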
Source: https://stackoverflow.com/questions/34323428/elasticsearch-word-proximity