Question
In Solr you can perform an ordered proximity search using the syntax
"word1 word2"~10
By ordered, I mean word1 always comes before word2 in the document. I would like to know if there is an easy way to perform an unordered proximity search, i.e. word1 and word2 occur within 10 words of each other and it doesn't matter which comes first.
One way to do this would be:
"word1 word2"~10 OR "word2 word1"~10
The above will work, but I'm looking for something simpler, if possible.
Answer 1:
Slop is the number of position moves the query terms are allowed to make in order to match. So "a b" behaves differently from "b a", because matching the same document requires a different number of moves.
a foo b
has positions (a,1), (foo,2), (b,3). To match (a,1), (b,2) requires one change: (b,2) => (b,3). However, to match (b,1), (a,2) you need (a,2) => (a,1) and (b,1) => (b,3), for a total of three position movements.
In general, if "a b"~n
matches something, then "b a"~(n+2)
will match it too.
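The counting above can be checked with a small model. This is a sketch only, not Lucene's actual scorer: it assumes a whitespace-tokenized document, distinct query terms, and that the required slop is the spread between each term's document position minus its query position. The function name required_slop is made up for illustration.

```python
from itertools import product


def required_slop(query_terms, doc_tokens):
    """Smallest slop at which the phrase would match, under a simplified model.

    Assumption: required slop = max(shift) - min(shift), where
    shift = document position - query position for each term.
    Returns None if some query term is missing from the document.
    """
    # All document positions at which each query term occurs.
    occurrences = [
        [i for i, tok in enumerate(doc_tokens) if tok == term]
        for term in query_terms
    ]
    if any(not occ for occ in occurrences):
        return None

    best = None
    # Try every way of picking one occurrence per query term.
    for combo in product(*occurrences):
        shifts = [doc_pos - query_pos for query_pos, doc_pos in enumerate(combo)]
        slop = max(shifts) - min(shifts)
        if best is None or slop < best:
            best = slop
    return best


doc = "a foo b".split()
print(required_slop(["a", "b"], doc))  # in-order query
print(required_slop(["b", "a"], doc))  # reversed query
```

Under this model the in-order query needs a slop of 1 and the reversed query needs 3, which agrees with the n+2 rule stated above.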
EDIT: I guess I never gave an answer. I see two options:
- If you want a slop of n, increase it to n+2
- Manually disjunctivize your search like you suggested
I think #2 is probably better, unless your slop is very large to begin with.
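Both options are simple string constructions, so they are easy to generate on the client side. A minimal sketch, assuming single-word terms that need no escaping; the helper names are hypothetical and not part of any Solr client library.

```python
def widened_proximity(word1, word2, slop):
    """Option 1: a single phrase query with the slop raised by 2."""
    return f'"{word1} {word2}"~{slop + 2}'


def unordered_proximity(word1, word2, slop):
    """Option 2: OR together both orderings with the original slop."""
    forward = f'"{word1} {word2}"~{slop}'
    backward = f'"{word2} {word1}"~{slop}'
    return f"{forward} OR {backward}"


print(widened_proximity("word1", "word2", 10))
print(unordered_proximity("word1", "word2", 10))
```

Note that option 1 is looser than option 2: it also accepts in-order matches that need up to n+2 moves, whereas the disjunction keeps the same limit of n in both directions.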
Answer 2:
Are you sure it doesn't already work like that? Nothing in the documentation says that it is 'ordered':
A proximity search can be done with a sloppy phrase query. The closer together the two terms appear in the document, the higher the score will be. A sloppy phrase query specifies a maximum "slop", or the number of positions tokens need to be moved to get a match.
This example for the standard request handler will find all documents where "batman" occurs within 100 words of "movie":
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29
Answer 3:
Since Solr 4, this is possible with the Surround query parser (SurroundQueryParser).
For example, to do an ordered search (a query where "phrase two" follows "phrase one" by no more than 3 words):
3W(phrase W one, phrase W two)
To do an unordered search (a query where "phrase two" is within 5 words of "phrase one", in either order):
5N(phrase W one, phrase W two)
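The two forms differ only in the operator letter, so they can be generated from one helper. The sketch below builds the query string and URL-encoded request parameters; it assumes the parser is selected with defType=surround and that the default field is called text, both of which depend on your Solr configuration. The function names are hypothetical.

```python
from urllib.parse import urlencode


def surround_proximity(left, right, distance, ordered):
    """Build a Surround prefix query: W = ordered, N = unordered."""
    op = "W" if ordered else "N"
    return f"{distance}{op}({left}, {right})"


def surround_params(query, default_field="text"):
    """URL-encode request parameters (field name is an assumption)."""
    return urlencode({"q": query, "defType": "surround", "df": default_field})


ordered = surround_proximity("phrase W one", "phrase W two", 3, ordered=True)
unordered = surround_proximity("phrase W one", "phrase W two", 5, ordered=False)
print(ordered)
print(unordered)
print(surround_params(unordered))
```

The resulting parameter string would be appended to the select URL of whichever core or collection you query.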
Source: https://stackoverflow.com/questions/4079388/solr-proximity-ordered-vs-unordered