Solr proximity ordered vs unordered

Submitted by 纵饮孤独 on 2019-12-03 16:41:12

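Slop is the total number of position moves allowed when lining the query terms up with the document. So "a b" is going to behave differently than "b a", because matching the same text takes a different number of moves depending on the order of the terms in the query.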

  • The document "a foo b" has positions (a,1), (foo,2), (b,3). The query "a b" asks for (a,1), (b,2), so matching it requires one change: (b,2) => (b,3)
  • However, the query "b a" asks for (b,1), (a,2), so you will need (a,2) => (a,1) and (b,1) => (b,3), for a total of three position movements

In general, if "a b"~n matches something, then "b a"~(n+2) will match it too.
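To check the arithmetic above, here is a minimal sketch of the move counting. It assumes each query term occurs exactly once in the document, and `required_slop` is a made-up helper name, not anything from Solr or Lucene. It only models the position bookkeeping described here, not the real scorer.

```python
def required_slop(query_terms, doc_tokens):
    """Smallest slop needed under a simplified model:
    every query term appears exactly once in doc_tokens."""
    # Shift each term's document position back by its offset in the query.
    # For an exact phrase match all shifted values are identical.
    shifted = [doc_tokens.index(term) - offset
               for offset, term in enumerate(query_terms)]
    # The spread of the shifted values is the total movement needed.
    return max(shifted) - min(shifted)


doc = ["a", "foo", "b"]
print(required_slop(["a", "b"], doc))  # 1
print(required_slop(["b", "a"], doc))  # 3
```

The reversed query costs exactly two more moves than the in-order one, which is where the n+2 comes from.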

EDIT: I guess I never gave an answer. I see two options:

  1. If you want a slop of n, increase it to n+2
  2. Manually disjunctivize your search like you suggested

I think #2 is probably better, unless your slop is very large to begin with.
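For option #2, a rough sketch of building the disjunction by hand. `both_orders` is a hypothetical helper, and it assumes the terms contain no quotes or other characters that would need escaping in the query syntax.

```python
def both_orders(first, second, slop):
    """Build a query string that tries both term orders with the same slop.
    Assumes the terms need no escaping."""
    return f'"{first} {second}"~{slop} OR "{second} {first}"~{slop}'


print(both_orders("a", "b", 3))  # "a b"~3 OR "b a"~3
```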

Are you sure it doesn't already work like that? There is nothing in the documentation saying that it's 'ordered':

A proximity search can be done with a sloppy phrase query. The closer together the two terms appear in the document, the higher the score will be. A sloppy phrase query specifies a maximum "slop", or the number of positions tokens need to be moved to get a match.

This example for the standard request handler will find all documents where "batman" occurs within 100 words of "movie":

http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29
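As a sketch of what such a request could look like when built from code: the field name `text`, the host, and the core name `mycore` below are all assumptions for illustration, so substitute your own. The query itself is just the sloppy phrase syntax, a quoted phrase followed by `~` and the slop.

```python
from urllib.parse import urlencode, parse_qs


def proximity_params(field, first, second, slop):
    """URL-encode a sloppy phrase query as the q parameter."""
    return urlencode({"q": f'{field}:"{first} {second}"~{slop}'})


# Field name, host and core name are placeholders, not taken from the FAQ.
params = proximity_params("text", "batman", "movie", 100)
url = "http://localhost:8983/solr/mycore/select?" + params
print(parse_qs(params)["q"][0])  # text:"batman movie"~100
```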

Since Solr 4 it is possible with SurroundQueryParser.

E.g. to do an ordered search (a query where "phrase two" follows "phrase one" no more than 3 words later):

3W(phrase W one, phrase W two)

To do an unordered search (a query where "phrase two" is within 5 words of "phrase one", in either order):

5N(phrase W one, phrase W two)
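If you generate these expressions from code, something like the following sketch may help. `surround_query` is a hypothetical helper, and it assumes each phrase is plain whitespace-separated words. It just reproduces the two forms shown above: `W` for ordered, `N` for unordered, with the words inside each phrase joined by `W`.

```python
def surround_query(distance, ordered, *phrases):
    """Build a surround-parser expression such as 3W(phrase W one, phrase W two)."""
    # W keeps the operands in order, N allows either order.
    op = "W" if ordered else "N"
    # Words inside each phrase must stay adjacent and in order, hence " W ".
    groups = [" W ".join(phrase.split()) for phrase in phrases]
    return f"{distance}{op}({', '.join(groups)})"


print(surround_query(3, True, "phrase one", "phrase two"))
# 3W(phrase W one, phrase W two)
print(surround_query(5, False, "phrase one", "phrase two"))
# 5N(phrase W one, phrase W two)
```

As far as I know you select this parser per query with the `{!surround}` local-params prefix, but check that against your Solr version.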