Django-Haystack with Solr contains search

Submitted by 孤街醉人 on 2019-12-18 12:34:08

Question


I am using Haystack in a project with Solr as the backend. I want to perform a contains search, similar to Django's .filter(something__contains="...").

The __startswith option does not suit our needs because, as the name suggests, it only matches words that start with the given string.

I tried something like *keyword*, but Solr does not allow * as the first character.

Thanks.


Answer 1:


To get "contains" functionality, you can use:

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="back"/>
<filter class="solr.LowerCaseFilterFactory" />

as the index analyzer.

This will create n-grams for every whitespace-separated word in your field. For example:

"Index this!" => x, ex, dex, ndex, index, !, s!, is!, his!, this!

As you can see, this expands your index greatly, but if you now enter a query like:

"nde*"

it will match "ndex", giving you a hit.

Use this approach carefully so that your index doesn't get too large. If you increase minGramSize or decrease maxGramSize, the index will not grow as much, but the "contains" functionality is reduced. For instance, setting minGramSize="3" requires at least 3 characters in your contains query.
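The token expansion described above can be reproduced in plain Python to see how the index grows. This is only a rough sketch of what the whitespace tokenizer, the back-edge n-gram filter and the lowercase filter produce together; it does not call Solr, and the function name back_edge_ngrams and its parameters are hypothetical names chosen for this illustration.

```python
def back_edge_ngrams(text, min_gram=1, max_gram=100):
    """Return lowercased suffix n-grams for every whitespace-separated token."""
    grams = []
    for token in text.split():
        token = token.lower()
        longest = min(max_gram, len(token))
        # Grow each gram from the END of the token, like side="back".
        for size in range(min_gram, longest + 1):
            grams.append(token[-size:])
    return grams


grams = back_edge_ngrams("Index this!")
print(grams)

# A trailing-wildcard query such as nde* is a prefix match against the grams.
print([g for g in grams if g.startswith("nde")])

# A larger minimum gram size yields fewer grams, so a smaller index.
print(len(back_edge_ngrams("Index this!", min_gram=3)), "grams instead of", len(grams))
```

With the default sizes, the two words produce ten grams; raising the minimum to 3 leaves six, which is the index size versus matching power trade-off the answer describes.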




Answer 2:


You can achieve the same behavior without touching the Solr schema. In your search index, make your text field an EdgeNgramField instead of a CharField. Under the hood, this generates a schema similar to what lindstromhenrik suggested (Answer 1).




Answer 3:


I am using an expression like: .filter(something__startswith='...') .filter_or(name=''+s'...'), as it seems Solr does not like expressions like '...*' on their own, but combining them with an OR works.




Answer 4:


None of the answers here performs a real substring search (*keyword*).

They don't find a keyword that sits in the middle of a longer string, that is, one that is neither a prefix nor a suffix.

Using EdgeNGramFilterFactory or the EdgeNgramField in the indexes can only do a "startswith" or an "endswith" type of filtering.

The solution is to use a NgramField like this:

from haystack import indexes

class MyIndex(indexes.SearchIndex, indexes.Indexable):
    ...
    # NgramField indexes n-grams from every position, not just the edges
    field_to_index = indexes.NgramField(model_attr='field_name')
    ...

This is very elegant, because you don't need to manually add anything to schema.xml.
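To see why plain n-grams cover the middle-of-string case while edge n-grams do not, here is a small pure-Python comparison. It is only an illustration of the idea and does not use Haystack or Solr; the function names and the gram sizes (3 to 15) are assumptions made for this sketch, so check the schema Haystack generates for the values your setup actually uses.

```python
def edge_ngrams(token, min_gram=3, max_gram=15):
    """Prefix n-grams only: every gram starts at the first character."""
    longest = min(max_gram, len(token))
    return {token[:size] for size in range(min_gram, longest + 1)}


def all_ngrams(token, min_gram=3, max_gram=15):
    """N-grams from every start position, so inner substrings are included."""
    grams = set()
    for start in range(len(token)):
        longest = min(max_gram, len(token) - start)
        for size in range(min_gram, longest + 1):
            grams.add(token[start:start + size])
    return grams


word = "keyword"
inner = "eywo"  # neither a prefix nor a suffix of "keyword"

print(inner in edge_ngrams(word))  # False: only key, keyw, keywo, ... are indexed
print(inner in all_ngrams(word))   # True: grams starting at position 1 are indexed too
```

The price is a larger index: the full n-gram set for a token grows roughly with the square of its length, while the edge n-gram set grows linearly.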



Source: https://stackoverflow.com/questions/6337811/django-haystack-with-solr-contains-search
