Django Haystack: search for a term with and without accents

日久生厌 2021-01-01 04:32

I'm implementing a search system in my Django project, using django-haystack. The problem is that some fields in my models contain French accents, and I would like to f

3 answers
  • 2021-01-01 05:00

    I found a way to index both values of the same field in my model.

    First, write a method on your model that returns the ASCII version of the field:

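    from django.db import models

    class Car(models.Model):
        name = models.CharField(max_length=255)  # max_length is required; 255 is an arbitrary choice

        def ascii_name(self):
            # strip_accents is a helper you provide; it is not part of Django
            return strip_accents(self.name)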
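
    The answer does not show `strip_accents` itself. One possible implementation, sketched here with only the standard library, decomposes each character and drops the combining marks. This is an assumption about what the helper does, not the answerer's actual code.

```python
import unicodedata

def strip_accents(text):
    """Return text with combining accent marks removed (e.g. 'é' -> 'e')."""
    # NFKD splits an accented character into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

    Characters that have no Unicode decomposition, such as the ligature "œ", pass through unchanged, so they would need a separate mapping if that matters for your data.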
    

    Then, in the template used to generate the index, you can output both:

    {{ object.name }}
    {{ object.ascii_name }}
    

    Then you just have to rebuild your indexes.

  • 2021-01-01 05:00

    You need to do something like the following:

    from haystack import indexes

    class Cars(indexes.SearchIndex):
        name = indexes.CharField(model_attr='name')

        def prepare(self, obj):
            self.prepared_data = super(Cars, self).prepare(obj)
            # Index the accent-free form alongside the original text
            self.prepared_data['name'] += '\n' + strip_accents(self.prepared_data['name'])
            return self.prepared_data
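
    To see why appending the accent-free copy lets both spellings match, here is a backend-free sketch of what the indexed text ends up containing. `prepared_name` and `matches` are hypothetical names used only for this illustration, and the whitespace tokenizer is a rough stand-in for a real search backend.

```python
import unicodedata

def prepared_name(name):
    """Mimic the prepare() override: original text plus an accent-free copy."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_copy = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return name + "\n" + ascii_copy

def matches(query, indexed_text):
    """Rough stand-in for a backend: case-insensitive whole-token match."""
    tokens = {tok.lower() for tok in indexed_text.split()}
    return query.lower() in tokens

indexed = prepared_name("Citroën DS")
print(matches("citroën", indexed))  # accented query
print(matches("citroen", indexed))  # unaccented query
```

    Both queries find the document because the field now holds both forms, at the cost of storing the text twice.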
    

    I don't like this solution. I would like to know a way to configure my search backend to do it for me. I use Whoosh.

  • 2021-01-01 05:10

    Yes, you're on the right track here. Sometimes you do want to store fields multiple times, with different transformations applied.

    An example of this in my application is that I have two title fields: one for searching, which gets stemmed (the process by which test ~= tests ~= tester), and another for sorting, which is left alone (stemming interferes with the sort order).

    This is an analogous case.

    In my schema.xml this is handled by:

    <field name="title" type="text" indexed="true" stored="true" multiValued="false" />
    <field name="title_sort" type="string" indexed="true" stored="true" multiValued="false" />
    

    The type "string" is responsible for storing the "as-is" version of the title.
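
    The same two-field idea can be sketched outside Solr. In the example below, `crude_stem` is a deliberately crude stand-in for a real stemmer, and `index_title` and `search` are hypothetical helpers. The point is only that matching uses the analysed `title` while ordering uses the untouched `title_sort`.

```python
import re

def crude_stem(word):
    """Crude stand-in for a real stemmer: lowercase and strip a few suffixes."""
    word = word.lower()
    for suffix in ("ers", "er", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def index_title(title):
    """Store the title twice: analysed for matching, as-is for sorting."""
    return {
        "title": {crude_stem(w) for w in re.findall(r"\w+", title)},
        "title_sort": title,
    }

docs = [index_title(t) for t in
        ["Tests in practice", "A tester handbook", "Cooking basics"]]

def search(term):
    """Match on the stemmed field, order by the untouched field."""
    stem = crude_stem(term)
    hits = [d for d in docs if stem in d["title"]]
    return [d["title_sort"] for d in sorted(hits, key=lambda d: d["title_sort"])]

print(search("test"))
```

    An accent-stripped copy of a field plays the same role as the stemmed field here: one version is tuned for matching, the other keeps the original text.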

    By the way, if you're stripping accents just to make words easier to search for, this might be worth looking into: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ISOLatin1AccentFilterFactory
