I'm implementing a search system in my Django project, using django-haystack. The problem is that some fields in my models contain French accents, and I would like to f
I found a way to index both values from the same field in my model.
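First, write a method on your model that returns the ASCII version of the field:

    from django.db import models

    class Car(models.Model):
        name = models.CharField(max_length=255)  # max_length value is an assumption

        def ascii_name(self):
            # strip_accents is a helper you have to provide yourself
            return strip_accents(self.name)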
Then, in the template used to generate the index, you can output both values:

    {{ object.name }}
    {{ object.ascii_name }}
Then you just have to rebuild your indexes (Haystack provides the rebuild_index management command for this).
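The strip_accents helper called above is not defined in the post. Here is a minimal sketch of one possible implementation, assuming only Python's standard unicodedata module. It is an illustration, not necessarily the helper the author used, and the sample string is made up:

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFD decomposition splits an accented letter such as 'é' into the
    # base letter 'e' followed by a combining mark (U+0301).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Citroën Méhari"))  # Citroen Mehari
```

Characters that have no decomposition, such as the ligature in "cœur", pass through unchanged, so this approach only covers letters that Unicode can split into a base letter plus accent.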
You have to do something like the following:

    from haystack import indexes

    class Cars(indexes.SearchIndex):
        name = indexes.CharField(model_attr='name')

        def prepare(self, obj):
            self.prepared_data = super(Cars, self).prepare(obj)
            # Index the accent-stripped value alongside the original one
            self.prepared_data['name'] += '\n' + strip_accents(self.prepared_data['name'])
            return self.prepared_data
I don't like this solution. I would like to know of a way to configure my search backend to do this for me. I use Whoosh.
Yes, you're on the right track here. Sometimes you do want to store fields multiple times, with different transformations applied.
An example of this in my application is that I have two title fields: one for searching, which gets stemmed (the process by which test ~= tests ~= tester), and another for sorting, which is left alone (stemming interferes with the sort order).
This is an analogous case.
In my schema.xml this is handled by:
    <field name="title" type="text" indexed="true" stored="true" multiValued="false" />
    <field name="title_sort" type="string" indexed="true" stored="true" multiValued="false" />
The type "string" is responsible for storing the "as-is" version of the title.
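The same two-field idea can be sketched outside Solr in plain Python. This is only an illustration: fold, build_document and the sample titles are hypothetical names, and accent folding stands in for stemming as the search-side transformation.

```python
import unicodedata


def fold(text):
    """Lowercase text and drop combining accent marks (search-side form)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_document(title):
    """Store one source value twice: transformed for search, as-is for sort."""
    return {"title": fold(title), "title_sort": title}


docs = [build_document(t) for t in ["Mégane", "Clio", "Espace"]]

# Match against the transformed field, folding the query the same way ...
hits = [d for d in docs if fold("megane") in d["title"]]

# ... but order results by the untouched field.
ordered = sorted(docs, key=lambda d: d["title_sort"])
```

Keeping the untouched copy means the displayed and sorted value is always exactly what was stored, whatever transformation the search field goes through.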
By the way, if you're stripping accents just to make words easier to search for, this is something that might be worth looking into: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ISOLatin1AccentFilterFactory