Question
I'm using Django Haystack backed by Elasticsearch for autocomplete, and I'm having trouble searching for digits in a field.
For example, I have a field called 'name' on an object type that has some values like this:
['NAME', 'NAME2', 'NAME7', 'ANOTHER NAME 8', '7342', 'SOMETHING ELSE', 'LAST ONE 7']
and I'd like to use autocomplete to search for all objects with the number '7' in the name.
I've set up my search_index with this field:
name_auto = indexes.EdgeNgramField(model_attr='name')
and I'm using a search query like so:
SearchQuerySet().autocomplete(name_auto='7')
However, this search returns no results. I believe this is because Haystack's default edgengram_analyzer uses Elasticsearch's "lowercase" tokenizer, which splits the text on every non-letter character and so throws out digits entirely.
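To illustrate that suspicion, here is a rough pure-Python approximation of what a "lowercase" tokenizer followed by an edge n-gram filter does to a name. This is only a sketch, not Elasticsearch code: the function names are made up for illustration, and it assumes the min_gram of 2 and max_gram of 15 from the default settings.

```python
import re


def lowercase_tokenize(text):
    """Approximate the "lowercase" tokenizer: keep runs of letters, lowercased."""
    # [^\W\d_]+ matches runs of Unicode letters only, so digits act as separators.
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def edge_ngrams(token, min_gram=2, max_gram=15):
    """Approximate an edge n-gram filter: prefixes of length min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def analyze(text):
    """Chain the two steps, the way a custom analyzer chains tokenizer and filter."""
    return [gram for token in lowercase_tokenize(text) for gram in edge_ngrams(token)]


print(analyze("NAME7"))  # ['na', 'nam', 'name']
print(analyze("7342"))   # []
print(analyze("7"))      # []
```

If this approximation is right, neither the indexed value '7342' nor the query '7' produces any terms at all, which would explain the empty result set.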
So, I found elasticstack, which allows customizing the haystack/elasticsearch backend, but I can't seem to configure the ELASTICSEARCH_INDEX_SETTINGS correctly to get the functionality I want.
The default settings look like this:
ELASTICSEARCH_INDEX_SETTINGS = {
    'settings': {
        "analysis": {
            "analyzer": {
                "synonym_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["synonym"]
                },
                "ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_ngram", "synonym"]
                },
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_edgengram"]
                }
            },
            "tokenizer": {
                "haystack_ngram_tokenizer": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 15,
                },
                "haystack_edgengram_tokenizer": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15,
                    "side": "front"
                }
            },
            "filter": {
                "haystack_ngram": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 15
                },
                "haystack_edgengram": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15
                },
                "synonym": {
                    "type": "synonym",
                    "ignore_case": "true",
                    "synonyms_path": "synonyms.txt"
                }
            }
        }
    }
}
I've tried to alter the edgengram_analyzer block in a number of ways without success, and adding something like this
"token_chars": [ "letter", "digit" ]
to the "haystack_ngram_tokenizer" has not worked either.
Can someone help me determine how to use haystack/elasticsearch/autocomplete to search for digits? Or will I have to split the 'name' field into all possible n-grams myself and then use a standard matching search? Any help would be greatly appreciated.
Thanks a lot!
Answer 1:
This solution helped me: http://silentsokolov.github.io/2014/09/03/django-haystack-elasticsearch-prombiemy-avtodopolnieniia.html
The article is written in Russian, so you may need Google Translate to read it.
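For reference, here is a minimal sketch of the kind of settings change that should let digits through. It is not taken from the linked article; it only assumes that elasticstack reads ELASTICSEARCH_INDEX_SETTINGS as shown in the question. The idea is to switch the edge n-gram analyzer from the "lowercase" tokenizer to "standard" (which keeps digits inside tokens), add a "lowercase" token filter to keep matching case-insensitive, and lower min_gram to 1 so a one-character query like '7' is not dropped. Only the edge n-gram pieces are shown; the other analyzers and filters from the defaults would need to stay in the real settings.

```python
# Sketch of an override for settings.py (assumption: elasticstack picks this up).
ELASTICSEARCH_INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "edgengram_analyzer": {
                    "type": "custom",
                    # "standard" keeps digits in tokens, unlike "lowercase".
                    "tokenizer": "standard",
                    # The "lowercase" filter keeps matching case-insensitive.
                    "filter": ["lowercase", "haystack_edgengram"],
                },
            },
            "filter": {
                "haystack_edgengram": {
                    "type": "edgeNGram",
                    # 1 instead of 2, so a single-character term such as "7" survives.
                    "min_gram": 1,
                    "max_gram": 15,
                },
            },
        }
    }
}
```

The index has to be rebuilt (for example with Haystack's rebuild_index management command) before a changed analyzer takes effect. Also note that edge n-grams only cover the start of each token, so this would match '7342' and 'LAST ONE 7' but not 'NAME7'. Matching a digit in the middle or at the end of a token would need the analogous change on the nGram analyzer with an NgramField instead.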
Source: https://stackoverflow.com/questions/25827783/using-django-haystack-autocomplete-with-elasticsearch-to-search-for-digits-numbe