When using the ngram filter with Elasticsearch, a search for something like "test" returns the documents "latest", "tests" and "test". Is there a way to make i
You need a multi-field mapping and a multi-match query.
I had a similar issue. I needed to search by first name, so that for the search term 'And' I get 'Andy' first and then 'Mandy'. With nGram alone, I was not able to achieve that.
I added one more analyzer that uses a front edgeNGram (the code below is for Spring Data Elasticsearch, but you can get the idea).
setting.put("analysis.analyzer.word_parts.type", "custom");
setting.put("analysis.analyzer.word_parts.tokenizer", "ngram_tokenizer");
setting.put("analysis.analyzer.word_parts.filter", "lowercase");
setting.put("analysis.analyzer.type_ahead.type", "custom");
setting.put("analysis.analyzer.type_ahead.tokenizer", "edge_ngram_tokenizer");
setting.put("analysis.analyzer.type_ahead.filter", "lowercase");
setting.put("analysis.tokenizer.ngram_tokenizer.type", "nGram");
setting.put("analysis.tokenizer.ngram_tokenizer.min_gram", "3");
setting.put("analysis.tokenizer.ngram_tokenizer.max_gram", "50");
setting.put("analysis.tokenizer.ngram_tokenizer.token_chars", new String[] { "letter", "digit" });
setting.put("analysis.tokenizer.edge_ngram_tokenizer.type", "edgeNGram");
setting.put("analysis.tokenizer.edge_ngram_tokenizer.min_gram", "2");
setting.put("analysis.tokenizer.edge_ngram_tokenizer.max_gram", "20");
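To make the difference between the two tokenizers concrete, here is a rough plain-Java sketch of the tokens each one would produce for a single lowercase word. It is not Elasticsearch code: the class `NGramSketch` and its methods are hypothetical names, and it assumes the input is one word made only of letters and digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NGramSketch {

    // Every substring of length minGram..maxGram at every start offset,
    // which is what an n-gram tokenizer emits for a single word.
    static List<String> nGrams(String word, int minGram, int maxGram) {
        String w = word.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < w.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= w.length(); len++) {
                grams.add(w.substring(start, start + len));
            }
        }
        return grams;
    }

    // Only the prefixes of length minGram..maxGram (front edge n-grams).
    static List<String> edgeNGrams(String word, int minGram, int maxGram) {
        String w = word.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int len = minGram; len <= maxGram && len <= w.length(); len++) {
            grams.add(w.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Same min/max values as the settings above.
        System.out.println(nGrams("Mandy", 3, 50));     // [man, mand, mandy, and, andy, ndy]
        System.out.println(edgeNGrams("Mandy", 2, 20)); // [ma, man, mand, mandy]
        System.out.println(edgeNGrams("Andy", 2, 20));  // [an, and, andy]
    }
}
```

The term "and" shows up in the n-grams of "Mandy" but not in its edge n-grams, while it is an edge n-gram of "Andy". That is the distinction the second analyzer adds.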
I mapped the required field as a multi-field:
@MultiField(mainField = @Field(type = FieldType.String, indexAnalyzer = "word_parts", searchAnalyzer = "standard"),
otherFields = @NestedField(dotSuffix = "autoComplete", type = FieldType.String, searchAnalyzer = "standard", indexAnalyzer = "type_ahead"))
private String firstName;
For the query I use a multi-match query, where I first specify 'firstName.autoComplete' and then just 'firstName':
QueryBuilders.multiMatchQuery(searchTerm, new String[]{"firstName.autoComplete", "firstName"})
This seems to be working properly.
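As a rough illustration of why querying both fields can separate 'Andy' from 'Mandy', the sketch below checks which of the two fields a lowercased search term would hit. It is a simplified model under assumptions: `FieldMatchSketch` and its methods are hypothetical, names are single words, and it does not reproduce Elasticsearch's relevance scoring, which is what actually determines the order.

```java
import java.util.Locale;

final class FieldMatchSketch {

    // 'firstName' (word_parts analyzer): the term is one of the name's
    // n-grams if it is a substring of 3 to 50 characters.
    static boolean matchesWordParts(String name, String term) {
        String n = name.toLowerCase(Locale.ROOT);
        String t = term.toLowerCase(Locale.ROOT);
        return t.length() >= 3 && t.length() <= 50 && n.contains(t);
    }

    // 'firstName.autoComplete' (type_ahead analyzer): the term is one of
    // the name's edge n-grams if it is a prefix of 2 to 20 characters.
    static boolean matchesTypeAhead(String name, String term) {
        String n = name.toLowerCase(Locale.ROOT);
        String t = term.toLowerCase(Locale.ROOT);
        return t.length() >= 2 && t.length() <= 20 && n.startsWith(t);
    }

    // Number of fields on which the term matches (0, 1 or 2).
    static int matchedFields(String name, String term) {
        int count = 0;
        if (matchesTypeAhead(name, term)) count++;
        if (matchesWordParts(name, term)) count++;
        return count;
    }

    public static void main(String[] args) {
        for (String name : new String[] { "Andy", "Mandy", "Bob" }) {
            System.out.println(name + " -> " + matchedFields(name, "And"));
        }
        // prints:
        // Andy -> 2
        // Mandy -> 1
        // Bob -> 0
    }
}
```

'Andy' matches on the prefix field as well as the n-gram field, while 'Mandy' matches on the n-gram field only, so the prefix field gives the query something to prefer 'Andy' by.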
In your case, if you need an exact match, perhaps you could use just the 'standard' analyzer instead of 'edgeNGram'.