Question
I am using Elasticsearch's built-in Simple analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/analysis-simple-analyzer.html), which uses the Lower Case Tokenizer. The text apple 8 IS Awesome is tokenized as follows:
"apple",
"is",
"awesome"
As you can see, the number 8 is not emitted as a token, so if I search for just 8, my message will not appear in the results.
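To make the behaviour concrete, here is a minimal Python sketch that approximates what the simple analyzer does: it splits on every non-letter character and lowercases what remains. The function name simple_analyzer_tokens is hypothetical and the regular expression is only an approximation of Lucene's letter-based tokenization, not the actual Elasticsearch implementation.

```python
import re


def simple_analyzer_tokens(text):
    """Approximate the simple analyzer: keep runs of letters, lowercase them.

    Assumption: [^\\W\\d_]+ (word characters minus digits and underscore)
    is close enough to "letters only" for illustration purposes.
    """
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


# The digit is treated as a separator, so it never becomes a token.
print(simple_analyzer_tokens("apple 8 IS Awesome"))
# → ['apple', 'is', 'awesome']
```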
I went through all the analyzers available in ES but couldn't find one that matches my requirement.
How can I tokenize all the words, including numbers, using a custom or built-in ES analyzer?
Answer 1:
Your question is about the simple analyzer, but you link to a very old version of the documentation. Try https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-simple-analyzer.html instead.
As Val told you, you are probably looking for the standard analyzer, which also lowercases tokens but keeps numbers such as 8. To see the difference, try the analyze API:
- http://localhost:9200/_analyze?analyzer=simple&text=apple%208%20IS%20Awesome
- http://localhost:9200/_analyze?analyzer=standard&text=apple%208%20IS%20Awesome
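The two URLs above pass the analyzer and text as query-string parameters. As far as I know, newer Elasticsearch releases expect them in a JSON request body instead, so the same comparison can be sketched in Python as below. The helper names build_analyze_request and analyze are hypothetical, and the code assumes an unsecured node listening on http://localhost:9200.

```python
import json
import urllib.request


def build_analyze_request(analyzer, text, base_url="http://localhost:9200"):
    """Build a POST request for the _analyze API with a JSON body."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer, text, base_url="http://localhost:9200"):
    """Send the request and return only the token strings from the response."""
    request = build_analyze_request(analyzer, text, base_url)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [entry["token"] for entry in payload["tokens"]]


# Usage (requires a running Elasticsearch node, so it is left commented out):
# for name in ("simple", "standard"):
#     print(name, analyze(name, "apple 8 IS Awesome"))
```

Running the commented loop against a live node should show the standard analyzer returning 8 as a token while the simple analyzer omits it.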
Source: https://stackoverflow.com/questions/46802733/in-built-elastic-search-analyzer-which-does-work-of-simple-analyzer-as-well-toke