问题
I want to implement a phonetic search using Lucene 6.1.0., using Soundex or any suitable algorithm for Portuguese. I found many incomplete examples over internet, teaching how to implement a custom tokenizer, analyzer, but it seems that the abstract classes used on those exapmples are not the same in the version 6.1.0. Can anyone point me out where I can find a good documentation an Lucene, not just java docs without any further documentation teaching how to put the things together?
Thanks in advance.
回答1:
The Analyzer documentation shows how to create your analyzer.
For phonetic analysis, you should look to the org.apache.lucene.analysis.phonetic package (You'll need to add "lucene-analyzers-phonetic-6.1.0.jar" to your build path, as well as Apache's "commons-codec-1.10.jar", which you can get here).
Then you can setup your analyzer something like, for instance:
Analyzer analyzer = new Analyzer() {
@Override
protected TokenStreamComponents createComponents(String fieldName) {
Tokenizer tokenizer = new StandardTokenizer();
TokenStream stream = new DoubleMetaphoneFilter(tokenizer, 6, false);
return new TokenStreamComponents(tokenizer, stream);
}
};
来源:https://stackoverflow.com/questions/38599692/how-to-implement-a-phonetic-search-using-lucene