How do I use the Porter stemmer class in Lucene 3.6.2? Here is what I have:
import org.apache.lucene.analysis.PorterStemmer;
...
PorterStemmer stemmer = new PorterStemmer();
term = stemmer.stem(term);
The compiler tells me: PorterStemmer is not public in org.apache.lucene.analysis; cannot be accessed from outside package.
Edit: I have also read extensively about using Snowball, but it isn't encouraged. What is the right way to stem using Lucene in Java?
1) If you want to use PorterStemmer as part of the Lucene token analysis process, use PorterStemFilter.
Sample code:
class MyAnalyzer extends Analyzer {
    // Lower-case the input tokens, then apply Porter stemming to each one.
    public final TokenStream tokenStream(String fieldName, Reader reader) {
        return new PorterStemFilter(new LowerCaseTokenizer(reader));
    }
}
2) If you want to use PorterStemmer in any other application, the author himself provides the source code: PorterStemmer in Java
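For option 1, the stemmed terms still have to be read back out of the filter chain. Below is a minimal sketch of one way to do that, assuming Lucene 3.6 is on the classpath. The class name StemDemo and the method name stemTerms are hypothetical, and Version.LUCENE_36 is an assumption made to match the 3.6.2 release mentioned in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.LowerCaseTokenizer;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Hypothetical helper class, not part of Lucene.
public class StemDemo {

    // Tokenizes and lower-cases the text, then returns the Porter stem of each token.
    public static List<String> stemTerms(String text) throws IOException {
        // Version.LUCENE_36 is an assumption; use the constant matching your release.
        TokenStream stream = new PorterStemFilter(
                new LowerCaseTokenizer(Version.LUCENE_36, new StringReader(text)));
        // The attribute instance is reused: it holds the current token after each incrementToken().
        CharTermAttribute termAttr = stream.addAttribute(CharTermAttribute.class);
        List<String> stems = new ArrayList<String>();
        try {
            stream.reset();
            while (stream.incrementToken()) {
                stems.add(termAttr.toString());
            }
            stream.end();
        } finally {
            stream.close();
        }
        return stems;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(stemTerms("Running stemmers"));
    }
}
```

This avoids touching the package-private PorterStemmer class at all: the public PorterStemFilter does the stemming, and the caller only reads the term text of each token.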
In later versions of Lucene, PorterStemmer is no longer public, so use PorterStemFilter instead:
class MyAnalyzer extends Analyzer {
    public final TokenStream tokenStream(String fieldName, Reader reader) {
        return new PorterStemFilter(new LowerCaseTokenizer(reader));
    }
}
Or you can use the Snowball stemmer (note that SnowballAnalyzer is deprecated):
import org.tartarus.snowball.ext.PorterStemmer;
...
public static String applyPorterStemmer(String input) throws IOException {
    PorterStemmer stemmer = new PorterStemmer();
    stemmer.setCurrent(input);   // load the word to be stemmed
    stemmer.stem();              // stem it in place
    return stemmer.getCurrent(); // read back the stemmed form
}
Source: https://stackoverflow.com/questions/15422485/lucene-porter-stemmer-not-public