I'm using a Snowball analyzer to stem the titles of multiple documents. Everything works well, but there are some quirks.
Example:
A search for "valv",
I don't think there is an easy (and correct) way to do this.
My solution would be to write a custom query parser that finds the longest prefix of your search term that also starts a term in the index, and then runs the prefix query with that shortened prefix.
    using Lucene.Net.Analysis;
    using Lucene.Net.Index;
    using Lucene.Net.Search;

    class MyQueryParser : Lucene.Net.QueryParsers.QueryParser
    {
        IndexReader _reader;
        Analyzer _analyzer;

        public MyQueryParser(string field, Analyzer analyzer, IndexReader indexReader) : base(field, analyzer)
        {
            _analyzer = analyzer;
            _reader = indexReader;
        }

        // Shorten the prefix one character at a time (down to 3 characters)
        // until it matches the start of a term in the index.
        public override Query GetPrefixQuery(string field, string termStr)
        {
            for (string longestStr = termStr; longestStr.Length > 2; longestStr = longestStr.Substring(0, longestStr.Length - 1))
            {
                // Terms(t) positions the enumeration at the first term >= t.
                TermEnum te = _reader.Terms(new Term(field, longestStr));
                Term term = te.Term();
                te.Close();
                if (term != null && term.Field() == field && term.Text().StartsWith(longestStr))
                {
                    return base.GetPrefixQuery(field, longestStr);
                }
            }
            // No indexed term shares a usable prefix; fall back to the original term.
            return base.GetPrefixQuery(field, termStr);
        }
    }
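To see the back-off logic on its own, here is a minimal sketch that runs without Lucene. It assumes the index terms are available as an ordinally sorted `List<string>`; `PrefixBackoff` and `LongestIndexedPrefix` are hypothetical names used only for illustration, and the binary search stands in for what `_reader.Terms(...)` does in the parser (positioning at the first term greater than or equal to the candidate).

```csharp
using System;
using System.Collections.Generic;

static class PrefixBackoff
{
    // Returns the longest prefix of termStr (at least 3 characters) that begins
    // some term in sortedTerms, or termStr unchanged if no such prefix exists.
    // sortedTerms must be sorted with StringComparer.Ordinal.
    public static string LongestIndexedPrefix(List<string> sortedTerms, string termStr)
    {
        for (string candidate = termStr; candidate.Length > 2;
             candidate = candidate.Substring(0, candidate.Length - 1))
        {
            // On a miss, BinarySearch returns the bitwise complement of the
            // insertion point, i.e. the index of the first term > candidate.
            int i = sortedTerms.BinarySearch(candidate, StringComparer.Ordinal);
            if (i < 0) i = ~i;

            if (i < sortedTerms.Count &&
                sortedTerms[i].StartsWith(candidate, StringComparison.Ordinal))
            {
                return candidate;
            }
        }
        return termStr;
    }
}
```

With the (assumed) stemmed terms `pump`, `valu` and `valv`, a search term of `valves` is cut back to `valv`, while a term such as `xyz` that shares no prefix with the index is returned unchanged.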
You can also try calling your analyzer yourself in GetPrefixQuery, since the query parser does not run the analyzer for PrefixQuerys:
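    // StringReader requires "using System.IO;".
    TokenStream ts = _analyzer.TokenStream(field, new StringReader(termStr));
    Lucene.Net.Analysis.Token token = ts.Next();
    // The analyzer may emit no token (for example, for a stop word); fall back to the raw term.
    var termstring = token != null ? token.TermText() : termStr;
    ts.Close();
    return base.GetPrefixQuery(field, termstring);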
But, be aware that you can always find a case where the returned results are not correct. This is why Lucene doesn't take analyzers into account when using wildcards.
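As a rough illustration of such a case, the sketch below lists which terms a given prefix would match. It does not use Lucene; `PrefixExpansion`, `Expand` and the sample terms are hypothetical and only approximate how a prefix query expands against the index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PrefixExpansion
{
    // Returns every term that starts with the given prefix (ordinal comparison).
    public static List<string> Expand(IEnumerable<string> terms, string prefix)
    {
        return terms.Where(t => t.StartsWith(prefix, StringComparison.Ordinal)).ToList();
    }
}
```

Suppose the index holds the terms `valley`, `valu` and `valv`. A search for `valet*` matches nothing as typed, but once the prefix has been shortened to `val` it matches all three terms, none of which has anything to do with "valet".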