Question
I want to run some text through the Lucene query parser to carry out basic preprocessing on it. I used the following lines of code:
Analyzer analyzer = new EnglishAnalyzer();
QueryParser parser = new QueryParser("", analyzer);
String text = "...";
String ret = parser.parse(QueryParser.escape(text)).toString();
But I am getting an error:
Exception in thread "main" org.apache.lucene.queryparser.classic.ParseException: Cannot parse '': Encountered "<EOF>" at line 1, column 0.
Answer 1:
QueryParser.escape() escapes the special characters. However, it doesn't escape AND, NOT, OR, which are keywords used in Lucene search.
There are two ways to deal with it:
- Replace AND, NOT, OR in the query string.
- Convert the query string to lower case.
Converting to lower case resolves the issue because only the upper-case AND, NOT, OR are keywords. In lower case they are treated as regular words.
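A minimal sketch of that idea, assuming you only want to lower-case the three keywords and leave the rest of the text untouched. The class name OperatorNeutralizer and the method neutralize are hypothetical helpers made up for illustration, not part of Lucene. The code uses only the Java standard library; the Lucene call from the question appears only in a comment.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Lucene class): lower-cases standalone AND / OR / NOT
// so that the classic query parser no longer sees them as boolean operators.
final class OperatorNeutralizer {

    // \b restricts the match to whole words, so "BAND" or "NOTE" are left alone.
    private static final Pattern OPERATORS = Pattern.compile("\\b(AND|OR|NOT)\\b");

    static String neutralize(String text) {
        Matcher m = OPERATORS.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // The lower-cased keyword contains no '$' or '\', so it is a safe replacement string.
            m.appendReplacement(sb, m.group().toLowerCase(Locale.ROOT));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "cats AND dogs OR NOT BANDS";
        System.out.println(neutralize(text)); // prints "cats and dogs or not BANDS"
        // With Lucene on the classpath, the result would then go through the code
        // from the question, for example:
        // parser.parse(QueryParser.escape(neutralize(text)));
    }
}
```

Calling text.toLowerCase() on the whole string is the simpler alternative and is probably enough here, since the analyzer in the question lower-cases tokens anyway.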
Answer 2:
For those who face this problem: I realized that my parser threw an exception for the word "NOT", even after escaping. I had to manually replace it with another word.
Source: https://stackoverflow.com/questions/39276972/lucene-error-while-parsing-query-cannot-parse-encountered-eof-at-line-1