Lucene and Special Characters

I am using Lucene.Net 2.0 to index some fields from a database table. One of the fields is a 'Name' field, which allows special characters. When I perform a search, it does

2 Answers

轻奢々

2021-01-11 23:49

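StandardAnalyzer strips out the special characters during indexing. You can pass in an explicit list of stop words (leaving out the ones you want to keep). Note, though, that the stop-word list only controls which words are dropped; punctuation such as parentheses is discarded by the tokenizer itself.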
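
To see why the special characters disappear, here is a rough sketch in plain Python. It is not Lucene's implementation: `standard_like_analyze` and `DEFAULT_STOP_WORDS` are hypothetical names, and the pipeline (split on non-alphanumeric characters, lowercase, drop stop words) is only an approximation of what a StandardAnalyzer-style analyzer does.

```python
import re

# Hypothetical stop-word list for illustration; not Lucene's actual default set.
DEFAULT_STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to"})


def standard_like_analyze(text, stop_words=DEFAULT_STOP_WORDS):
    """Approximate a StandardAnalyzer-style pipeline (assumption, not Lucene code).

    1. Keep only runs of letters and digits, so punctuation is discarded.
    2. Lowercase every token.
    3. Drop tokens that appear in the stop-word list.
    """
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]
    return [t for t in tokens if t not in stop_words]


# The parentheses are gone before stop words are even considered.
print(standard_like_analyze("Test (Test)"))                          # ['test', 'test']

# An empty stop-word list keeps more words, but still no punctuation.
print(standard_like_analyze("The Test (Test)", stop_words=frozenset()))  # ['the', 'test', 'test']
```

In this sketch, changing the stop-word list affects only step 3, which is why the parentheses never come back.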

心在旅途

2021-01-12 00:13

At index time, you tokenized the field, so your input string produces two tokens, "test" and "test". At search time, you are constructing the query by hand, i.e. using TermQuery instead of QueryParser, which would have tokenized the query text as well.

To match the entire string, you need to index the field as UN_TOKENIZED. The input string is then taken as a single token, "Test (Test)", and in that case your current search code will work. Watch the case of the input string carefully: if you index lower-cased text, you have to lower-case the search term as well.
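
The difference can be illustrated with a toy inverted index in plain Python. This is a simulation under stated assumptions, not Lucene.Net code: `tokenize`, `keep_whole`, `build_index`, and `term_lookup` are hypothetical helpers, where `keep_whole` plays the role of an UN_TOKENIZED field and `term_lookup` plays the role of a hand-built TermQuery (an exact, unanalyzed term match).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Stand-in for a tokenized field: split on punctuation and lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def keep_whole(text):
    """Stand-in for an UN_TOKENIZED field: the whole value is one term."""
    return [text]


def build_index(docs, analyze):
    """Map each term produced by `analyze` to the set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def term_lookup(index, term):
    """Exact term match with no analysis of the search string."""
    return sorted(index.get(term, set()))


docs = {1: "Test (Test)", 2: "Another name"}
tokenized = build_index(docs, tokenize)
untokenized = build_index(docs, keep_whole)

print(term_lookup(tokenized, "Test (Test)"))    # []  : index only holds "test"
print(term_lookup(tokenized, "test"))           # [1] : matches an individual token
print(term_lookup(untokenized, "Test (Test)"))  # [1] : whole value is the term
print(term_lookup(untokenized, "test (test)"))  # []  : case must match exactly
```

The last line shows the case caveat: with an untokenized field nothing normalizes the text, so the search term has to match the stored value character for character.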

It is generally good practice to use the same analyzer for indexing and searching. You can use KeywordAnalyzer to generate a single token from the input string.
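
As a rough illustration of that practice, the sketch below ties one analyzer function to both the indexing and the search path. It is plain Python and only mimics the idea: `TinyIndex`, `tokenizing_analyzer`, and `keyword_analyzer` are hypothetical names, and `keyword_analyzer` merely imitates the single-token behaviour described for KeywordAnalyzer.

```python
import re


def tokenizing_analyzer(text):
    """Split on non-alphanumeric characters and lowercase (illustrative only)."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_analyzer(text):
    """Emit the whole input as a single token, unchanged."""
    return [text]


class TinyIndex:
    """Toy index that applies the same analyzer when adding and when searching."""

    def __init__(self, analyzer):
        self.analyzer = analyzer
        self.postings = {}

    def add(self, doc_id, text):
        for term in self.analyzer(text):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query_text):
        # Analyze the query exactly as documents were analyzed,
        # then require every query term to be present.
        terms = self.analyzer(query_text)
        if not terms:
            return []
        matches = [self.postings.get(t, set()) for t in terms]
        return sorted(set.intersection(*matches))


tokenized = TinyIndex(tokenizing_analyzer)
tokenized.add(1, "Test (Test)")
print(tokenized.search("test (TEST)"))  # [1]: query is normalized the same way

whole = TinyIndex(keyword_analyzer)
whole.add(1, "Test (Test)")
print(whole.search("Test (Test)"))      # [1]: exact full-string match
print(whole.search("Test"))             # [] : partial strings do not match
```

Because the analyzer is fixed when the index is created, the query can never be processed differently from the indexed text, which is the mismatch the hand-built query ran into.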
