Lucene and Special Characters

再見小時候 2021-01-11 23:27

I am using Lucene.Net 2.0 to index some fields from a database table. One of the fields is a 'Name' field which allows special characters. When I perform a search, it does

2 Answers
  • 2021-01-11 23:49

    StandardAnalyzer strips out special characters during indexing. You can pass it an explicit list of stop words (leaving out any that you want to keep searchable).
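
To see why the special characters disappear, here is a rough plain-Python model of a StandardAnalyzer-style chain. It is not Lucene.Net code: `toy_standard_analyze` and `STOP_WORDS` are hypothetical names, and the real tokenizer has more rules than this ASCII-only regex. In this model the stop-word list only decides which whole words are dropped; the parentheses are already gone after the split.

```python
import re

# Hypothetical stop-word list for illustration; Lucene's default list differs.
STOP_WORDS = {"a", "an", "and", "the", "of"}

def toy_standard_analyze(text, stop_words=STOP_WORDS):
    """Rough model: lowercase, split on non-alphanumerics, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]

# Punctuation is lost at the split step, with or without stop words.
default_tokens = toy_standard_analyze("Test (Test)")
no_stop_tokens = toy_standard_analyze("Test (Test)", stop_words=set())
print(default_tokens)
print(no_stop_tokens)
```

Both calls give the same two lower-cased tokens, which is why a search for the original string with its parentheses finds nothing in a field analyzed this way.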

  • 2021-01-12 00:13

    During indexing you tokenized the field, so your input string produced two tokens, "test" and "test". For searching, you are building the query by hand with TermQuery instead of QueryParser, which would have tokenized the query text as well.

    For a match on the entire value, index the field as UN_TOKENIZED. The input string is then taken as a single token, "Test (Test)", and your current search code will work. Watch the case of the input string: if you index lower-cased text, you must lower-case the search term too.

    It is generally good practice to use the same analyzer for indexing and searching. You can use KeywordAnalyzer to generate a single token from the input string.
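
The mismatch described in this answer can be sketched with a toy inverted index in plain Python. This is only a model under stated assumptions, not the Lucene.Net API: `ToyIndex`, `term_query`, and `parsed_query` are hypothetical names, the tokenizer is a simple ASCII regex, and `parsed_query` requires all analyzed terms to match, which simplifies what QueryParser really does.

```python
import re

def tokenize(text):
    """Toy analyzer: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

class ToyIndex:
    """Maps exact terms to document ids for a single field."""

    def __init__(self, tokenized):
        self.tokenized = tokenized
        self.postings = {}

    def add(self, doc_id, value):
        # Tokenized: one posting per token. Untokenized: the whole value is one term.
        terms = tokenize(value) if self.tokenized else [value]
        for term in terms:
            self.postings.setdefault(term, set()).add(doc_id)

    def term_query(self, term):
        """Exact lookup with no analysis, like a hand-built term query."""
        return self.postings.get(term, set())

    def parsed_query(self, text):
        """Analyze the query text first, then require every term to match."""
        terms = tokenize(text)
        if not terms:
            return set()
        result = set(self.term_query(terms[0]))
        for term in terms[1:]:
            result &= self.term_query(term)
        return result

tokenized = ToyIndex(tokenized=True)
tokenized.add(1, "Test (Test)")
untokenized = ToyIndex(tokenized=False)
untokenized.add(1, "Test (Test)")

print(tokenized.term_query("Test (Test)"))    # raw term vs. analyzed index
print(tokenized.parsed_query("Test (Test)"))  # query analyzed the same way
print(untokenized.term_query("Test (Test)"))  # whole value stored as one term
print(untokenized.term_query("test (test)"))  # same text, different case
```

The first lookup comes back empty because the index holds only "test", while the raw term still carries the capital letters and parentheses. Analyzing the query the same way, or storing the whole value as one term, makes the lookup succeed, and the last line shows that an untokenized term is case-sensitive.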
