lucene

Prevent “Too Many Clauses” on lucene query

断了今生、忘了曾经 submitted on 2021-02-07 12:14:14
Question: In my tests I suddenly hit a Too Many Clauses exception when trying to get the hits from a boolean query that consisted of a term query and a wildcard query. I searched the net, and the resources I found suggest increasing the limit via BooleanQuery.SetMaxClauseCount(). This sounds fishy to me. What should I raise it to? How can I be sure that the new magic number will be sufficient for my query? How far can I increase this number before all hell breaks loose? In general I feel this is
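
To make the concern in this question concrete, here is a minimal plain-Java sketch (it does not use Lucene) of why a wildcard query can trip a clause limit. It assumes the classic behaviour in which a wildcard or prefix query is rewritten into one boolean clause per matching indexed term, with 1024 as the traditional default limit. The class `ClauseExpansionSketch`, the method `expandPrefix`, and the constant `MAX_CLAUSES` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ClauseExpansionSketch {
    // Assumption: stands in for the configured max clause count (1024 is the classic default).
    public static final int MAX_CLAUSES = 1024;

    // Expands a trailing-wildcard pattern such as "item*" (passed here as the prefix "item")
    // into one clause per matching indexed term, failing once the limit is exceeded.
    public static List<String> expandPrefix(String prefix, List<String> indexedTerms) {
        List<String> clauses = new ArrayList<>();
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                clauses.add(term);
                if (clauses.size() > MAX_CLAUSES) {
                    throw new IllegalStateException("too many clauses: limit is " + MAX_CLAUSES);
                }
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // A toy "index" with 2000 distinct terms sharing the prefix "item".
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("item" + i);
        }
        terms.add("other");

        // A selective prefix stays well under the limit.
        System.out.println(expandPrefix("item19", terms).size()); // 111

        // A broad prefix matches all 2000 terms and exceeds the limit.
        try {
            expandPrefix("item", terms);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this assumption, the number of clauses depends on how many indexed terms match the pattern, not on the query text, so any fixed limit is only as safe as the current contents of the index.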

How to implement a phonetic search using Lucene?

痴心易碎 submitted on 2021-02-07 09:14:07
Question: I want to implement a phonetic search using Lucene 6.1.0, using Soundex or any suitable algorithm for Portuguese. I found many incomplete examples on the internet that show how to implement a custom tokenizer or analyzer, but it seems that the abstract classes used in those examples are not the same in version 6.1.0. Can anyone point me to good documentation on Lucene, not just Javadocs without any further explanation of how to put things together? Thanks in
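
For readers unfamiliar with the algorithm named in this question, the following is a minimal plain-Java sketch of American Soundex. It is not Lucene code and not a Portuguese-specific algorithm. The class `SoundexSketch` is a hypothetical name, and the sketch simply skips any character outside a-z, so accented Portuguese letters such as "ç" or "ã" would need to be normalised first.

```java
public class SoundexSketch {
    // Soundex digit for each letter a..z; '0' marks letters that produce no digit.
    private static final String CODES = "01230120022455012623010202";

    public static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (char raw : word.toCharArray()) {
            char c = Character.toLowerCase(raw);
            if (c < 'a' || c > 'z') {
                continue; // assumption: non-ASCII letters are ignored in this sketch
            }
            char code = CODES.charAt(c - 'a');
            if (out.length() == 0) {
                // The first letter is kept as-is, but its code still suppresses a repeat.
                out.append(Character.toUpperCase(c));
                lastCode = code;
            } else if (c == 'h' || c == 'w') {
                // 'h' and 'w' do not separate letters that share a code.
                continue;
            } else {
                if (code != '0' && code != lastCode) {
                    out.append(code);
                }
                lastCode = code; // vowels reset the run, so a repeated code counts again
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Robert"));   // R163
        System.out.println(soundex("Rupert"));   // R163
        System.out.println(soundex("Ashcraft")); // A261
    }
}
```

Two names that sound alike, such as "Robert" and "Rupert", map to the same four-character code, which is what makes the code usable as an index term for phonetic matching.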

Examples for using latest version of Lucene

夙愿已清 submitted on 2021-02-05 09:26:07
Question: I'm new to Lucene and want to call it directly from my Java code in a Maven environment. I have tried for some time to find working examples that I can download and run. The latest tutorial on the official site dates from 2013 and covers Lucene 3.*: https://cwiki.apache.org/confluence/display/lucene/LuceneFAQ#LuceneFAQ-HowdoIstartusingLucene?. The current latest version in Maven is 8.5.1. Most unofficial tutorials on the web do not give version numbers or fully qualified names. Lucene appears to change

Solr schema for prefix search, howto?

醉酒当歌 submitted on 2021-02-04 21:07:17
Question: I have read many questions on Stack Overflow but didn't find an answer on how to do prefix search in Solr. For example, I have the text "solr documentation is unreadable", and I need to match queries like "solr docu*", "documentation unread*", "unreadable is so*", but not "un* so*". I tried something like this: <fieldType name="prefix_search" class="solr.TextField"> <analyzer> <tokenizer class="solr.LowerCaseTokenizerFactory"/> <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
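
As a rough illustration of what an edge n-gram filter contributes to a schema like the one in this question, the plain-Java sketch below generates the front-edge prefixes of a single token. It does not use Solr. The class `EdgeNGramSketch`, the method `edgeNGrams`, and the `maxGram` value are assumptions made for illustration, since the excerpt is cut off before any maximum gram size appears.

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeNGramSketch {
    // Returns the prefixes of a token with lengths from minGram up to maxGram
    // (capped at the token length), mimicking front-edge n-gram generation.
    public static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Assumption: maxGram = 10 is a made-up value, not taken from the question.
        System.out.println(edgeNGrams("solr", 1, 10)); // [s, so, sol, solr]
        System.out.println(edgeNGrams("documentation", 1, 4)); // [d, do, doc, docu]
    }
}
```

Because each token is indexed as its own set of prefixes, a query term like "docu" can match the token "documentation" without a wildcard. Whether a multi-word query such as "un* so*" matches depends on how the grams of different tokens are positioned and queried, which this sketch does not model.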

Apache Lucene 8.4.1 How to get indexed fields and term list?

三世轮回 submitted on 2021-02-04 19:58:22
Question: I'm new to Apache Lucene and I'm using version 8.4.1. I can index and search with Lucene, but I don't know how to read and print the contents of an index from Java. How do I get the indexed fields and the term list? I was able to get the list of fields with the following function, taken from another Stack Overflow post: public static String[] getFieldNames(IndexReader reader) { List<String> fieldNames = new ArrayList<String>(); //For a simple reader over only one index, reader.leaves() should only return