Adding a document to the index in SOLR: Document contains at least one immense term

I am adding a document to a SOLR index from a Java program, but the add(inputDoc) call throws an exception. The log in solr web interface cont

2 Answers

半阙折子戏

2021-01-11 12:22
I had the same problem and eventually solved it. Check the type of your "text" field; I suspect it is "strings".

You can find it in the managed-schema of the core:
```
<field name="text" type="strings"/>
```
Alternatively, open http://localhost:8983/solr/CORE_NAME/schema/fieldtypes?wt=json (replacing CORE_NAME with the name of your core) and search for "text". If you find something like the following, your "text" field is defined with the strings type:
```
  {
  "name":"strings",
  "class":"solr.StrField",
  "multiValued":true,
  "sortMissingLast":true,
  "fields":["text"],
  "dynamicFields":["*_ss"]},
```
If so, my solution applies to you: change the type from "strings" to "text_general" in managed-schema, then reload the core and reindex. (Make sure the type of "text" in schema.xml is also "text_general".)
```
<field name="text" type="text_general"/>
```
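This should solve the problem. "strings" is a string type (solr.StrField), so the whole field value is indexed as a single term, whereas "text_general" is a text type whose values are tokenized into words, so one long value no longer becomes one immense term.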
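To make the difference concrete, here is a simplified sketch of how the two types might be declared. The analyzer chain shown for text_general is an assumption for illustration (the stock definition in your managed-schema may contain more filters), and the indexed/stored attributes on the field are likewise assumed.

```xml
<!-- String type: the whole field value is indexed as one term, with no analysis -->
<fieldType name="strings" class="solr.StrField" sortMissingLast="true" multiValued="true"/>

<!-- Text type: the value is split into tokens before indexing (simplified sketch) -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The field from the answer above, now pointing at the text type -->
<field name="text" type="text_general" indexed="true" stored="true"/>
```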
自闭症患者

2021-01-11 12:33
You have probably run into the behavior described in LUCENE-5472 [1]: Lucene throws an error if a term is too long. You could:
- use a LengthFilterFactory [2] in the index analyzer to filter out tokens that don't fall within a requested length range
- use a TruncateTokenFilterFactory [3] in the index analyzer to cap the length of indexed tokens
- use a custom UpdateRequestProcessor [4], although this depends on your context

[1] https://issues.apache.org/jira/browse/LUCENE-5472
[2] https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LengthFilterFactory
[3] https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.TruncateTokenFilterFactory
[4] https://wiki.apache.org/solr/UpdateRequestProcessor
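As a minimal sketch of the two analyzer-based options, the field types below apply each filter in the index analyzer. The type names (text_drop_long, text_truncated), the choice of KeywordTokenizerFactory, and the limit of 255 characters are illustrative assumptions, not values from the question; pick the tokenizer and limits that suit your data.

```xml
<!-- Hypothetical type: tokens longer than max characters are dropped at index time -->
<fieldType name="text_drop_long" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="255"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical type: each token is cut down to its first prefixLength characters -->
<fieldType name="text_truncated" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TruncateTokenFilterFactory" prefixLength="255"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

With the first type an over-long value is simply not indexed for that field; with the second it remains searchable by its prefix only. Both filters count characters, while Lucene's term limit is measured in UTF-8 bytes, so the character limit should stay well below the byte limit.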