How to get all terms for a Lucene field in Lucene 4

无人共我 2021-01-04 07:52

I'm trying to update my code from Lucene 3.4 to 4.1. I figured out the changes except one. I have code which needs to iterate over all term values for one field. In Lucene

1 Answer
  • 2021-01-04 08:45

    Please follow the Lucene 4 migration guide:

    How you obtain the enums has changed. The primary entry point is the Fields class. If you know your reader is a single segment reader, do this:

    Fields fields = reader.fields();
    if (fields != null) {
      ...
    }
    

    If the reader might be multi-segment, you must do this:

    Fields fields = MultiFields.getFields(reader);
    if (fields != null) {
      ...
    }
    

    The fields may be null (e.g. if the reader has no fields).

    Note that the MultiFields approach entails a performance hit on MultiReaders, as it must merge terms/docs/positions on the fly. It's generally better to instead get the sequential readers (use oal.util.ReaderUtil) and then step through those readers yourself, if you can (this is how Lucene drives searches).

    If you pass a SegmentReader to MultiFields.getFields, it will simply return reader.fields(), so there is no performance hit in that case.
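
    To make the cost of on-the-fly merging concrete, here is a minimal plain-Java sketch of a k-way merge over sorted per-segment term lists. It is not Lucene's implementation: the class SegmentTermMerge and the method mergeSortedTerms are hypothetical names, and terms are modeled as Strings rather than BytesRef, so the sort order is String order rather than Lucene's byte order. It only illustrates why a multi-segment view has to do extra comparison work for every term it returns.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Lucene.
final class SegmentTermMerge {

    // One cursor per segment: the current term plus the iterator it came from.
    private static final class Cursor implements Comparable<Cursor> {
        final String term;
        final Iterator<String> rest;

        Cursor(String term, Iterator<String> rest) {
            this.term = term;
            this.rest = rest;
        }

        @Override
        public int compareTo(Cursor other) {
            return term.compareTo(other.term);
        }
    }

    // Merges already-sorted per-segment term lists into one sorted list
    // without duplicates. Every returned term costs a heap operation,
    // which is the overhead a single-segment reader avoids.
    static List<String> mergeSortedTerms(List<List<String>> segments) {
        PriorityQueue<Cursor> queue = new PriorityQueue<Cursor>();
        for (List<String> segment : segments) {
            Iterator<String> it = segment.iterator();
            if (it.hasNext()) {
                queue.add(new Cursor(it.next(), it));
            }
        }

        List<String> merged = new ArrayList<String>();
        while (!queue.isEmpty()) {
            Cursor smallest = queue.poll();
            // Skip a term that another segment already contributed.
            if (merged.isEmpty()
                    || !merged.get(merged.size() - 1).equals(smallest.term)) {
                merged.add(smallest.term);
            }
            // Advance the segment the term came from.
            if (smallest.rest.hasNext()) {
                queue.add(new Cursor(smallest.rest.next(), smallest.rest));
            }
        }
        return merged;
    }
}
```

    Stepping through each segment's reader yourself skips this merge entirely, at the price of seeing the same term once per segment that contains it.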

    Once you have a non-null Fields you can do this:

    Terms terms = fields.terms("field");
    if (terms != null) {
      ...
    }
    

    The terms may be null (e.g. if the field does not exist).

    Once you have a non-null Terms you can get an enum like this:

    TermsEnum termsEnum = terms.iterator(null);  // Lucene 4.x: iterator(TermsEnum reuse); pass null for a fresh enum
    

    The returned TermsEnum will not be null.

    You can then call .next() on the TermsEnum repeatedly; it returns null once the terms are exhausted.
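
    Putting the steps together, a minimal sketch for collecting every term of one field might look like the following. It assumes Lucene 4.x on the classpath (from Lucene 5 on, Terms.iterator() takes no argument) and that the terms are UTF-8 text rather than, say, numeric-encoded values. The class name FieldTerms and the method allTerms are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Fields;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper; only the Lucene calls inside it are real API.
final class FieldTerms {

    // Returns all distinct terms of the given field, in index order.
    static List<String> allTerms(IndexReader reader, String field)
            throws IOException {
        List<String> result = new ArrayList<String>();

        // Works for single- and multi-segment readers alike.
        Fields fields = MultiFields.getFields(reader);
        if (fields == null) {
            return result; // reader has no fields
        }

        Terms terms = fields.terms(field);
        if (terms == null) {
            return result; // field does not exist
        }

        TermsEnum termsEnum = terms.iterator(null);
        BytesRef term;
        while ((term = termsEnum.next()) != null) {
            // The BytesRef may be reused by the enum, so copy it out
            // (here as a String) before advancing.
            result.add(term.utf8ToString());
        }
        return result;
    }
}
```

    For a large index it is usually better to process each term inside the loop instead of collecting them all into a list first.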
