Get search word hits (number of occurrences) per document in Lucene

执笔经年 2021-01-23 23:18

Can anyone suggest the best way to get the hits (number of occurrences) of a word per document in Lucene?

2 answers
  •  -上瘾入骨i
    2021-01-24 00:01

    Lucene's index is field-based rather than document-based, so per-document term counts have to be assembled from the per-field data:

    1. Iterate over the documents using IndexReader.document(), skipping deleted ones with isDeleted().
    2. In document d, iterate over its fields using Document.getFields().
    3. For each field f, get its terms and their frequencies using getTermFreqVector().
    4. Go over each term vector and sum the frequencies per term.
    5. Summing the per-field term frequencies across all fields gives the document's term frequency vector.
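
Steps 4 and 5 above can be sketched in plain Java. This is only a minimal illustration, not a definitive implementation: it makes no Lucene calls, and each per-field map simply stands in for the terms and frequencies that getTermFreqVector() would return for one field (which, as far as I know, requires term vectors to have been stored at index time). The class name DocumentTermCounts and the method sumFieldVectors are hypothetical names made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: it only shows the summation step.
final class DocumentTermCounts {

    /**
     * Sums per-field term frequencies into one per-document map.
     * Each input map is assumed to hold term -> frequency for a single field.
     */
    static Map<String, Integer> sumFieldVectors(List<Map<String, Integer>> fieldVectors) {
        // TreeMap keeps the terms sorted, which makes the result easy to read.
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Integer> vector : fieldVectors) {
            for (Map.Entry<String, Integer> entry : vector.entrySet()) {
                // Add this field's frequency to whatever was counted so far.
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up frequencies for two fields of one document.
        Map<String, Integer> title = Map.of("lucene", 1, "index", 1);
        Map<String, Integer> body = Map.of("lucene", 3, "term", 2);

        // Prints {index=1, lucene=4, term=2}
        System.out.println(sumFieldVectors(List.of(title, body)));
    }
}
```

The number of hits for a single search word in that document is then just a lookup in the resulting map, for example totals.get("lucene"), with a missing key meaning zero occurrences.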
