Lucene uses a field-based, rather than document-based, index.
In order to get term counts per document:
- Iterate over document IDs from 0 to IndexReader.maxDoc() - 1, skipping deleted ones with isDeleted() and loading the rest with IndexReader.document().
- In document d, iterate over fields using Document.getFields().
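- For each field f, get its terms using IndexReader.getTermFreqVector(docId, fieldName). This returns null unless term vectors were stored for that field at index time.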
- Go over each field's term vector and add every term's frequency to a running total for that term.
- Once all fields are processed, the per-term totals are the document's term frequency vector.
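
The summing step above can be sketched in plain Java. This is only an illustration, not Lucene code: `DocTermCounts`, `FieldVector` and `sumAcrossFields` are made-up names, and the two arrays stand in for what I believe `TermFreqVector.getTerms()` and `getTermFrequencies()` return in the Lucene versions that have `getTermFreqVector()`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DocTermCounts {

    /** Hypothetical stand-in for one field's term vector: parallel arrays of terms and counts. */
    public static final class FieldVector {
        final String[] terms;
        final int[] freqs;

        public FieldVector(String[] terms, int[] freqs) {
            this.terms = terms;
            this.freqs = freqs;
        }
    }

    /** Sums each term's frequency over all fields of a single document. */
    public static Map<String, Integer> sumAcrossFields(List<FieldVector> fields) {
        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (FieldVector fv : fields) {
            if (fv == null) {
                continue; // field had no stored term vector
            }
            for (int i = 0; i < fv.terms.length; i++) {
                // Add this field's count to the running total for the term.
                totals.merge(fv.terms[i], fv.freqs[i], Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up data: a "title" field and a "body" field of one document.
        FieldVector title = new FieldVector(new String[] {"index", "lucene"}, new int[] {1, 1});
        FieldVector body = new FieldVector(new String[] {"lucene", "term"}, new int[] {2, 3});

        System.out.println(sumAcrossFields(Arrays.asList(title, body)));
        // prints {index=1, lucene=3, term=3}
    }
}
```

In a real index you would build one such list per document inside the document loop, calling getTermFreqVector() once per field name and skipping the null results.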