I know its possible to get the top terms within a Lucene Index, but is there a way to get the top terms based on a subset of a Lucene index?
I.e. What are the top terms
Counting up the TermVectors will work, but will be slow if there are a lot of documents to iterate. Also note if you mean docFreq by top terms, then don't use the count in the TermFreqVector just count the terms as binary.
Alternatively, you could iterate the terms like facet counts. Use a cached filter for every term; their BitSets can be used for a fast intersection count.