Hi
I have a Lucene index that is frequently updated with new records. It holds 5,000,000 records, and I'm caching one of my numeric fields using FieldCache. but af
The FieldCache uses weak references to index readers as keys for its cache (obtained by calling IndexReader.GetCacheKey, which has been un-obsoleted). A standard call to IndexReader.Open with an FSDirectory will use a pool of readers, one for every segment.
You should always pass the innermost reader to the FieldCache. Check out ReaderUtil for some helpers that retrieve the individual reader a document is contained within. Document ids won't change within a segment. What people mean when they describe document ids as unpredictable/volatile is that they can change between two index commits: deleted documents may have been pruned, segments may have been merged, and so on.
For a cached segment to be released, a commit needs to remove the segment from disk (because it was merged or optimized away). New readers then no longer include the pooled segment reader, and garbage collection will reclaim it as soon as all older readers are closed.
Never, ever, call FieldCache.PurgeAllCaches(). It's meant for testing, not production use.
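Instead of purging, the usual way to let old cache entries expire is to reopen the reader and close the previous one. The following is only a minimal sketch, assuming a Lucene.NET 2.9-era API where IndexReader exposes Reopen() and Close(); the ReaderRefresher class and its Refresh method are hypothetical names made up for this illustration, and it assumes no other thread is still searching with the old reader.

```csharp
using Lucene.Net.Index;

// Hypothetical helper, not part of Lucene.NET.
public static class ReaderRefresher
{
    // Returns a reader that reflects the latest commit.
    public static IndexReader Refresh(IndexReader current)
    {
        // Reopen() returns the same instance when the index has not changed.
        // Otherwise it returns a new reader that reuses the segment readers
        // of unchanged segments, so their FieldCache entries remain usable.
        IndexReader reopened = current.Reopen();
        if (reopened != current)
        {
            // Assumption: nothing else is still using the old reader.
            // Closing it lets readers of removed segments be collected.
            current.Close();
        }
        return reopened;
    }
}
```

Calling something like reader = ReaderRefresher.Refresh(reader) after new records are committed should mean that only new or merged segments need their field values loaded again.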
Added 2011-04-03; example code using subreaders.
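using System.Collections.Generic;
using System.IO;
using System.Linq;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;

var directory = FSDirectory.Open(new DirectoryInfo("index"));
var reader = IndexReader.Open(directory, readOnly: true);
var documentId = 1337;
// Grab all subreaders (one per segment).
var subReaders = new List<IndexReader>();
ReaderUtil.GatherSubReaders(subReaders, reader);
// Loop through the subreaders. While subReaderId is not a valid document id
// in the current subreader (valid ids are 0 .. MaxDoc() - 1), subtract that
// subreader's MaxDoc() and go to the next one.
var subReaderId = documentId;
var subReader = subReaders.First(sub => {
    if (subReaderId >= sub.MaxDoc()) {
        subReaderId -= sub.MaxDoc();
        return false;
    }
    return true;
});
// subReaderId is now the document id relative to subReader.
var values = FieldCache_Fields.DEFAULT.GetInts(subReader, "newsdate");
var value = values[subReaderId];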