In Lucene, how can I find out if the IndexSearcher or IndexWriter is being used in another thread or not?

一整个雨季 2021-02-11 05:36

Lucene documentation states that single instances of IndexSearcher and IndexWriter should be used for each index in the whole application, and across all threads. Also, writes t

1 answer
  • 2021-02-11 06:31

    Thankfully, recent versions (3.x and late 2.x) include a method that tells you whether anything has been written since the searcher was opened: IndexReader.isCurrent() returns false once the index has changed since the reader was opened. So you can write a simple wrapper class that encapsulates both reading and writing and, with a little synchronization, have one class manage all of this across your threads.

    Here is roughly what I do:

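      import java.util.List;
      import java.util.concurrent.atomic.AtomicInteger;

      import org.apache.lucene.document.Document;
      import org.apache.lucene.index.IndexWriter;
      import org.apache.lucene.search.IndexSearcher;

      public class ArchiveIndex {
          private IndexSearcher searcher;
          private final AtomicInteger activeSearches = new AtomicInteger(0);
          private IndexWriter writer;
          private final AtomicInteger activeWrites = new AtomicInteger(0);
    
          public List<Document> search( ... ) {
              IndexSearcher current;
              synchronized( this ) {
                  // Drop the searcher only if the index has changed and no search is in flight.
                  if( searcher != null && !searcher.getIndexReader().isCurrent() && activeSearches.get() == 0 ) {
                     searcher.close();
                     searcher = null;
                  }
    
                  if( searcher == null ) {
                      searcher = new IndexSearcher(...);
                  }
                  current = searcher;
                  // Count this search while still holding the lock, so that no other
                  // thread can close the searcher before we start using it.
                  activeSearches.incrementAndGet();
              }
    
              try {
                  // do your searching with 'current' and return the results
              } finally {
                  activeSearches.decrementAndGet();
              }
          }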
    
    
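          public void addDocuments( List<Document> docs ) {
              synchronized( this ) {
                 if( writer == null ) {
                     writer = new IndexWriter(...);
                 }
                 // Register as an active writer under the lock, so that another thread
                 // finishing its writes cannot close the writer before we use it.
                 activeWrites.incrementAndGet();
              }
              try {
                  // do your writes here.
              } finally {
                  synchronized( this ) {
                      // The last writer out closes the IndexWriter, which commits the changes.
                      if( activeWrites.decrementAndGet() == 0 ) {
                          writer.close();
                          writer = null;
                      }
                  }
              }
          }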
      }
    

    So I have a single class that I use for both reading and writing. Notice that it allows writing and reading at the same time, and multiple readers can search concurrently. The only synchronization is the quick check to see whether the searcher or writer needs to be reopened. I didn't synchronize at the method level, because that would allow only one reader or writer at a time, which would be bad for performance.

    If there are active searches in flight, you can't drop the searcher, so when lots of readers keep coming in they simply search without seeing the latest changes. Once traffic thins out, the next lone searcher reopens the dirty searcher. This works well for lower-volume sites that see pauses in traffic, but it can still cause starvation (i.e. you keep reading older and older results). You could add logic to stop and reinitialize once the time since the searcher was first noticed dirty exceeds X, and otherwise stay lazy as it is now. That way searches should never be more than about X out of date.
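
    The "no older than X" idea could be sketched roughly like this. It is only an illustration and deliberately Lucene-free: StalenessPolicy, Decision and the method names are made up for this example, and the caller is assumed to pass in the result of isCurrent(), the number of in-flight searches and the current time.

    ```java
    /** Hypothetical helper (not part of Lucene): decides when a dirty searcher must be reopened. */
    final class StalenessPolicy {
        enum Decision { KEEP, REOPEN_NOW, DRAIN_THEN_REOPEN }

        private final long maxStaleMillis;
        private long dirtySinceMillis = -1; // -1 means "not known to be dirty"

        StalenessPolicy(long maxStaleMillis) {
            this.maxStaleMillis = maxStaleMillis;
        }

        /**
         * @param isCurrent      what isCurrent() reported for the open searcher
         * @param activeSearches number of searches currently in flight
         * @param nowMillis      current time, e.g. System.currentTimeMillis()
         */
        synchronized Decision check(boolean isCurrent, int activeSearches, long nowMillis) {
            if (isCurrent) {
                dirtySinceMillis = -1;
                return Decision.KEEP;
            }
            if (dirtySinceMillis < 0) {
                dirtySinceMillis = nowMillis; // first time we notice the index changed
            }
            if (activeSearches == 0) {
                return Decision.REOPEN_NOW; // the lazy path: nobody is using the old searcher
            }
            if (nowMillis - dirtySinceMillis >= maxStaleMillis) {
                return Decision.DRAIN_THEN_REOPEN; // stale for too long: stop admitting searches, then reopen
            }
            return Decision.KEEP; // still within the allowed staleness window
        }

        /** Call after the searcher has actually been reopened. */
        synchronized void reopened() {
            dirtySinceMillis = -1;
        }
    }
    ```

    On DRAIN_THEN_REOPEN the wrapper would have to make new searches wait until activeSearches drops to zero and then reopen; how you block them (wait/notify, a read-write lock, etc.) is up to you.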

    Writers can be handled in much the same way. As I recall, I close the writer periodically so that the changes are committed and the reader notices them. I didn't do a very good job of describing that, but it works much like searching: if there are active writers, you can't close the writer, and if you're the last writer out the door, you close it. You get the idea.
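
    The "last one out the door closes it" rule can be written once as a small reference-counting holder. Again, this is just a sketch under my own assumptions: SharedResource, acquire and release are hypothetical names, and the opener is assumed to wrap whatever constructor call you use (for an IndexWriter you would have to handle its checked exceptions inside the supplier).

    ```java
    import java.io.Closeable;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.util.function.Supplier;

    /** Hypothetical holder: opens a resource lazily and closes it when the last user releases it. */
    final class SharedResource<T extends Closeable> {
        private final Supplier<T> opener;
        private T resource; // guarded by 'this'
        private int users;  // guarded by 'this'

        SharedResource(Supplier<T> opener) {
            this.opener = opener;
        }

        /** Opens the resource if needed and registers the caller, all under one lock. */
        synchronized T acquire() {
            if (resource == null) {
                resource = opener.get();
            }
            users++;
            return resource;
        }

        /** Unregisters the caller; the last one out closes the resource. */
        synchronized void release() {
            if (users == 0) {
                throw new IllegalStateException("release() without matching acquire()");
            }
            users--;
            if (users == 0) {
                T toClose = resource;
                resource = null;
                try {
                    toClose.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }
    ```

    Each write would then call acquire(), do its work in a try block, and call release() in the finally block. Doing the count and the open/close check under the same lock is what keeps one thread from closing the resource while another is about to use it.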
