Question
I am writing a shell script (csh) that has to determine the Lucene index version and then upgrade the index to the next version. For example, if the indices are on 2.x, I have to upgrade them to 3.x. Ultimately the indices need to be upgraded all the way to 6.x.
Since upgrading indices is a sequential process (2.x -> 3.x -> 4.x -> 5.x -> 6.x), I have to know the index version beforehand so that I can set the classpath properly for each upgrade.
How can I determine the index version?
Answer 1:
This is not a very clean solution, but it is all I was able to find via SegmentInfos, whose file-format description includes:
LuceneVersion --> Which Lucene code Version was used for this commit, written as three vInt: major, minor, bugfix
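To make "written as three vInt" concrete, here is a minimal sketch of decoding three consecutive vInts (7 payload bits per byte, least significant group first, high bit set when another byte follows). The class name, method names, and sample bytes are hypothetical. In a real segments_N file these bytes sit after other header data, and locating that offset is not shown here.

```java
final class VIntVersionSketch {

    // Decodes `count` consecutive vInts starting at the beginning of `data`.
    static int[] decodeVInts(byte[] data, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int value = 0;
            int shift = 0;
            while (true) {
                // A vInt holding an int never needs more than 5 bytes.
                if (pos >= data.length || shift > 28) {
                    throw new IllegalArgumentException("malformed vInt at byte " + pos);
                }
                int b = data[pos++] & 0xFF;
                value |= (b & 0x7F) << shift;   // low 7 bits carry the payload
                if ((b & 0x80) == 0) {          // high bit clear: last byte of this vInt
                    break;
                }
                shift += 7;
            }
            values[i] = value;
        }
        return values;
    }

    // Reads major, minor, bugfix as three vInts and joins them with dots.
    static String readVersion(byte[] data) {
        int[] v = decodeVInts(data, 3);
        return v[0] + "." + v[1] + "." + v[2];
    }

    public static void main(String[] args) {
        // Hypothetical bytes: 6, 5 and 0 each fit in a single vInt byte.
        byte[] sample = {0x06, 0x05, 0x00};
        System.out.println(readVersion(sample));
    }
}
```

Parsing the file by hand like this is fragile across Lucene releases, which is why the approach that follows goes through the reader instead.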
When you create your IndexReader, it is one of the concrete reader classes, such as StandardDirectoryReader. This class has a toString() method, shown below, that prints the Lucene version of each segment, so you can simply call toString() on the IndexReader instance.
    @Override
    public String toString() {
        final StringBuilder buffer = new StringBuilder();
        buffer.append(getClass().getSimpleName());
        buffer.append('(');
        final String segmentsFile = segmentInfos.getSegmentsFileName();
        if (segmentsFile != null) {
            buffer.append(segmentsFile).append(":").append(segmentInfos.getVersion());
        }
        if (writer != null) {
            buffer.append(":nrt");
        }
        for (final LeafReader r : getSequentialSubReaders()) {
            buffer.append(' ');
            buffer.append(r);
        }
        buffer.append(')');
        return buffer.toString();
    }
I guess a single version for the whole index doesn't make sense, since an index might also contain documents committed by writers from previous versions.
Documents committed by writers from an older Lucene version can be searched with readers from the latest version, provided the version distance stays within what Lucene supports.
You could write some simple logic in core Java, using a regex, to extract the highest Lucene version and treat it as your index version.
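A minimal sketch of that regex logic follows. It assumes each segment in the reader's toString() output carries its version in parentheses, as in the sample string, which is hypothetical. The exact format can differ between Lucene releases, so check it against real output first. The class and method names are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HighestSegmentVersion {

    // Matches "(major.minor)" or "(major.minor.bugfix)", the assumed per-segment form.
    private static final Pattern VERSION =
            Pattern.compile("\\((\\d+)\\.(\\d+)(?:\\.(\\d+))?\\)");

    // Returns the highest version found as "major.minor.bugfix", or null if none matched.
    static String highestVersion(String readerDescription) {
        int[] best = null;
        Matcher m = VERSION.matcher(readerDescription);
        while (m.find()) {
            int[] current = {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                m.group(3) == null ? 0 : Integer.parseInt(m.group(3))
            };
            if (best == null || compare(current, best) > 0) {
                best = current;
            }
        }
        return best == null ? null : best[0] + "." + best[1] + "." + best[2];
    }

    // Compares component by component, numerically, so 4.10 sorts above 4.9.
    private static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Hypothetical reader description; in practice pass reader.toString().
        String sample = "StandardDirectoryReader(segments_3:7 _0(5.5.4):C10 _1(6.5.0):c5)";
        System.out.println(highestVersion(sample));
    }
}
```

The components are compared as numbers rather than as strings because a plain string comparison would rank 4.9 above 4.10.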
Answer 2:
This is a piece of code I wrote to print the index version.
    import java.io.IOException;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexFormatTooNewException;
    import org.apache.lucene.index.IndexFormatTooOldException;
    import org.apache.lucene.store.SimpleFSDirectory;
    import org.junit.Test;

    public class TestReindex {

        @Test
        public void testVersion() throws IOException {
            Path path = Paths.get("<Path_to_index_files>");
            // open(Directory) is a static method declared on DirectoryReader.
            try (DirectoryReader reader = DirectoryReader.open(new SimpleFSDirectory(path))) {
                // Pull the first "lucene.version=..." entry out of the reader description.
                Pattern pattern = Pattern.compile("lucene\\.version=(.*?),");
                Matcher matcher = pattern.matcher(reader.toString());
                if (matcher.find()) {
                    System.out.println("Current version: " + matcher.group(1));
                }
            } catch (IndexFormatTooOldException ex) {
                // The index is older than this Lucene version can read.
                System.out.println("Current version: " + ex.getVersion());
                System.out.println("Min Version: " + ex.getMinVersion());
                System.out.println("Max Version: " + ex.getMaxVersion());
            } catch (IndexFormatTooNewException ex) {
                // The index is newer than this Lucene version can read.
                System.out.println("Current version: " + ex.getVersion());
                System.out.println("Min Version: " + ex.getMinVersion());
                System.out.println("Max Version: " + ex.getMaxVersion());
            }
        }
    }
If you try to read an index that is too new or too old for the version of Lucene being used, an exception is thrown. These exceptions carry version information, which you can use to decide what to do next.
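Once a version string is known, the sequential path described in the question (2.x -> 3.x -> ... -> 6.x) can be listed mechanically. This is only a sketch: the class name, the method name, and the default "3.6.2" input are hypothetical, and it prints the hops rather than performing any upgrade.

```java
import java.util.ArrayList;
import java.util.List;

final class UpgradePathSketch {

    // Lists each major-version hop from the detected version up to targetMajor.
    static List<String> upgradeSteps(String detectedVersion, int targetMajor) {
        // Only the leading component (the major version) matters for the path.
        int major = Integer.parseInt(detectedVersion.split("\\.")[0]);
        List<String> steps = new ArrayList<>();
        for (int from = major; from < targetMajor; from++) {
            steps.add(from + ".x -> " + (from + 1) + ".x");
        }
        return steps;
    }

    public static void main(String[] args) {
        // Hypothetical default; pass the detected version as the first argument.
        String detected = args.length > 0 ? args[0] : "3.6.2";
        for (String step : upgradeSteps(detected, 6)) {
            System.out.println(step);
        }
    }
}
```

A csh script could read this output line by line and switch the classpath to the matching Lucene jars before each hop.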
Source: https://stackoverflow.com/questions/44155910/how-to-determine-the-lucene-index-version