Question
I have a large Apache Jena TDB store and want to build a Lucene index for it using Apache Jena 2.10.2, for use with the new text search feature. I find the documentation hard to follow.
I first tried to configure this in code, but had trouble with the dependencies. Every combination of lucene-core and solr-solrj I tried resulted in either 'classNotFound' errors or a 'StandardAnalyzer overrides final method tokenStream' error. Example code:
Dataset ds1 = DatasetFactory.createMem() ;
EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label) ;
Directory dir = new RAMDirectory();
// Have also tried creating the index in a file (in place of the RAMDirectory above):
// File indexDir = new File("luceneIndexes");
// Directory dir = FSDirectory.open(indexDir);
// Fails on this line
Dataset ds = TextDatasetFactory.createLucene(ds1, dir, entDef) ;
I think the only solution may be to create a Text Dataset Assembler, but if anyone has advice on doing this in code, I would prefer to do it that way.
Answer 1:
The example is exactly the one from Jena, which does work.
It looks like you have a mix of conflicting jar versions. Have you tried using Maven to resolve the dependencies? Running "mvn dependency:tree" shows you which versions are used.
jena-text is built for Lucene 4.3.1 or Solr 4.3.1.
See the POM from: https://repository.apache.org/content/groups/snapshots/org/apache/jena/jena-text/1.0.0-SNAPSHOT/
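If Maven is not an option, a quick way to confirm a jar mix-up is to ask the JVM where each class was actually loaded from. The following is only a minimal sketch using the standard library; the JarLocator class and its locate method are hypothetical names made up for this illustration, and the class names listed in main are assumptions about what is relevant on your classpath.

```java
import java.security.CodeSource;

public class JarLocator {
    // Hypothetical helper: returns the jar (or directory) a class was loaded from,
    // or a short note if the class cannot be loaded or has no code source.
    public static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null) {
                return "(no code source, probably a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers VerifyError, e.g. "overrides final method"
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Assumed class names; adjust to whatever appears in your stack trace.
        String[] names = {
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.lucene.analysis.standard.StandardAnalyzer",
            "org.apache.jena.query.text.TextDatasetFactory"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the two Lucene classes resolve to jars with different version numbers, that would be consistent with the 'overrides final method' error described in the question.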
Source: https://stackoverflow.com/questions/17954399/creating-a-lucene-index-for-an-existing-apache-jena-tdb-to-implement-text-search