How I can load a model in TDB TripleStore

独自空忆成欢 提交于 2019-12-01 11:33:55

问题


I have a question for you:

I would like to load a file on my Jena TDB TripleStore. My file is very big, about 80Mb and about 700000 triples RDF. When I try to load it, the execution stops working or takes a very long time.

I'm using this code that I do run on a Web Service:

        String file = "C:\\file.nt";
        String directory;
        directory = "C:\\tdb";
        Dataset dataset = TDBFactory.createDataset(directory);

        Model model = ModelFactory.createDefaultModel();

        TDBLoader.loadModel(model, file );
        dataset.addNamedModel("http://nameFile", model); 

        return model;

Sometimes I get an error of Java heap space:

Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:170)
    at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:86)
    at org.apache.jena.atlas.iterator.PeekIterator.fill(PeekIterator.java:50)
    at org.apache.jena.atlas.iterator.PeekIterator.next(PeekIterator.java:92)
    at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:99)
    at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:67)
    at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
    at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTFactoryImpl$1.read(RDFParserRegistry.java:142)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:859)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:255)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:241)
    at org.apache.jena.riot.adapters.RDFReaderRIOT_Web.read(RDFReaderRIOT_Web.java:96)
    at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:241)
    at com.hp.hpl.jena.tdb.TDBLoader.loadAnything(TDBLoader.java:294)
    at com.hp.hpl.jena.tdb.TDBLoader.loadModel(TDBLoader.java:125)
    at com.hp.hpl.jena.tdb.TDBLoader.loadModel(TDBLoader.java:119)

How I can load this file in a model Jena and save it in TDB? Thanks in advance.


回答1:


You need to allocate more memory for your JVM at statup. When you have too little, the process will spend too much time performing garbage collection, and will ultimately fail.

For example, start your JVM with 4 GB of memory by:

java -Xms4G -XmxG

If you are in an IDE such as Eclipse, you can change your run configuration so that the application has additional memory as well.

Aside from that, the only change that jumps out at me is that you are using an in-memory model for the actual loading operation, when you can actually use a model backed by TDB instead. This can help to alleviate your memory problems because TDB dynamially moves its indexes to disk.

Change:

Dataset dataset = TDBFactory.createDataset(directory);
Model model = ModelFactory.createDefaultModel();
TDBLoader.loadModel(model, file );
dataset.addNamedModel("http://nameFile", model);

to this:

Dataset dataset = TDBFactory.createDataset(directory);
Model model = dataset.getNamedModel("http://nameFile");
TDBLoader.loadModel(model, file );

Now your system depends on TDB's ability to make good decisions about when to leave data in memory and when to flush it to disk.



来源:https://stackoverflow.com/questions/25850288/how-i-can-load-a-model-in-tdb-triplestore

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!