Generating TDB Dataset from archive containing N-TRIPLES files

Submitted by 人走茶凉 on 2019-12-02 16:28:24

Question


Apologies, in advance, for a possible duplicate.

I have an archive containing 117,426 files (each in the N-TRIPLES format) that I wish to load into the default graph of a TDB dataset. Due to the large number of files, I need to be able to perform this import without manually selecting individual files for upload.

I am in Bash, with Jena and Fuseki distributions at my disposal.

If possible, I want to avoid the worst-case scenario of writing a Java application to do this. If I do have to write one, what hooks exist in RIOT/TDB to perform programmatic bulk-loading?


Answer 1:


As a general comment, one approach is to concatenate the N-Triples files into a single file.
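Concatenation works because N-Triples is line-based, with one triple per line. A minimal sketch of that step follows; the function name concat_ntriples and the paths are hypothetical, and it assumes the archive has already been extracted and that the files end in .nt. It uses "awk 1" rather than plain cat so that a file missing its final newline cannot run into the first triple of the next file.

```shell
# Hypothetical helper: concatenate every .nt file under a directory into one file.
# "awk 1" prints each input line followed by a newline, which repairs a missing
# trailing newline at the end of any individual file.
concat_ntriples() {
  src_dir=$1   # directory holding the extracted archive (assumed layout)
  out_file=$2  # combined file to create; keep it outside src_dir
  find "$src_dir" -type f -name '*.nt' -exec awk 1 {} + > "$out_file"
}

# Example use (paths are placeholders):
#   concat_ntriples ./extracted all.nt
#   tdbloader --loc DB all.nt
```

Because find batches the file names itself via "-exec ... {} +", the shell never has to place all of them on one command line.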

You can load many files at once with either tdbloader or tdbloader2.

tdbloader --loc DB ... your files ...

Passing all 117,426 file names in a single command-line invocation may exceed your OS's argument-length limit. Instead, you can pipe the files into tdbloader, which is equivalent to concatenating them first:

... | tdbloader --loc DB -- -

where ... is some way to get bash to cat the files (possibly from a subshell).

For example (you'll need to adjust the glob so that it finds all 117,426 files):

( for x in data*.nt
  do
    cat "$x"
  done
) | tdbloader --loc DB -- -
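If the extracted files sit in nested directories, or have names that a simple glob does not match, a find-based variant of the same pipeline may be easier to adapt. This is only a sketch: the function name stream_ntriples and the directory name are hypothetical, the .nt suffix is an assumption, and -print0 / -0 are extensions available in the GNU and BSD versions of find and xargs.

```shell
# Hypothetical helper: write the contents of every .nt file under a directory to stdout.
# -print0 / -0 pass file names NUL-delimited, so spaces in names are handled safely,
# and xargs splits the list into as many cat invocations as the OS limit requires.
stream_ntriples() {
  find "$1" -type f -name '*.nt' -print0 | xargs -0 cat
}

# Example use (directory name is a placeholder):
#   stream_ntriples ./extracted | tdbloader --loc DB -- -
```

The output order follows whatever order find returns, which does not matter for loading triples into a single graph.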


Source: https://stackoverflow.com/questions/25730414/generating-tdb-dataset-from-archive-containing-n-triples-files
