Parse large RDF in Python

故里飘歌 2021-02-02 16:07

I'd like to parse a very large (about 200 MB) RDF file in Python. Should I be using SAX or some other library? I'd appreciate some very basic code that I can build on, say to r

6 answers
  •  迷失自我
    2021-02-02 16:08

    LightRDF is a very fast library for parsing RDF files. It can be installed via pip, and code examples can be found on the project page.

    If you want to parse triples from a gzipped RDF file, you can do it like this:

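    import gzip

    import lightrdf

    RDF_FILENAME = 'data.rdf.gz'

    # Open the gzipped file as a binary stream so it is decompressed on the fly.
    with gzip.open(RDF_FILENAME, 'rb') as f:
        doc = lightrdf.RDFDocument(f, parser=lightrdf.xml.PatternParser)
        # None acts as a wildcard, so this pattern matches every triple.
        for (s, p, o) in doc.search_triples(None, None, None):
            print(s, p, o)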
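
Since the question also mentions SAX, here is a minimal standard-library sketch for comparison. It is not a full RDF parser and does not produce triples: it only streams through an RDF/XML file and reports every `rdf:about` value it sees, which keeps memory use flat on large inputs. The names `AboutHandler`, `stream_subjects` and `SAMPLE` are made up for this illustration, and the sample document is invented.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class AboutHandler(ContentHandler):
    """Calls on_subject(iri) for each element carrying an rdf:about attribute."""

    def __init__(self, on_subject):
        super().__init__()
        self.on_subject = on_subject

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, attribute keys are (namespace, local name).
        about = attrs.get((RDF_NS, "about"))
        if about is not None:
            self.on_subject(about)


def stream_subjects(stream, on_subject):
    """Parse a binary RDF/XML stream incrementally, never loading it whole."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(AboutHandler(on_subject))
    parser.parse(stream)


# Tiny invented document, used only to exercise the sketch.
SAMPLE = b"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/a"><dc:title>A</dc:title></rdf:Description>
  <rdf:Description rdf:about="http://example.org/b"><dc:title>B</dc:title></rdf:Description>
</rdf:RDF>"""

stream_subjects(io.BytesIO(SAMPLE), print)

# For a gzipped file, the same function should accept the stream directly:
#   with gzip.open('data.rdf.gz', 'rb') as f:
#       stream_subjects(f, print)
```

Passing a callback instead of collecting results in a list is deliberate: for a file of around 200 MB, accumulating every value would defeat the purpose of streaming. For actual triples, a dedicated RDF library such as the one shown above is the more practical route.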
    
