I'd like to parse a very large (about 200MB) RDF file in Python. Should I be using SAX or some other library? I'd appreciate some very basic code that I can build on, say to r
For RDF processing in Python, consider using an RDF library such as RDFLib. If you also need a triplestore, heavier-weight options exist (PySesame, neo4jrdf with neo4jpy), though they may not be needed here.
Before writing your own SAX parser for RDF, check out rdfxml.py:
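import rdfxml
# Read the whole file into memory; the with-block closes it afterwards.
with open('data.rdf', 'r') as f:
    data = f.read()
# Hand the RDF/XML text to rdfxml.py's parser.
rdfxml.parseRDF(data)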
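If you do end up going the SAX route, the standard library's xml.sax can stream through the file without loading it all at once. The sketch below is not a full RDF/XML parser: it only collects rdf:about values, and the SubjectCollector class, the collect_subjects helper, and the sample data are hypothetical names made up for illustration.

```python
import io
import xml.sax
import xml.sax.handler

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Made-up sample data so the example runs without an external file.
SAMPLE = b"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/book/1">
    <dc:title>First</dc:title>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/book/2">
    <dc:title>Second</dc:title>
  </rdf:Description>
</rdf:RDF>
"""


class SubjectCollector(xml.sax.ContentHandler):
    """Records every rdf:about attribute seen while streaming the document."""

    def __init__(self):
        super().__init__()
        self.subjects = []

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, attribute keys are
        # (namespace_uri, local_name) tuples.
        about = attrs.get((RDF_NS, "about"))
        if about is not None:
            self.subjects.append(about)


def collect_subjects(source):
    """Stream-parse a filename or binary file object and return rdf:about values."""
    parser = xml.sax.make_parser()
    parser.setFeature(xml.sax.handler.feature_namespaces, True)
    handler = SubjectCollector()
    parser.setContentHandler(handler)
    parser.parse(source)
    return handler.subjects


subjects = collect_subjects(io.BytesIO(SAMPLE))
print(subjects)
```

For a real 200MB file you would pass the filename to collect_subjects and probably process each item inside the handler rather than accumulating everything in a list.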