Question
I have some RDF files that I want to import into a triplestore (AllegroGraph), but on the first file I get a SAX parser error stating that there is an unrecognized character. After removing the line in question, the import works. I then ran the W3C RDF validator and Jena on the RDF with the offending line still in it, but all I got were some warnings about undefined languages (nothing at all about the offending line). Could you please suggest a method (Java if possible) for finding errors in RDF files?
Edit: The line in question is:
<gn:alternateName xml:lang="got">𐌰𐍆𐌲𐌰𐌽𐌹𐍃𐍄𐌰𐌽</gn:alternateName>
Answer 1:
You can use Sesame's Rio parser to do validation. There are instructions in this blog post on how to work with Rio in general. For validation specifically, the trick is to create and attach a ParseErrorListener, which receives detailed warnings and errors from the parser.
However, since you mention that the problem you encounter is at the level of SAX / XML, you could also just use a generic XML validator to see what's wrong. The most likely cause (but it's hard to tell without more details) is that you have an incorrectly encoded character in there somewhere.
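If you want to try the generic XML route from Java, here is a minimal sketch that uses only the JDK, not Sesame or AllegroGraph. It first decodes the bytes as strict UTF-8 (this assumes the file is meant to be UTF-8, so a file that declares another encoding would give a false alarm here) and then runs the JDK's SAX parser to check well-formedness. The class name `RdfXmlCheck` and the method `findProblems` are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any RDF library.
public class RdfXmlCheck {

    /** Returns human-readable problems; the list is empty if none were found. */
    public static List<String> findProblems(byte[] data) {
        List<String> problems = new ArrayList<>();

        // Step 1: strict UTF-8 decoding. REPORT makes the decoder throw on
        // malformed bytes instead of silently substituting U+FFFD.
        // Assumption: the file is supposed to be UTF-8.
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
        } catch (CharacterCodingException e) {
            problems.add("not valid UTF-8: " + e);
        }

        // Step 2: XML well-formedness check with the JDK's SAX parser.
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new ByteArrayInputStream(data), new DefaultHandler() {
                @Override
                public void warning(SAXParseException e) {
                    problems.add(describe("warning", e));
                }

                @Override
                public void error(SAXParseException e) {
                    problems.add(describe("error", e));
                }
                // fatalError is inherited: DefaultHandler rethrows the exception.
            });
        } catch (SAXParseException e) {
            problems.add(describe("fatal", e));
        } catch (Exception e) {
            // I/O or configuration failures, including byte-level decoding errors.
            problems.add("could not parse: " + e);
        }
        return problems;
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java RdfXmlCheck <file.rdf>");
            return;
        }
        List<String> problems = findProblems(Files.readAllBytes(Paths.get(args[0])));
        if (problems.isEmpty()) {
            System.out.println("no encoding or well-formedness problems found");
        }
        for (String p : problems) {
            System.out.println(p);
        }
    }
}
```

Note that the Gothic letters in the quoted line lie outside the Basic Multilingual Plane, so each one takes four bytes in UTF-8 and a surrogate pair in Java strings. If this check reports nothing for the file, the data itself is probably fine and the problem may be in how the importing tool handles such characters.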
Source: https://stackoverflow.com/questions/8120638/rdf-reading-parsing-errors