Query dbpedia sparql endpoint using dotnetRDF - RDFParseException

Submitted by ぃ、小莉子 on 2019-12-06 01:06:16

DBPedia has known encoding issues, so it may simply be that DBPedia is producing dud data.

To debug this further in dotNetRDF, wrap the code that invokes the query as follows:

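try
{
    // Options is the static settings class in the VDS.RDF namespace
    Options.HttpDebugging = true;      // log HTTP request/response details to the console
    Options.HttpFullDebugging = true;  // also dump the raw response body

    // Try your query here
}
finally
{
    // Always switch debugging back off, even if the query throws
    Options.HttpDebugging = false;
    Options.HttpFullDebugging = false;
}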

This will cause parsing to fail (with a different error), but it will dump the raw HTTP response to the console for debugging. If you can edit your question to include the content from around line 456 of the dump, people may be able to give you more help.

Edit

So as suspected the problem is indeed with DBPedia producing dud data, not in dotNetRDF itself.

When I downloaded the file you mentioned in Turtle format and tried to parse it, I got the same error message, which pertains to the following line:

ns6:Avalon_–_Spiel_um_dein_Leben ,

While at first glance that may look valid (since a simple hyphen - is allowed in Prefixed Names), the problem is that it is not a hyphen. It is in fact character code 8211 (or hex 2013, as AndyS mentions), and this is not in the acceptable range of characters for a prefixed name.

Btw I confirmed this with Jena's Turtle parser as well just to make sure it really wasn't a dotNetRDF issue.
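A quick way to find characters like this in a downloaded file is to scan it for anything outside plain ASCII and print the code point. The following is only a rough debugging sketch in plain .NET, not part of dotNetRDF; the class and method names are made up for illustration. Note that it is not a Turtle validity check, since many non-ASCII characters are perfectly legal in prefixed names. It just narrows down where to look.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not a dotNetRDF API.
public static class TurtleCharScan
{
    // Returns the 1-based line, 1-based column and Unicode code point
    // of every non-ASCII character found in the text.
    public static List<(int Line, int Column, int CodePoint)> FindNonAscii(string text)
    {
        var hits = new List<(int Line, int Column, int CodePoint)>();
        string[] lines = text.Split('\n');

        for (int i = 0; i < lines.Length; i++)
        {
            string line = lines[i];
            for (int j = 0; j < line.Length; j++)
            {
                char c = line[j];
                if (c <= 127) continue; // plain ASCII, nothing to report

                int codePoint = c;
                int column = j + 1;

                // Combine a valid surrogate pair into a single code point.
                if (char.IsHighSurrogate(c) && j + 1 < line.Length
                    && char.IsLowSurrogate(line[j + 1]))
                {
                    codePoint = char.ConvertToUtf32(c, line[j + 1]);
                    j++; // skip the low surrogate
                }

                hits.Add((i + 1, column, codePoint));
            }
        }
        return hits;
    }
}
```

Run against the line quoted above, this reports a single hit at column 12 with code point 8211, which is the EN DASH rather than an ASCII hyphen (code point 45).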

So basically the DBPedia data is broken. You can try forcing it to send you back RDF/XML or NTriples by setting the Accept header appropriately, but there is no guarantee that the data won't come back bad in those formats as well. I would suggest that you contact the DBPedia guys to report this as a bug: dbpedia-discussion@lists.sf.net
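As a rough illustration of the Accept header idea, here is a sketch that uses the plain .NET HttpClient types rather than dotNetRDF's own endpoint classes. The class name, the method name and the choice of media types are assumptions for the example: application/rdf+xml and application/n-triples are the registered media types for those formats, but whether the endpoint honours them is up to the server.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for illustration only; not a dotNetRDF API.
public static class RdfFormatRequest
{
    // Builds (but does not send) a GET request to a SPARQL endpoint that
    // asks for RDF/XML first and N-Triples as a lower-priority fallback.
    public static HttpRequestMessage Build(string endpoint, string query)
    {
        // The query text must be URL-encoded before it goes in the query string.
        var uri = new Uri(endpoint + "?query=" + Uri.EscapeDataString(query));
        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        request.Headers.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/rdf+xml"));
        request.Headers.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/n-triples", 0.8));

        return request;
    }
}
```

The request can then be sent with an HttpClient instance, for example client.SendAsync(RdfFormatRequest.Build(endpointUrl, constructQuery)), and the response body handed to the matching parser.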

Seeing line 456 would be useful. Try making the request with wget (it encodes URLs and curl doesn't, which makes wget easier to use from the command line).

Unicode codepoint 8211 is EN DASH (hex 2013).

LIMIT in a CONSTRUCT query applies to the number of rows from the graph pattern, not to the triples produced by the CONSTRUCT template. You may therefore get more triples than are covered by the SELECT ... LIMIT. Try a larger LIMIT in the SELECT and see if it breaks.
