Question
Recently I have been trying to learn about the Semantic Web. For a project I need to retrieve data from a given DBpedia link, e.g. http://dbpedia.org/page/Berlin. But when I retrieve the data using java.net.URLConnection, I get the HTML page. How can I get the XML from the same link? I know that every DBpedia page has a link to download the XML, but that is not what I want to do. Thanks in advance.
Answer 1:
Note that the URI of the resource is actually http://dbpedia.org/resource/Berlin (with resource, not page). Ideally, you could request that URI with an Accept header of application/rdf+xml and get the RDF/XML representation of the resource. That's how the BBC publishes their data (e.g., see this answer), but DBpedia doesn't play quite so nicely. Even if you request application/rdf+xml, you end up getting a redirect. You can see this if you try with an HTTP client: e.g., using Advanced Rest Client in Chrome, the response is a 303 See Other redirect.
In a web browser, the same 303 See Other response code is what redirects you to the page version.
So the easiest way is to use the download links listed at the bottom of http://dbpedia.org/page/Berlin:
RDF ( N-Triples N3/Turtle JSON XML )
The URL of the last link is http://dbpedia.org/data/Berlin.rdf. Thus, you can get the RDF/XML by changing page or resource to data, and appending .rdf to the end of the URL. It's not the most ReSTful solution, but it seems to be what's available.
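The URL rewrite described above can be sketched in Java as follows. This is only a minimal sketch: the class name DbpediaRdfFetcher and its two methods are made up for illustration, and it assumes the /data/Name.rdf pattern holds for the resource you care about. Also note that HttpURLConnection does not follow redirects that switch between http and https, so if the server answers with such a redirect you would need to handle it yourself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any DBpedia or Java library.
class DbpediaRdfFetcher {

    // Rewrites .../page/Name or .../resource/Name to .../data/Name.rdf,
    // following the pattern of the "XML" download link on the page.
    static String toRdfXmlUrl(String dbpediaUrl) {
        String dataUrl = dbpediaUrl.replaceFirst("/(page|resource)/", "/data/");
        return dataUrl.endsWith(".rdf") ? dataUrl : dataUrl + ".rdf";
    }

    // Downloads the document at the given URL and returns it as a string.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        // The .rdf URL already selects RDF/XML; the header only makes the intent explicit.
        conn.setRequestProperty("Accept", "application/rdf+xml");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling DbpediaRdfFetcher.fetch(DbpediaRdfFetcher.toRdfXmlUrl("http://dbpedia.org/page/Berlin")) should then return the RDF/XML document instead of the HTML page, provided the endpoint is reachable.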
Answer 2:
A good way to access data from DBpedia is through SPARQL. You can use Apache Jena to run SPARQL queries against http://dbpedia.org/sparql
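As a rough illustration of querying the endpoint, here is a dependency-free sketch that uses plain java.net instead of Jena. It relies on the SPARQL protocol convention of sending the query as a URL-encoded query parameter in a GET request. The class DbpediaSparqlClient and its methods are hypothetical names chosen for this example, and the DESCRIBE query is just one assumed way to ask for the triples about a resource; with Jena you would use its own query API instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; it does not use Apache Jena.
class DbpediaSparqlClient {

    static final String ENDPOINT = "http://dbpedia.org/sparql";

    // Builds a DESCRIBE query, which asks the endpoint for an RDF graph about the resource.
    static String describeQuery(String resourceUri) {
        return "DESCRIBE <" + resourceUri + ">";
    }

    // Encodes the query into the "query" parameter of a GET request URL.
    static String buildRequestUrl(String endpoint, String query) {
        return endpoint + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    // Sends the query and returns the response body, requesting RDF/XML.
    static String run(String endpoint, String query) throws IOException {
        URI uri = URI.create(buildRequestUrl(endpoint, query));
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestProperty("Accept", "application/rdf+xml");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

For example, DbpediaSparqlClient.run(DbpediaSparqlClient.ENDPOINT, DbpediaSparqlClient.describeQuery("http://dbpedia.org/resource/Berlin")) sends the query to the endpoint; whether the response comes back as RDF/XML depends on the endpoint honoring the Accept header.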
Source: https://stackoverflow.com/questions/30279970/how-to-retrieve-xml-rdf-data-from-a-dbpedia-link-or-url