SPARQL queries with round brackets throw exception

情深已故 2021-01-18 08:29

I am trying to extract labels from DBpedia for some persons. This partially works, but I am stuck on the following problem. The following code works.



        
1 answer
  • 2021-01-18 09:10

    It took some copying and pasting to see what exactly was going on. I'd suggest that you put newlines in your query for easier readability. The query you're using is:

    PREFIX dbres: <http://dbpedia.org/resource/>
    SELECT * WHERE
    {
      dbres:??? <http://www.w3.org/2000/01/rdf-schema#label> ?o 
      FILTER (langMatches(lang(?o),"en"))
    }
    

    where ??? is being replaced by the contents of the string entity. There is no input validation here to ensure that the value of entity is legal at that position in the query. Based on your question, it sounds like entity contains William_H._Miller_(writer). Unescaped parentheses are not allowed in the local part of a prefixed name, so you're getting this malformed query:

    PREFIX dbres: <http://dbpedia.org/resource/>
    SELECT * WHERE
    {
      dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o 
      FILTER (langMatches(lang(?o),"en"))
    }
    

    You can paste that into the public DBpedia endpoint, and you'll get a similar parse error message:

    Virtuoso 37000 Error SP030: SPARQL compiler, line 6: syntax error at 'writer' before ')'
    
    SPARQL query:
    define sql:big-data-const 0 
    #output-format:text/html
    define sql:signal-void-variables 1 define input:default-graph-uri <http://dbpedia.org> PREFIX dbres: <http://dbpedia.org/resource/>
    SELECT * WHERE
    {
      dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o 
      FILTER (langMatches(lang(?o),"en"))
    }
    

    Rather than hitting DBpedia's endpoint with bad queries, you can use the SPARQL query validator, which reports for that query:

    Syntax error: Lexical error at line 4, column 34. Encountered: ")" (41), after : "writer"
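
    The missing input validation can be sketched without any library. The following is only an illustration: the class IriRefBuilder and the method toIriRef are hypothetical names, not part of Jena or of the original code. It writes the resource as a full IRI in angle brackets, where parentheses are legal, and rejects the characters that the SPARQL grammar forbids inside <...>.

```java
final class IriRefBuilder {
    // Characters the SPARQL grammar does not allow inside <...>,
    // in addition to control characters and the space (U+0000 to U+0020).
    private static final String ILLEGAL = "<>\"{}|^`\\";

    // Hypothetical helper: joins namespace and local name into an IRI
    // reference, or throws if the result could break out of the brackets.
    static String toIriRef(String namespace, String localName) {
        String iri = namespace + localName;
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            if (c <= 0x20 || ILLEGAL.indexOf(c) >= 0) {
                throw new IllegalArgumentException(
                        "Character not allowed in a SPARQL IRI at index " + i + ": " + iri);
            }
        }
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        String dbres = "http://dbpedia.org/resource/";
        // Parentheses are fine inside an IRI reference.
        System.out.println(toIriRef(dbres, "William_H._Miller_(writer)"));
    }
}
```

    The returned string can be placed where dbres:??? stood in the query. A value such as "x> ?p ?o } #" is rejected instead of silently changing the query.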

    In Jena, you can use the ParameterizedSparqlString to avoid these sorts of issues. Here's your example, reworked to use a parameterized string:

    // Jena 2.x package name; from Jena 3 onward the class lives in org.apache.jena.query.
    import com.hp.hpl.jena.query.ParameterizedSparqlString;
    
    public class PSSExample {
        public static void main( String[] args ) {
            // Create a parameterized SPARQL string for the particular query.
            // ?entity is a placeholder that is filled in below.
            final ParameterizedSparqlString queryString = new ParameterizedSparqlString(
                    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
                    "SELECT * WHERE\n" +
                    "{\n" +
                    "  ?entity rdfs:label ?o\n" +
                    "  FILTER (langMatches(lang(?o),\"en\"))\n" +
                    "}\n" );
    
            // Add the dbres prefix to it, for later use.
            queryString.setNsPrefix( "dbres", "http://dbpedia.org/resource/" );
    
            // Entity is the same. 
            final String entity = "William_H._Miller_(writer)";
    
            // Now retrieve the URI for dbres, concatenate it with entity, and use
            // it as the value of ?entity in the query.
            queryString.setIri( "?entity", queryString.getNsPrefixURI( "dbres" ) + entity );
    
            // Show the query.
            System.out.println( queryString.toString() );
        }
    }
    

    The output is:

    PREFIX dbres: <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT * WHERE
    {
      <http://dbpedia.org/resource/William_H._Miller_(writer)> rdfs:label ?o
      FILTER (langMatches(lang(?o),"en"))
    }
    

    You can run this query at the public endpoint and get the expected results. Notice that if you use an entity that doesn't need special escaping, e.g.,

    final String entity = "George_Washington";
    

    then the query output will use the prefixed form:

    PREFIX dbres: <http://dbpedia.org/resource/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT * WHERE
    {
      dbres:George_Washington rdfs:label ?o
      FILTER (langMatches(lang(?o),"en"))
    }
    

    This is very convenient, because you don't have to do any checking about whether your suffix, i.e., entity, has any characters that need to be escaped; Jena takes care of that for you.
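
    If Jena is not available and the prefixed form has to be kept, SPARQL 1.1 also allows certain punctuation in the local part of a prefixed name when it is preceded by a backslash. The following is a minimal sketch under stated assumptions: LocalNameEscaper and escapeLocalName are hypothetical names, only ASCII letters, digits and the underscore are passed through unescaped, and anything outside the escapable set is rejected. Whether a given endpoint accepts the escaped form depends on its SPARQL 1.1 support.

```java
final class LocalNameEscaper {
    // Punctuation that SPARQL 1.1 permits after a backslash in a local name.
    private static final String ESCAPABLE = "_~.-!$&'()*+,;=/?#@%";

    // Hypothetical helper: returns the local name with escapable punctuation
    // backslash-escaped, or throws for characters this sketch does not handle.
    static String escapeLocalName(String localName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < localName.length(); i++) {
            char c = localName.charAt(i);
            boolean asciiLetterOrDigit = (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9');
            if (asciiLetterOrDigit || c == '_') {
                out.append(c);                      // safe as-is
            } else if (ESCAPABLE.indexOf(c) >= 0) {
                out.append('\\').append(c);         // e.g. ( becomes \(
            } else {
                throw new IllegalArgumentException(
                        "Unsupported character at index " + i + ": " + localName);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints dbres:William_H\._Miller_\(writer\)
        System.out.println("dbres:" + escapeLocalName("William_H._Miller_(writer)"));
    }
}
```

    The sketch escapes every dot and hyphen, which is more than the grammar strictly requires but keeps the rule simple. A name such as George_Washington comes back unchanged.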
