Writing Efficient SPARQL Queries (MarkLogic)

Question

I am running various queries, and the one below takes about 4 minutes to complete. It runs over a fairly large data set (~200 million rows). Are there any ways to improve this query? I tried to optimize it by interleaving the necessary filters with the triple patterns, so that each filter is applied before the next hop.

Query Used

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" at "/MarkLogic/semantics.xqy";
import module namespace thsr="http://marklogic.com/xdmp/thesaurus" 
                             at "/MarkLogic/thesaurus.xqy";

let $query := sem:sparql(
'
PREFIX xs: <http://www.w3.org/2001/XMLSchema#>
PREFIX cts: <http://marklogic.com/cts#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX db: <http://dbpedia.org/resource/>
PREFIX onto: <http://dbpedia.org/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xdmp: <http://marklogic.com/xdmp#>

SELECT *
WHERE{
?people </posted> ?question .
FILTER (cts:contains(fn:doc(?people), 
  cts:and-query((
    cts:path-range-query("/people_data/dob",">=", xs:dateTime("1980-08-09T10:34:52")),
    cts:path-range-query("/people_data/dob","<=", xs:dateTime("2000-08-09T10:34:52")) 
    ))
)) .
?question </has_tags> ?tags.
FILTER(regex(?tags, ".*doc.*")) .
?question </has_comment> ?comment .
FILTER(cts:contains(fn:doc(?comment), cts:element-word-query(xs:QName("text"), "got"))) .
?commented_by_person </commented> ?comment .
?commented_by_person </dislikes> ?person_disliked .
?person_disliked </from> ?school 
}',
(),
(),
()
)

return (xdmp:elapsed-time())

Any advice, tips, or tricks would be greatly appreciated.

Source: https://stackoverflow.com/questions/56074076/writing-efficient-sparql-queries-marklogic

Tags

marklogic