Question
I have this query that returns some DBpedia resources and their abstracts. How can I filter the results to get only the resources whose abstracts contain at least a certain number of occurrences of a particular word?
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
select distinct ?resource ?url ?resume where {
?resource rdfs:label ?Nom.
?resource foaf:isPrimaryTopicOf ?url.
?resource dbo:abstract ?resume.
FILTER langMatches( lang(?Nom), "EN" )
FILTER langMatches( lang(?resume), "EN" )
?Nom <bif:contains> "apple".
}
This is the new query, without the BIND function:
select ?resource ?url (strlen(replace(replace(?resume, 'Jobs', '_'), '[^_]', '')) as ?nbr)
where {
?resource rdfs:label ?Nom.
?resource foaf:isPrimaryTopicOf ?url.
?resource dbo:abstract ?resume.
FILTER langMatches( lang(?Nom), "EN" )
FILTER langMatches( lang(?resume), "EN" )
?Nom <bif:contains> "Apple".
# Repeat the counting expression here, since ?nbr is not in scope inside WHERE.
FILTER (strlen(replace(replace(?resume, 'Jobs', '_'), '[^_]', '')) >= 1)
}
Answer 1:
This won't be perfect, but it should work well enough for what you're trying to accomplish. You can use replace to replace every instance of the word you want to count with a single marker character (e.g., '_'). Then you can use replace again to replace everything except that character with the empty string. You are left with a string like '______', whose length is the number of times the word appeared in the original string. The count is approximate: an underscore already present in the abstract is counted too, and a pattern like '\\sthe\\s' only matches lowercase 'the' with whitespace on both sides. For instance, here's a query that counts 'the' in each abstract and keeps only those where 'the' appears at least five times.
select ?x ?nThe {
values ?x { dbr:Horse dbr:Cat dbr:Dog }
?x dbo:abstract ?abs
filter langMatches(lang(?abs),'en')
bind(strlen(replace(replace(?abs, '\\sthe\\s', '_'),'[^_]', '')) as ?nThe)
filter (?nThe >= 5)
}
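The same counting trick can be applied to the query from the question. The following is a minimal sketch rather than a tested solution: it assumes a Virtuoso endpoint such as DBpedia's (because of <bif:contains>), and the ?nJobs variable name and the threshold of 2 are illustrative choices, not part of the original question.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT DISTINCT ?resource ?url ?nJobs
WHERE {
  ?resource rdfs:label ?Nom ;
            foaf:isPrimaryTopicOf ?url ;
            dbo:abstract ?resume .
  FILTER langMatches(lang(?Nom), "en")
  FILTER langMatches(lang(?resume), "en")
  # Virtuoso-specific full-text match on the label, as in the question.
  ?Nom <bif:contains> "Apple" .
  # Turn every "Jobs" into "_", then delete everything that is not "_";
  # the length of what remains is the number of occurrences.
  BIND(strlen(replace(replace(?resume, "Jobs", "_"), "[^_]", "")) AS ?nJobs)
  # Assumed threshold: keep abstracts with at least two occurrences.
  FILTER(?nJobs >= 2)
}
```

Changing the constant in the last FILTER changes the minimum number of occurrences. The match on "Jobs" is case-sensitive, so "jobs" is not counted.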
Answer 2:
It's OK, I found another form for my query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
select distinct ?Nom ?resource ?url
where {
?resource rdfs:label ?Nom.
?resource foaf:isPrimaryTopicOf ?url.
?resource dbo:abstract ?resume.
FILTER langMatches( lang(?Nom), "EN" )
FILTER langMatches( lang(?resume), "EN" )
?Nom <bif:contains> "Apple".
FILTER regex(?resume,"Jobs")}
Thanks to everyone who tried to help me.
Source: https://stackoverflow.com/questions/33811501/select-dbpedia-resource-with-at-least-n-occurrences-of-seleted-word-in-abstract
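Note that regex(?resume, "Jobs") is satisfied by a single occurrence. If a minimum number of occurrences is needed with regex alone, one possible sketch is to repeat the word inside a quantified group. The threshold of 3 here is an assumption for illustration, the pattern may be slow on long abstracts, and it assumes the abstract contains no line breaks, since . does not match a newline by default.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT DISTINCT ?Nom ?resource ?url
WHERE {
  ?resource rdfs:label ?Nom ;
            foaf:isPrimaryTopicOf ?url ;
            dbo:abstract ?resume .
  FILTER langMatches(lang(?Nom), "en")
  FILTER langMatches(lang(?resume), "en")
  ?Nom <bif:contains> "Apple" .
  # "Jobs" followed by anything, three times in a row:
  # this can only match if "Jobs" occurs at least three times.
  FILTER regex(?resume, "(Jobs.*){3}")
}
```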