How to get the terminal leaves of a Wikipedia root category

Posted by 风流意气都作罢 on 2019-12-23 19:30:43

Question


I want to get only the leaves of a Wikipedia category, but I'm not sure how. I can get all of the subcategories with:

SELECT ?subcat WHERE  {
?subcat  skos:broader* category:Buildings_and_structures_in_France_by_city .
} 

This returns every category in the tree, including intermediate ones (such as Category:Buildings_and_structures_in_Antibes), but I want only the bottom leaves of the tree: categories that cannot be split any further. How can I do this?


Answer 1:


You should be able to simply filter out the values of ?subcat that are not terminal leaves:

select ?subcat where  {
  ?subcat skos:broader* category:Buildings_and_structures_in_France_by_city .
  filter not exists { [] skos:broader ?subcat }
} 

However, when I run that, I get no results. My guess is that this is one of the idiosyncrasies of Virtuoso (the SPARQL endpoint that DBpedia uses), but I'm not sure. Instead, we can write an equivalent query that, for each ?subcat, counts the categories that have it as their skos:broader, and selects only those for which that count is zero:

select distinct ?subcat where {
  ?subcat skos:broader* category:Buildings_and_structures_in_France_by_city .
  optional { ?subsubcat skos:broader ?subcat }
}
group by ?subcat
having (count(?subsubcat) = 0)

Source: https://stackoverflow.com/questions/26367211/how-to-get-the-terminal-leaves-of-a-wikipedia-root-category
