How to get all links and their Wikidata IDs for a Wikipedia page?

纵饮孤独 submitted on 2019-12-04 15:12:47

To get all links from a Wikipedia page you have to use the Wikipedia API, and to get all properties of a Wikidata item you need the Wikidata API, so it is not possible to do both with a single query. However, the first part of your question is possible with one request. As for the second part, you did not say which information you need from Wikidata.

You can get the Wikidata ID and a lot of other information for every link on a Wikipedia page: coordinates, references, internal and external links, images, text content, contributors, history, page rights, categories, templates, and so on. Because the entry point is the Wikipedia page, the Wikipedia API alone is enough, combined with the generator feature of the API.

For example, this is how to get the Wikidata ID, a short intro text and the main image for the first 20 links on the Dolphin Wikipedia page (generator=links returns the page's internal wiki links):

https://en.wikipedia.org/w/api.php?action=query&generator=links&format=xml&redirects=1&titles=Dolphin&prop=pageprops|extracts|pageimages&gpllimit=20&ppprop=wikibase_item&exintro=1&exlimit=20&piprop=name&pilimit=20

Main query parameters:

  • action=query&format=xml&redirects=1&titles=Dolphin
  • generator=links - to get all page links (works together with gpllimit=20)
  • prop=pageprops|extracts|pageimages - what to get from the links

Properties:

  • pageprops - to get the Wikidata ID (works with ppprop=wikibase_item)
  • extracts - to get the intro text of each linked page (works with exintro=1 and exlimit=20)
  • pageimages - to get the main image (works with piprop=name and pilimit=20)
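
As a rough illustration, the same request could be assembled and read in Python along these lines. This is only a sketch: it assumes format=json instead of the format=xml used in the URL above, and the helper names build_query_url and wikidata_ids are made up for this example. It relies on the default JSON layout, where query.pages is an object keyed by page ID.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, limit=20):
    """Build the generator=links query described above (hypothetical helper)."""
    params = {
        "action": "query",
        "format": "json",  # assumption: JSON rather than the xml used above
        "redirects": 1,
        "titles": title,
        "generator": "links",
        "gpllimit": limit,
        "prop": "pageprops|extracts|pageimages",
        "ppprop": "wikibase_item",
        "exintro": 1,
        "exlimit": limit,
        "piprop": "name",
        "pilimit": limit,
    }
    # urlencode percent-encodes the "|" separators, which the API accepts.
    return API_URL + "?" + urlencode(params)


def wikidata_ids(response):
    """Map each linked page title to its Wikidata ID, or None if it has none."""
    pages = response.get("query", {}).get("pages", {})
    return {
        page["title"]: page.get("pageprops", {}).get("wikibase_item")
        for page in pages.values()
    }
```

The URL can be fetched with any HTTP client (for instance urllib.request.urlopen), and the decoded JSON passed to wikidata_ids. Links that point to missing pages have no pageprops, so they come back as None.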

In the same way you can get any other information available through the prop parameter.
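
The example query returns only the first 20 links. To go through all links of a page, the API's continuation mechanism can be used: when more results are available, the response contains a top-level continue object whose values are added to the next request. A minimal sketch, assuming JSON responses; fetch is a hypothetical callable (not part of any library) that takes a dict of query parameters, performs the HTTP request, and returns the decoded response:

```python
def iter_pages(fetch, params):
    """Yield page dicts from every batch, following 'continue' tokens.

    `fetch` is a hypothetical callable supplied by the caller: it takes a
    dict of query parameters and returns the decoded JSON response.
    """
    request = dict(params)
    while True:
        response = fetch(request)
        for page in response.get("query", {}).get("pages", {}).values():
            yield page
        if "continue" not in response:
            break
        # Carry every continuation token (e.g. gplcontinue) into the next request.
        request = {**params, **response["continue"]}


def collect_wikidata_ids(fetch, params):
    """Collect title -> Wikidata ID over all batches."""
    ids = {}
    for page in iter_pages(fetch, params):
        item = page.get("pageprops", {}).get("wikibase_item")
        # A page may show up in several batches; never overwrite a found ID with None.
        if item or page["title"] not in ids:
            ids[page["title"]] = item
    return ids
```

Passing fetch in as a parameter keeps the continuation logic independent of the HTTP client, so it can be tried out with canned responses before hitting the live API.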
