Wikipedia API: get parsed introduction only

Submitted by 岁酱吖の on 2019-12-21 22:55:55

Question


Using PHP, is there a nice way to get only the (parsed) introduction from a Wikipedia page?

I have two current methods:

  • The first is to call the API for the page, then call the wiki parser on the introduction I have pulled from that first request (two requests, and extracting the intro from the text isn't pretty either).
  • The second is to call the parser on the entire page and use XPath to retrieve every <p> tag before the table of contents.

With both methods I then have to re-parse the HTML to ensure the relevant links inside the introduction point to Wikipedia.

Neither is really ideal. Is there a better way?

  • http://www.mediawiki.org/wiki/API:Parsing_wikitext
  • http://en.wikipedia.org/w/api.php

Answer 1:


The action=parse API module accepts a section number parameter (section). The lead is section number 0.
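
As a rough illustration of the request this describes, here is a minimal sketch in Python (the same query parameters should carry over to PHP unchanged). The helper names, the User-Agent string, and the choice of prop=text and format=json are assumptions made for the example, not part of the original answer; the HTML is read from parse.text, which is an object with a "*" key in the default JSON format and a plain string under formatversion=2.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the question; the helper names below are hypothetical.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_lead_url(title, endpoint=API_ENDPOINT):
    """Build an action=parse URL that asks only for section 0 (the lead)."""
    params = {
        "action": "parse",
        "page": title,
        "section": 0,      # section 0 is the introduction
        "prop": "text",    # assumption: only the rendered HTML is wanted
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


def extract_html(response):
    """Pull the rendered HTML out of a decoded action=parse JSON response."""
    text = response["parse"]["text"]
    # Default JSON format nests the HTML under "*"; formatversion=2 returns a string.
    return text["*"] if isinstance(text, dict) else text


def fetch_lead_html(title):
    """Perform the request (needs network access) and return the lead HTML."""
    request = Request(
        build_lead_url(title),
        headers={"User-Agent": "lead-section-example/0.1"},  # placeholder value
    )
    with urlopen(request) as reply:
        return extract_html(json.load(reply))
```

Calling fetch_lead_html("Albert Einstein") would then return the parsed introduction in a single request. Links inside that HTML may still be relative, so the link rewriting mentioned in the question can still be needed.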



Source: https://stackoverflow.com/questions/5355155/wikipedia-api-get-parsed-introduction-only
