wikipedia-api

How to get the result of “all pages with prefix” using the Wikipedia API?

Submitted on 2019-12-12 18:03:25
Question: I wish to use the Wikipedia API to extract the result of this page: http://en.wikipedia.org/wiki/Special:PrefixIndex when searching "something" on it, for example this: http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&prefix=tal&namespace=4 Then I would like to access each of the resulting pages and extract their information. What API call might I use?

Answer 1: You can use list=allpages and specify apprefix. For example: http://en.wikipedia.org/w/api.php?format=xml&action=query&list…
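A minimal sketch of that call in Python, assuming the requests library is available; the apprefix and apnamespace values mirror the Special:PrefixIndex example above (prefix "tal", namespace 4), with the prefix capitalized because stored titles start with an uppercase letter:

```python
import requests

# List pages whose titles start with "Tal" in namespace 4 (Wikipedia:),
# mirroring Special:PrefixIndex?prefix=tal&namespace=4.
params = {
    "action": "query",
    "list": "allpages",
    "apprefix": "Tal",
    "apnamespace": 4,
    "aplimit": 50,
    "format": "json",
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params)
for page in resp.json()["query"]["allpages"]:
    print(page["pageid"], page["title"])
```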

How to get coordinates from a Wikipedia page through the API?

Submitted on 2019-12-12 09:43:28
Question: I want to get the coordinates of a Wikipedia page through their API, passing the page title as the 'titles' parameter. I have searched SO for a solution, but it seems they are scraping the page and then extracting. Is it possible through their API?

Answer 1: You need to use the Wikipedia API. For your example with Kinkaku-ji the query will be: https://en.wikipedia.org/w/api.php?action=query&prop=coordinates&titles=Kinkaku-ji For more than one title, use a pipe to separate them: titles=Kinkaku-ji|Paris|…
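A short Python sketch of the same query, assuming the requests library; coordinates come back under each page's coordinates key (provided by the GeoData extension), and pages without stored coordinates simply omit the key:

```python
import requests

params = {
    "action": "query",
    "prop": "coordinates",
    "titles": "Kinkaku-ji|Paris",   # pipe-separated titles, as in the answer above
    "format": "json",
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params)
for page in resp.json()["query"]["pages"].values():
    for coord in page.get("coordinates", []):
        print(page["title"], coord["lat"], coord["lon"])
```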

Wikimedia API get generator metadata

Submitted on 2019-12-12 04:09:50
Question: I want to get pages from Wikimedia Commons, and it seems that I still have not understood how to use the Wikimedia API. I use the following query: https://commons.wikimedia.org/w/api.php?action=query&prop=imageinfo&format=json&iiprop=url|size|mime|mediatype|extmetadata&iiurlwidth=150&generator=search&gsrsearch=transformation&gsrnamespace=6&gsrlimit=9&gsroffset=0&gsrinfo=totalhits (see it in the API Sandbox). This works great, except that I don't get the gsrinfo / generator metadata, but I need the…
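For reference, the same generator query issued from Python (a sketch, assuming the requests library); it shows where the per-page imageinfo lands in the response. Whether the gsrinfo totals are exposed alongside a generator is exactly the open question here, so the sketch only prints the page-level data:

```python
import requests

params = {
    "action": "query",
    "generator": "search",
    "gsrsearch": "transformation",
    "gsrnamespace": 6,          # File: namespace on Commons
    "gsrlimit": 9,
    "prop": "imageinfo",
    "iiprop": "url|size|mime|mediatype|extmetadata",
    "iiurlwidth": 150,          # makes imageinfo include a 150px thumburl
    "format": "json",
}
resp = requests.get("https://commons.wikimedia.org/w/api.php", params=params).json()
for page in resp.get("query", {}).get("pages", {}).values():
    info = page["imageinfo"][0]
    print(page["title"], info["mime"], info["thumburl"])
```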

Fetch the description of an article from Wikipedia

Submitted on 2019-12-12 02:17:59
Question: I am trying to make an API call to Wikipedia through http://en.wikipedia.org/w/api.php?action=parse&page=Petunia&format=xml, but the result is full of HTML and CSS tags. Is there a way to fetch only plain text, without tags? Thanks!

Edit 1:

    $json = json_decode(file_get_contents('http://en.wikipedia.org/w/api.php?action=parse&page=Petunia&format=json'));
    $txt = strip_tags($json->text);
    var_dump($json);

Null is displayed.

Answer 1: The question was partially answered here: $url = 'http://en.wikipedia.org/w…
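One commonly used route to plain text (an alternative, not necessarily the one in the truncated answer) is the TextExtracts prop: action=query&prop=extracts with explaintext, which returns text with the markup already stripped. A minimal Python sketch, assuming the requests library:

```python
import requests

params = {
    "action": "query",
    "prop": "extracts",
    "explaintext": 1,      # return plain text instead of HTML
    "titles": "Petunia",
    "format": "json",
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params).json()
for page in resp["query"]["pages"].values():
    print(page["extract"][:500])   # first 500 characters of the plain-text extract
```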

Wikipedia API to get articles belonging to a category

Submitted on 2019-12-11 20:21:26
Question: I would like to get a number of pages belonging to a specific category, say sports or politics, and extract various sections from the pages, such as the abstract, the title, etc. Is there an API to do that? If not, are there any Wikipedia dumps organized by category? Thanks

Answer 1: You're looking for the categorymembers API. Notice that you will only get pages directly in that single category, not subcategories, and there are no intersection operators. You will probably want to use that…
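A minimal Python sketch of the categorymembers call mentioned in the answer, assuming the requests library and a hypothetical category name; continuation handles paging, and subcategories would have to be walked separately:

```python
import requests

def category_members(category, limit=500):
    """Yield titles of pages directly inside one category (no subcategory recursion)."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    while True:
        data = requests.get("https://en.wikipedia.org/w/api.php", params=params).json()
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        if "continue" not in data:
            break
        params.update(data["continue"])   # follow the continuation token

for title in category_members("Sports"):
    print(title)
```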

simplexml_load_file empty array

Submitted on 2019-12-11 17:51:25
Question: I try to load this document:

    $url = "http://en.wikipedia.org/w/api.php?action=query&titles=Electrophoresis&prop=langlinks&lllimit=500";

When I open it in a browser, everything is fine. When I do this:

    ini_set('user_agent', 'XX123456789 (localhost; myemailaddress)'); // sets info for authentication
    $content = file_get_contents($url);
    var_dump($content);

it returns the same XML document my browser shows. However, when I try

    $content_arrays = simplexml_load_file($content);
    echo '<pre>', print_r(…

fetch() with the Wikipedia API results in “TypeError: NetworkError when attempting to fetch resource.”

Submitted on 2019-12-11 14:38:02
Question:

    fetch('https://en.wikipedia.org/w/api.php?action=query&titles=Main%20Page&prop=revisions&rvprop=content&format=json')
      .then(function(response) {
        if (response.status !== 200) {
          console.log('Looks like there was a problem. Status Code: ' + response.status);
          return;
        }
        // Examine the text in the response
        response.json().then(function(data) {
          console.log(data);
        });
      })
      .catch(function(err) {
        document.write('Fetch Error :-S', err);
      });

The fetch address I'm using is listed here: https://www…

How to parse attribute values inside {{}} (curly braces) in an infobox

Submitted on 2019-12-11 11:05:08
Question: Within an infobox on Wikipedia, some attribute values are themselves inside curly braces {{}}. Sometimes they contain links as well. I need the values inside the braces, as they are displayed on the Wikipedia web page. I have read that these are templates. Can anyone give me a link or guide me on how to deal with this?

Answer 1: Double curly braces {{}} define a call to some kind of magic word, variable, parser function, or template. Help can be found on MediaWiki.org/.../Manual:Magic_words. The little lines that look like |…
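If the goal is to pull values out of those {{...}} template calls programmatically, one option (an assumption here, not something the answer mentions) is the mwparserfromhell Python library, which parses wikitext and exposes templates and their parameters; the sample wikitext below is invented for illustration:

```python
import mwparserfromhell

wikitext = "{{Infobox settlement | name = Chicago | population_total = {{formatnum:2695598}} }}"
code = mwparserfromhell.parse(wikitext)

# filter_templates() is recursive, so the nested {{formatnum:...}} call
# shows up as its own template in addition to appearing inside the
# infobox's parameter value.
for template in code.filter_templates():
    print("template:", str(template.name).strip())
    for param in template.params:
        print("  ", str(param.name).strip(), "=", str(param.value).strip())
```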

How to get Titles from a Wikipedia Page

Submitted on 2019-12-11 10:18:49
Question: Is there a direct API call where I can get the section titles from a Wikipedia page? For example, from http://en.wikipedia.org/wiki/Chicago I want to retrieve the following:

    1 History
    1.1 Rapid growth and development
    1.2 20th and 21st centuries
    2 Geography
    2.1 Topography
    2.2 Climate
    3 Cityscape
    3.1 Architecture
    and so on

I have looked at http://www.mediawiki.org/wiki/API:Lists/All, but couldn't find an action which gives me the above list from a wiki page.

Answer 1: What you want is not a list of pages, so…
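Since the answer is cut off, here is one route, offered as an assumption rather than the answerer's method: the listing above is the page's table of contents, and action=parse with prop=sections returns each heading with its number and text. A Python sketch, assuming the requests library:

```python
import requests

params = {
    "action": "parse",
    "page": "Chicago",
    "prop": "sections",
    "format": "json",
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params).json()
for section in resp["parse"]["sections"]:
    print(section["number"], section["line"])   # e.g. "2.1 Topography"
```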

How would you handle different formats of dates?

Submitted on 2019-12-11 09:13:27
Question: I have several different date formats, for example:

    27 - 28 August 663 CE
    22 August 1945
    19 May
    May 4 1945 – August 22 1945
    5/4/1945
    2-7-1232
    03-4-1020
    1/3/1 (year 1)
    09/08/0 (year 0)

Note they are all different formats, in different orders; some have two months, some only one. I tried to use moment.js with no results, and I also tried date.js, but no luck. I tried to do some splitting:

    dates.push({ Time : [] });
    function doSelect(text) {
        return $wikiDOM.find(".infobox th").filter(function() {…
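As an illustration only (this is not from the question, and it does not cover every format above), Python's dateutil parser handles many of the single-date strings once ranges and era suffixes are split off; the ambiguous numeric forms still need an explicit dayfirst choice, and unparseable pieces are simply reported as None:

```python
import re
from dateutil import parser

samples = ["22 August 1945", "May 4 1945", "5/4/1945", "2-7-1232",
           "27 - 28 August 663 CE", "May 4 1945 – August 22 1945"]

def rough_parse(text):
    # Split ranges on a dash surrounded by spaces, drop a trailing "CE"
    # era marker, then let dateutil guess the rest of the format.
    parts = re.split(r"\s[–-]\s", text)
    results = []
    for part in parts:
        part = part.replace("CE", "").strip()
        try:
            results.append(parser.parse(part, dayfirst=True))
        except (ValueError, OverflowError):
            results.append(None)   # format not recognised by this sketch
    return results

for s in samples:
    print(s, "->", rough_parse(s))
```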