wikipedia

How to get the full image comments using the wikipedia api?

ⅰ亾dé卋堺 submitted on 2019-12-24 03:18:19
Question: I'm grabbing some image metadata from the Wikipedia API but noticed the text can be truncated. On this page: http://en.wikipedia.org/w/api.php?action=query&prop=imageinfo&iiprop=comment&format=xml&titles=File:BrolinFoxFassbenderJonahHexJuly09.jpg I only see: {{OTRS pending|year=2009|month=August|day=16}} {{Information |Description={{en|Josh Brolin, Megan Fox, and Michael Fassbender promoting the 2010 film ''Jonah Hex'' at San Diego Comic-Con.}} |Source= http://www.flickr.com/photos
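The imageinfo comment field appears to carry only the upload log comment, which can be stored in shortened form; the full {{Information}} template lives in the file description page's own wikitext. Below is a minimal sketch of fetching that wikitext with Python's requests library (the helper name file_description_wikitext is made up for illustration):

import requests

API = "https://en.wikipedia.org/w/api.php"

def file_description_wikitext(title):
    # The full {{Information}} template is stored in the page text itself,
    # so ask for the latest revision's content instead of iiprop=comment.
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": 2,
    }
    page = requests.get(API, params=params).json()["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]

print(file_description_wikitext("File:BrolinFoxFassbenderJonahHexJuly09.jpg"))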

Antlrworks - extraneous input

送分小仙女□ submitted on 2019-12-24 01:37:21
Question: I am new to this, so I will need your help. I am trying to parse the Wikipedia dump, and my first step is to map each rule it defines into ANTLR. Unfortunately I have hit my first barrier: line 1:8 extraneous input ''''' expecting '\'\'' I don't understand what is going on; please lend me a hand. My code: grammar Test; options { language = Java; } parse : term+ EOF ; term : IDENT | '[[' term ']]' | '\'\'' term '\'\'' | '\'\'\'' term '\'\'\'' ; IDENT : ('a'..'z' |
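This is not an ANTLR fix, just a rough Python illustration of the underlying problem: with only '' and ''' defined as tokens, a greedy lexer splits a run of five apostrophes (wiki bold italic) as ''' followed by '', and the parser then reports the leftover quotes as extraneous input. A dedicated rule for the five-quote run is one way hand-rolled lexers sidestep the ambiguity; whether that is the right ANTLR fix here is an assumption.

import re

# Token rules tried in order, longest runs first. The BOLD_ITALIC rule is
# the hypothetical addition; without it, "'''''" is consumed as ''' + ''.
TOKENS = [
    ("BOLD_ITALIC", r"'{5}"),
    ("BOLD", r"'{3}"),
    ("ITALIC", r"'{2}"),
    ("IDENT", r"[a-z]+"),
    ("LBRACKET", r"\[\["),
    ("RBRACKET", r"\]\]"),
]

def tokenize(text):
    pos = 0
    while pos < len(text):
        if text[pos].isspace():          # skip whitespace between tokens
            pos += 1
            continue
        for name, pattern in TOKENS:
            m = re.match(pattern, text[pos:])
            if m:
                yield name, m.group()
                pos += m.end()
                break
        else:
            raise SyntaxError(f"no rule matches at {pos}: {text[pos:]!r}")

print(list(tokenize("'''''bold italic'''''")))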

Retrieve first paragraph of Wikipedia article

依然范特西╮ submitted on 2019-12-24 00:16:53
Question: I've been trying to understand the MediaWiki documentation for the past two days and I can't figure out how to retrieve the first paragraph of a Wikipedia article through the MediaWiki API. Could someone point me in the right direction? I am about to fall back on file_get_contents, but I'm confident there's a "cleaner" solution. Answer 1: Don't try to use the raw API; instead, use a client wrapper. Here's a long list to choose from, all for PHP: http://en.wikipedia.org/wiki/Wikipedia:PHP_bot_framework
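For Wikipedia specifically, the TextExtracts extension is available, so the lead section can be requested directly with prop=extracts. A minimal Python sketch (the helper name first_paragraph is made up; exintro returns the whole intro section, so the code splits on the first blank line to get just one paragraph):

import requests

API = "https://en.wikipedia.org/w/api.php"

def first_paragraph(title):
    # prop=extracts with exintro + explaintext returns the plain-text lead section
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,
        "explaintext": 1,
        "titles": title,
        "format": "json",
        "formatversion": 2,
    }
    page = requests.get(API, params=params).json()["query"]["pages"][0]
    return page["extract"].split("\n\n")[0]

print(first_paragraph("Stack Overflow"))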

How to get internal link from latest revision of a wikipedia page?

吃可爱长大的小学妹 submitted on 2019-12-23 02:57:07
Question: I'm trying to extract internal links from Wikipedia pages. This is the query I'm using: /w/api.php?action=query&prop=links&format=xml&plnamespace=0&pllimit=max&titles=pageTitle However, the result does not reflect what's on the wiki page. Take, for example, a random article here. There are only a dozen links on this page. However, when I make the query /w/api.php?action=query&prop=links&format=xml&plnamespace=0&pllimit=max&titles=Von_Mises%E2%80%93Fisher_distribution I get back 187 links. I
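A likely explanation for the mismatch is that prop=links counts every link on the rendered page, including ones transcluded from templates such as navboxes and infoboxes. One workaround, sketched below in Python, is to fetch the latest revision's wikitext and pull the [[...]] targets out of it directly (the regex is deliberately rough and skips namespaced links such as File: and Category:):

import re
import requests

API = "https://en.wikipedia.org/w/api.php"

def wikitext_links(title):
    # Fetch the article's own wikitext so template-transcluded links are excluded
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": 2,
    }
    page = requests.get(API, params=params).json()["query"]["pages"][0]
    text = page["revisions"][0]["slots"]["main"]["content"]
    # [[Target]] or [[Target|label]]; stop the capture at ']', '|' or '#'
    targets = re.findall(r"\[\[([^\]|#]+)", text)
    return [t.strip() for t in targets if ":" not in t]

print(wikitext_links("Von Mises–Fisher distribution"))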

Getting Wikipedia IDs in MQL

假装没事ソ submitted on 2019-12-23 01:58:40
Question: Freebase WEX dumps contain a wpid column in the freebase_wpid table corresponding to the page_id from the source MediaWiki database. This table provides a mapping between Wikipedia numeric article/redirect IDs and Freebase GUIDs (Globally Unique IDs). Using guids as foreign keys is deprecated in favour of mids for lots of good reasons, but that doesn't change the fact that guids are still used at a system level, so I'm going to call mid an accessor from here on. Using the mid accessor is flexible in MQL. One
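Freebase's MQL read service has since been retired, so the following Python snippet is only illustrative of the query shape for mapping a Wikipedia numeric page id (the WEX wpid) to a mid; the "/wikipedia/en_id" key namespace is an assumption about how WEX-era page ids were keyed:

import json

def mql_query_for_wpid(wpid):
    # Look up the topic whose English Wikipedia key carries this numeric page id
    return json.dumps([{
        "mid": None,
        "name": None,
        "key": [{
            "namespace": "/wikipedia/en_id",   # assumed namespace, see note above
            "value": str(wpid),
        }],
    }])

print(mql_query_for_wpid(21721040))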

“Partial match” table (aka “failure function”) in KMP (on wikipedia)

我怕爱的太早我们不能终老 submitted on 2019-12-22 18:41:19
Question: I'm reading the KMP algorithm on Wikipedia. There is one line of code in the "Description of pseudocode for the table-building algorithm" section that confuses me: let cnd ← T[cnd] It has the comment (second case: it doesn't, but we can fall back). I know we can fall back, but why T[cnd]? Is there a reason? It really confuses me. Here is the complete pseudocode for the table-building algorithm: algorithm kmp_table: input: an array of characters, W (the word to be analyzed) an array of
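For reference, here is a direct Python transcription of that version of the table-building pseudocode, with the confusing line commented: T[cnd] is the length of the longest proper prefix of the current candidate that is also its suffix, so when the match breaks it names the next shorter prefix that could still end at the current position.

def kmp_table(word):
    # Partial-match table: T[pos] is the length of the longest proper prefix
    # of word[:pos] that is also a suffix of word[:pos]; T[0] is -1 by convention
    # and T[1] stays 0 from the initialization.
    T = [0] * len(word)
    T[0] = -1
    pos, cnd = 2, 0
    while pos < len(word):
        if word[pos - 1] == word[cnd]:
            # first case: the candidate prefix keeps matching, so extend it
            cnd += 1
            T[pos] = cnd
            pos += 1
        elif cnd > 0:
            # second case: it doesn't, but we can fall back -- T[cnd] is the
            # longest border of the prefix we were tracking, i.e. the next
            # shorter prefix that can still be a suffix here, so retry there
            cnd = T[cnd]
        else:
            # third case: we have run out of candidates (cnd == 0)
            T[pos] = 0
            pos += 1
    return T

print(kmp_table("ABCDABD"))  # [-1, 0, 0, 0, 0, 1, 2]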

Which wiki markup parser does Wikipedia use?

ε祈祈猫儿з submitted on 2019-12-22 18:35:01
Question: None of these parsers is used by Wikipedia; none of them handles the wiki code correctly. Does anyone know what parser Wikipedia uses? Answer 1: Wikipedia uses MediaWiki, which has its own parser. Answer 2: Wikipedia runs on the MediaWiki engine, originally written precisely for use by Wikipedia. They implement their own parser. A more thorough description of the parser is available in the manual. Source: https://stackoverflow.com/questions/5956883/which-wiki-markup-parser-does-wikipedia-use

iframe wikipedia article without the wrapper

你。 submitted on 2019-12-22 11:31:22
Question: I want to embed a Wikipedia article into a page, but I don't want all the wrapper (navigation, etc.) that sits around the articles. I saw it done here: http://www.dayah.com/periodic/. Click on an element and an iframe is displayed that links to the article only (no wrapper). So how did they do that? It seems like JavaScript handles showing the iframe and constructing the href, but after browsing the page's JavaScript (http://www.dayah.com/periodic/Script/interactivity.js) I still can't figure out how
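One detail worth knowing (whether or not it is what that page actually does is not verified here): MediaWiki's index.php accepts action=render, which returns just the parsed article body with no skin, sidebar or navigation, and that URL can be used directly as an iframe src. A small Python sketch of the URL shape:

import requests

def render_only_url(title, lang="en"):
    # action=render returns the bare article HTML without the MediaWiki skin
    return f"https://{lang}.wikipedia.org/w/index.php?title={title}&action=render"

url = render_only_url("Hydrogen")
print(url)
print(requests.get(url).text[:200])   # article markup only, no wrapper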

Download images with MediaWiki API?

人走茶凉 submitted on 2019-12-22 06:28:35
Question: Is it possible to download images from Wikipedia with the MediaWiki API? Answer 1: No, it is not possible to get the images themselves via the API. Images in a MediaWiki installation are stored in plain folders, not in a database, and are not delivered dynamically (more information on that in Manual:Image administration). However, you can retrieve the URLs of those image files via the API. For example, see the API:Allimages list or the imageinfo property query modules. Then you can download the files from those URLs with your
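In other words: ask the API for the file's direct URL with prop=imageinfo&iiprop=url, then download that URL with any HTTP client. A minimal Python sketch (the helper name download_image is made up; Wikimedia asks clients to send a descriptive User-Agent):

import requests

API = "https://en.wikipedia.org/w/api.php"

def download_image(file_title, out_path):
    # Step 1: resolve the File: title to its direct file URL
    params = {
        "action": "query",
        "prop": "imageinfo",
        "iiprop": "url",
        "titles": file_title,
        "format": "json",
        "formatversion": 2,
    }
    page = requests.get(API, params=params).json()["query"]["pages"][0]
    url = page["imageinfo"][0]["url"]
    # Step 2: download the file itself, outside the API
    data = requests.get(url, headers={"User-Agent": "image-fetch-example/0.1"}).content
    with open(out_path, "wb") as f:
        f.write(data)

download_image("File:Example.jpg", "example.jpg")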

Retrieve a list of all Wikipedia languages programmatically

我与影子孤独终老i submitted on 2019-12-22 05:35:16
Question: I need to retrieve a list of all existing languages for a certain wiki project. For example, all Wikivoyage or all Wikipedia languages, just like on their landing pages. I'd prefer to do this via the MediaWiki API, if possible. Thanks for your time. Answer 1: Approach 3: Use an API in the Wikimedia wiki farm together with Extension:Sitematrix: https://commons.wikimedia.org/w/api.php?action=sitematrix&smtype=language While this returns all wikis the matrix knows about, it is easily filtered client-side
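A Python sketch of that client-side filtering (the project codes such as "wikivoyage", and "wiki" for Wikipedia, are how sitematrix labels each hosted site; the helper name wiki_languages is made up):

import requests

def wiki_languages(project="wikivoyage"):
    # action=sitematrix lists every language entry plus the sites it hosts
    r = requests.get(
        "https://commons.wikimedia.org/w/api.php",
        params={"action": "sitematrix", "smtype": "language", "format": "json"},
    ).json()
    langs = []
    for key, entry in r["sitematrix"].items():
        if not key.isdigit():              # skip the "count" and "specials" entries
            continue
        for site in entry.get("site", []):
            if site.get("code") == project and "closed" not in site:
                langs.append(entry["code"])
    return langs

print(wiki_languages("wikivoyage"))   # e.g. ['de', 'en', 'fr', ...]
print(wiki_languages("wiki"))         # Wikipedia languages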