How to use wikipedia api if it exists? [closed]


You really need to spend some time reading the documentation, as it only took me a moment of looking to find this. :/ But out of sympathy, I'll provide a link that you can learn from.

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New_York_Yankees&rvprop=timestamp|user|comment|content

Those are the variables you will be looking to get. Your best bet is to know which page you are after and move the last part of the Wikipedia link into the titles parameter, i.e.:

http://en.wikipedia.org/wiki/New_York_Yankees [Take the part after the wiki/]

-->

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New_York_Yankees&rvprop=timestamp|user|comment|content

[Place it in the titles parameter of the GET request.]

The URL above could do with some tweaking to get the different sections you do or do not want, so read the documentation. :)
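
For example, here is a minimal sketch (my own addition, not part of the original answer) of building that URL from an arbitrary page title in JavaScript, keeping the same rvprop fields as above:

// Minimal sketch: build the revisions query URL for any page title.
// encodeURIComponent handles spaces and special characters in the title.
var title = "New_York_Yankees"; // the part of the /wiki/ URL after "wiki/"
var apiUrl = "http://en.wikipedia.org/w/api.php" +
    "?action=query&prop=revisions" +
    "&titles=" + encodeURIComponent(title) +
    "&rvprop=timestamp|user|comment|content" +
    "&format=json";
console.log(apiUrl);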

The answers here helped me arrive at a solution, but I discovered more info in the process which may be of advantage to others who find this question. I figure most people simply want to use the API to quickly get content off the page. Here is how I'm doing that:

Using Revisions:

//working url:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Threadless&rvprop=content&format=json&rvsection=0&rvparse=1

//Explanation
//Base Url:
http://en.wikipedia.org/w/api.php?action=query

//tell it to get revisions:
&prop=revisions

//define page titles, separated by pipes. In the example I used the t-shirt company Threadless
&titles=whatever|the|title|is

//specify that we want the page content
&rvprop=content

//I want my data in JSON, default is XML
&format=json

//lets you choose which section you want. 0 is the first one.
&rvsection=0

//tell Wikipedia to parse it into HTML for you
&rvparse=1
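
Putting those parameters together, here is a rough sketch of actually fetching that URL (my own addition, not part of the answer). It assumes an environment with fetch (modern browsers or Node 18+), adds origin=* so a browser can make the cross-site request, and relies on the legacy JSON shape where the parsed HTML sits in the revision's "*" field:

// Sketch: fetch the parsed first section of the Threadless article.
var revUrl = "https://en.wikipedia.org/w/api.php" +
    "?action=query&prop=revisions&titles=Threadless" +
    "&rvprop=content&format=json&rvsection=0&rvparse=1&origin=*";

fetch(revUrl)
    .then(function (response) { return response.json(); })
    .then(function (data) {
        // query.pages is an object keyed by page id, so grab the first key.
        var pages = data.query.pages;
        var firstPageId = Object.keys(pages)[0];
        // With rvparse=1, the HTML of section 0 is in the revision's "*" field.
        console.log(pages[firstPageId].revisions[0]["*"]);
    });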

Using Extracts (better/easier for what I'm doing)

//working url:
http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Threadless&format=json&exintro=1

//only explaining new parameters
//instead of revisions, we'll set prop=extracts
&prop=extracts

//if we just want the intro, we can use exintro. Otherwise it shows all sections
&exintro=1
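
A matching sketch for extracts (again my own, with the same fetch and origin=* assumptions); here the intro HTML comes back in the page object's extract field:

// Sketch: fetch just the intro extract of the Threadless article.
var extractUrl = "https://en.wikipedia.org/w/api.php" +
    "?action=query&prop=extracts&titles=Threadless" +
    "&format=json&exintro=1&origin=*";

fetch(extractUrl)
    .then(function (response) { return response.json(); })
    .then(function (data) {
        var pages = data.query.pages;
        var firstPageId = Object.keys(pages)[0];
        // The intro HTML is in the "extract" field of the page object.
        console.log(pages[firstPageId].extract);
    });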

All the info requires reading through the API documentation as was mentioned, but I hope these examples will help the majority of the people who come here for a quick fix.

drdaeman

See http://www.mediawiki.org/wiki/API

Specifically, for the English Wikipedia, the API is located at http://en.wikipedia.org/w/api.php

Antoine 'hashar' Musso

Have a look at the ApiSandbox at https://en.wikipedia.org/wiki/Special:ApiSandbox. It is a web frontend that lets you easily query the API: a few clicks will craft the URL for you and show you the API result.

It is a MediaWiki extension, enabled on all Wikipedia languages: https://www.mediawiki.org/wiki/Extension:ApiSandbox

Maksym Kozlenko

If you want to extract structured data from Wikipedia, you may consider using DBpedia: http://dbpedia.org/

It provides a means to query data by given criteria using SPARQL, and returns data parsed from Wikipedia infobox templates.

There are SPARQL libraries available for multiple platforms to make queries easier.
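
As a rough illustration (my own sketch, assuming DBpedia's public SPARQL endpoint at https://dbpedia.org/sparql and the dbo:abstract property; adjust to whatever terms you actually need), fetching the English abstract of a resource could look like this:

// Sketch: ask DBpedia for the English abstract of the New York Yankees resource.
var sparql =
    "SELECT ?abstract WHERE { " +
    "  <http://dbpedia.org/resource/New_York_Yankees> " +
    "  <http://dbpedia.org/ontology/abstract> ?abstract . " +
    "  FILTER (lang(?abstract) = 'en') " +
    "}";

var sparqlUrl = "https://dbpedia.org/sparql" +
    "?query=" + encodeURIComponent(sparql) +
    "&format=" + encodeURIComponent("application/sparql-results+json");

fetch(sparqlUrl)
    .then(function (response) { return response.json(); })
    .then(function (data) {
        // Standard SPARQL JSON results: results.bindings[n].<variable>.value
        console.log(data.results.bindings[0].abstract.value);
    });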

If you want to extract structured data from Wikipedia, you may also try http://www.wikidata.org/wiki/Wikidata:Main_Page
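
Wikidata exposes the same MediaWiki-style Action API; here is a small sketch (my own, assuming the wbsearchentities module and fetch) that looks up an entity by label:

// Sketch: search Wikidata for entities whose label matches "New York Yankees".
var wikidataUrl = "https://www.wikidata.org/w/api.php" +
    "?action=wbsearchentities" +
    "&search=" + encodeURIComponent("New York Yankees") +
    "&language=en&format=json&origin=*";

fetch(wikidataUrl)
    .then(function (response) { return response.json(); })
    .then(function (data) {
        // Each hit has an id (Q-number), a label, and a description.
        console.log(data.search[0].id, data.search[0].description);
    });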

Below is a working example that prints the first sentence from Wikipedia's New York Yankees page to your web browser's console:

<!DOCTYPE html>
<html>
    <head>
        <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
    </head>
    <body>
        <script>
            // dataType "jsonp" below makes jQuery append its own callback
            // parameter, so the URL does not need one.
            var wikiUrl = "http://en.wikipedia.org/w/api.php?action=opensearch&search=New_York_Yankees&format=json";

            $.ajax(wikiUrl, {
                dataType: "jsonp",
                success: function( wikiResponse ) {
                    // opensearch returns [query, titles, descriptions, urls],
                    // so [2][0] is the description of the first match.
                    console.log( wikiResponse[2][0] );
                }
            });
        </script>   
    </body>
</html>

http://en.wikipedia.org/w/api.php is the endpoint for your URL. You can see how to structure your URL by visiting http://www.mediawiki.org/wiki/API:Main_page

I used jsonp as the dataType to allow cross-site requests. More can be found here: http://www.mediawiki.org/wiki/API:Cross-site_requests
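
If you would rather avoid JSONP, the MediaWiki API also allows anonymous cross-site requests when you add origin=* to the query string. Here is a rough equivalent of the example above using plain fetch (my own sketch, not from the original answer):

// Sketch: same opensearch query, using CORS (origin=*) instead of JSONP.
var searchUrl = "https://en.wikipedia.org/w/api.php" +
    "?action=opensearch&search=New_York_Yankees&format=json&origin=*";

fetch(searchUrl)
    .then(function (response) { return response.json(); })
    .then(function (wikiResponse) {
        // opensearch returns [query, titles, descriptions, urls],
        // so [2][0] is the description of the first match.
        console.log(wikiResponse[2][0]);
    });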

Last but not least, make sure to reference the jQuery.ajax() API: http://api.jquery.com/jquery.ajax/

Wiki Parser converts Wikipedia dumps into XML. It is also quite fast. You can then use any XML processing tool to handle the data from the parsed Wikipedia articles.
