How to use wikipedia api if it exists? [closed]

Submitted by 空扰寡人 on 2019-11-26 10:07:08

Question


I'm trying to find out if there's a Wikipedia API (I think it is related to MediaWiki?).

If so, I would like to know how to ask Wikipedia for an article, for example the one about the New York Yankees.

What would the REST URL be for this example?

All the docs on this subject seem fairly complicated.


Answer 1:


You really need to spend some time reading the documentation; it only took me a moment to look it up and click through. :/ Out of sympathy, though, here is a link that you can hopefully learn from.

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New_York_Yankees&rvprop=timestamp|user|comment|content

Those are the parameters you will be looking to set. Your best bet is to know which page you are after and take its title from the Wikipedia link, i.e.:

http://en.wikipedia.org/wiki/New_York_Yankees [Take the part after the wiki/]

-->

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New_York_Yankees&rvprop=timestamp|user|comment|content

[Place it in the titles parameter of the GET request.]

The URL above can be tweaked to get the sections you do or do not want. So read the documentation :)
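The substitution described above can be sketched in a few lines of JavaScript. This is only an illustration: the helper name `apiUrlFromArticleUrl` is made up for this example, and it assumes the article URL has the usual `/wiki/<title>` path.

```javascript
// Hypothetical helper (not part of any library): turn a Wikipedia article
// URL into the matching api.php query URL.
function apiUrlFromArticleUrl(articleUrl) {
  const url = new URL(articleUrl);
  const marker = "/wiki/";
  if (!url.pathname.startsWith(marker)) {
    throw new Error("Not a /wiki/ article URL: " + articleUrl);
  }
  // Everything after /wiki/ is the page title. Decode it first so that
  // URLSearchParams can re-encode it consistently.
  const title = decodeURIComponent(url.pathname.slice(marker.length));
  const params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    titles: title,
    rvprop: "timestamp|user|comment|content",
  });
  return url.origin + "/w/api.php?" + params.toString();
}

console.log(apiUrlFromArticleUrl("http://en.wikipedia.org/wiki/New_York_Yankees"));
```

The pipes in `rvprop` come out percent-encoded as `%7C`, which the API accepts just like the literal `|`.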




Answer 2:


The answers here helped me arrive at a solution, but I discovered more info in the process which may be of advantage to others who find this question. I figure most people simply want to use the API to quickly get content off the page. Here is how I'm doing that:

Using Revisions:

//working url:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Threadless&rvprop=content&format=json&rvsection=0&rvparse=1

//Explanation
//Base Url:
http://en.wikipedia.org/w/api.php?action=query

//tell it to get revisions:
&prop=revisions

//define page titles separated by pipes. In the example I used the T-shirt company Threadless
&titles=whatever|the|title|is

//specify that we want the page content
&rvprop=content

//I want my data in JSON, default is XML
&format=json

//lets you choose which section you want. 0 is the first one.
&rvsection=0

//tell wikipedia to parse it into html for you
&rvparse=1
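
For anyone who prefers to assemble that URL in code, here is a rough JavaScript sketch. The function names are invented for this example, https is used in place of http, and the response handling assumes the legacy JSON shape (no `rvslots`, no `formatversion=2`), where the revision text sits under the `"*"` key.

```javascript
// Hypothetical helper: build the revisions query from the parameters above.
function buildRevisionsUrl(titles, section) {
  const params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    titles: titles.join("|"), // several titles are separated by pipes
    rvprop: "content",
    format: "json",
    rvsection: String(section),
    rvparse: "1",
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

// Hypothetical helper: pull the content of the first page's latest revision
// out of an already parsed JSON response. Returns null for a missing page.
function firstRevisionContent(response) {
  const pages = Object.values(response.query.pages);
  const revisions = pages[0].revisions;
  return revisions ? revisions[0]["*"] : null;
}

console.log(buildRevisionsUrl(["Threadless"], 0));
```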

Using Extracts (better/easier for what I'm doing):

//working url:
http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Threadless&format=json&exintro=1

//only explaining new parameters
//instead of revisions, we'll set prop=extracts
&prop=extracts

//if we just want the intro, we can use exintro. Otherwise it shows all sections
&exintro=1
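
The extracts query returns JSON in which each page is keyed by its numeric page ID, so the text is not at a fixed path. A small sketch of reading it, assuming the default (formatversion 1) response shape; `extractsByTitle` is a made-up name:

```javascript
// Hypothetical helper: map each returned page title to its extract.
// Pages that do not exist carry no "extract" field and map to null.
function extractsByTitle(response) {
  const result = {};
  for (const page of Object.values(response.query.pages)) {
    result[page.title] = page.extract ?? null;
  }
  return result;
}

// Example with a hand-written response of the same shape as the API's
// (the page ID and text here are placeholders, not real API output):
const sampleExtractResponse = {
  query: {
    pages: {
      "12345": { pageid: 12345, ns: 0, title: "Threadless", extract: "<p>intro text</p>" },
    },
  },
};
console.log(extractsByTitle(sampleExtractResponse));
```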

All the info requires reading through the API documentation as was mentioned, but I hope these examples will help the majority of the people who come here for a quick fix.




Answer 3:


See http://www.mediawiki.org/wiki/API

Specifically, for the English Wikipedia, the API is located at http://en.wikipedia.org/w/api.php




Answer 4:


Have a look at the ApiSandbox at https://en.wikipedia.org/wiki/Special:ApiSandbox. It is a web frontend for querying the API easily. A few clicks will craft the URL for you and show you the API result.

It is a MediaWiki extension, enabled on all Wikipedia language editions: https://www.mediawiki.org/wiki/Extension:ApiSandbox




Answer 5:


If you want to extract structured data from Wikipedia, you may consider using DBpedia: http://dbpedia.org/

It lets you query data by given criteria using SPARQL and returns data from parsed Wikipedia infobox templates.

SPARQL libraries are available for multiple platforms to make queries easier.
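
As a rough idea of what such a query looks like without any library, the sketch below builds a request URL for a SPARQL query. Everything specific in it is an assumption to check against the DBpedia documentation: the public endpoint at `https://dbpedia.org/sparql`, the `format` parameter value, and the `dbo:abstract` property. The function name is made up.

```javascript
// Hypothetical helper: build a URL that asks a DBpedia SPARQL endpoint for
// the English abstract of one resource.
function buildDbpediaAbstractUrl(resourceName) {
  const query = [
    "PREFIX dbo: <http://dbpedia.org/ontology/>",
    "SELECT ?abstract WHERE {",
    `  <http://dbpedia.org/resource/${resourceName}> dbo:abstract ?abstract .`,
    '  FILTER (lang(?abstract) = "en")',
    "}",
  ].join("\n");
  const params = new URLSearchParams({
    query,
    format: "application/sparql-results+json", // assumed to be accepted by the endpoint
  });
  return "https://dbpedia.org/sparql?" + params.toString();
}

console.log(buildDbpediaAbstractUrl("New_York_Yankees"));
```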




Answer 6:


If you want to extract structured data from Wikipedia, you may also try http://www.wikidata.org/wiki/Wikidata:Main_Page
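
Wikidata runs the same MediaWiki API, with extra modules such as `wbgetentities` that can look up an item by its English Wikipedia title. A minimal sketch, with made-up helper names and a hand-written sample response (the item ID in it is a placeholder, not the real one):

```javascript
// Hypothetical helper: build a wbgetentities URL for an enwiki page title.
function buildWikidataEntityUrl(enwikiTitle) {
  const params = new URLSearchParams({
    action: "wbgetentities",
    sites: "enwiki",
    titles: enwikiTitle,
    props: "labels|descriptions",
    languages: "en",
    format: "json",
  });
  return "https://www.wikidata.org/w/api.php?" + params.toString();
}

// Hypothetical helper: reduce the response to id, English label, and
// English description for each returned entity.
function summarizeEntities(response) {
  return Object.entries(response.entities).map(([id, entity]) => ({
    id,
    label: entity.labels?.en?.value ?? null,
    description: entity.descriptions?.en?.value ?? null,
  }));
}

const sampleEntityResponse = {
  entities: {
    Q0000: {
      labels: { en: { language: "en", value: "New York Yankees" } },
      descriptions: { en: { language: "en", value: "placeholder description" } },
    },
  },
};
console.log(buildWikidataEntityUrl("New York Yankees"));
console.log(summarizeEntities(sampleEntityResponse));
```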




Answer 7:


Below is a working example that prints the first sentence from Wikipedia's New York Yankees page to your web browser's console:

<!DOCTYPE html>
<html>
    <head>
        <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
    </head>
    <body>
        <script>
            // jQuery appends its own callback parameter for JSONP, so none is set here.
            var wikiUrl = "http://en.wikipedia.org/w/api.php?action=opensearch&search=New_York_Yankees&format=json";

            $.ajax(wikiUrl, {
                dataType: "jsonp",
                success: function( wikiResponse ) {
                    console.log( wikiResponse[2][0] );
                }
            });
        </script>   
    </body>
</html>

http://en.wikipedia.org/w/api.php is the endpoint for your URL. You can see how to structure your URL by visiting: http://www.mediawiki.org/wiki/API:Main_page

I used jsonp as the dataType to allow cross-site requests. More can be found here: http://www.mediawiki.org/wiki/API:Cross-site_requests

Last but not least, make sure to reference the jQuery.ajax() API: http://api.jquery.com/jquery.ajax/
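
The cross-site requests page linked above also describes a CORS route that avoids JSONP: adding `origin=*` to the query lets an anonymous request be made with plain `fetch`. A minimal sketch without jQuery follows; the function names are made up, and it assumes an environment that provides `fetch` (a current browser or a recent Node.js).

```javascript
// Hypothetical helper: opensearch URL with origin=* for anonymous CORS.
function buildOpenSearchUrl(searchTerm) {
  const params = new URLSearchParams({
    action: "opensearch",
    search: searchTerm,
    format: "json",
    origin: "*",
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

// The opensearch response is an array:
// [query, [titles], [descriptions], [urls]].
async function searchWikipedia(searchTerm) {
  const response = await fetch(buildOpenSearchUrl(searchTerm));
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  const [, titles, , urls] = await response.json();
  return titles.map((title, i) => ({ title, url: urls[i] }));
}

// Usage (performs a network request, so it is left commented out):
// searchWikipedia("New York Yankees").then(console.log);
```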




Answer 8:


Wiki Parser converts Wikipedia dumps into XML. It is also quite fast. You can then use any XML processing tool to handle the data from the parsed Wikipedia articles.



Source: https://stackoverflow.com/questions/964454/how-to-use-wikipedia-api-if-it-exists
