Extract statistical information from Wikipedia article

十年热恋 submitted on 2019-12-24 09:57:56

Question


I'm currently extracting data from DBpedia articles using SPARQLWrapper for Python, but I can't find out how to extract the number of watchers (and other statistical information) for a given article.

Is there an easy way to achieve this? I don't mind if it's through DBpedia or directly through Wikipedia (using wget, for example).

Thanks for any advice.


Answer 1:


Getting the number of watchers for an arbitrary article is supposed to be prohibited, because it would be considered a security leak if anyone could find unwatched pages. For example, only privileged users have access to Special:UnwatchedPages. There is a Toolserver tool (which has access to the database) that shows the number of watchers, but for the same reason it is restricted to pages with more than 30 watchers, at least for unauthenticated users.

The MediaWiki query API mostly exposes content and status information about articles, though you can also query and evaluate the public logs or revision histories to get statistical data about (public) user actions. For more statistics about the Wikimedia sites, have a look at Meta:Statistics, which lists various data sources (mostly http://stats.wikimedia.org/) and visualisations of them.
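
As a rough illustration of evaluating a revision history through the query API, here is a minimal Python sketch. It assumes the English Wikipedia endpoint, the `formatversion=2` JSON layout (where `pages` is a list), and a made-up User-Agent string; the helper names `build_revisions_url`, `edits_per_user` and `fetch_edit_counts` are hypothetical and only for illustration. It counts how many of the most recent edits each user made, which is one kind of public statistic you can derive yourself.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint: English Wikipedia. Other wikis expose the same API path.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_revisions_url(title, limit=50):
    """Build a query URL asking for the latest revisions of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user",  # only the fields needed for counting
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,  # assumed: 'pages' comes back as a list
    }
    return API_URL + "?" + urlencode(params)


def edits_per_user(response):
    """Count revisions per user in a decoded API response."""
    counts = Counter()
    for page in response.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            # The user name can be missing if it was hidden by an admin.
            counts[rev.get("user", "<hidden>")] += 1
    return counts


def fetch_edit_counts(title, limit=50):
    """Fetch the latest revisions over HTTP and count edits per user."""
    request = Request(
        build_revisions_url(title, limit),
        headers={"User-Agent": "stats-sketch/0.1 (example)"},  # placeholder
    )
    with urlopen(request) as resp:
        return edits_per_user(json.load(resp))
```

Calling `fetch_edit_counts("Python (programming language)")` would return a `Counter` mapping user names to edit counts for the most recent revisions. Longer histories need the API's continuation mechanism, which this sketch leaves out.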



Source: https://stackoverflow.com/questions/13964017/extract-statistical-information-from-wikipedia-article
