Retrieving image license and author information from Wikimedia Commons

情深已故 2021-02-19 07:04

I am trying to use the MediaWiki API for Wikimedia Commons at:

http://commons.wikimedia.org/w/api.php

It seems like the Commons API is very immature.

6 answers
  • 2021-02-19 07:26

    See http://www.mediawiki.org/wiki/API:Meta

    For each image, you can add the parameters 'meta=siteinfo' and 'siprop=rightsinfo' to the query (siprop selects which siteinfo properties are returned). The response then includes a rightsinfo block next to the image info. It describes the wiki's default content license rather than a per-file license.

    In your case of Brad Pitt it would be like:

    http://en.wikipedia.org/w/api.php?format=jsonfm&action=query&titles=File:Brad_Pitt_at_Incirlik2.jpg&prop=imageinfo&iiprop=url&meta=siteinfo&siprop=rightsinfo
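
A minimal sketch of how the query above could be assembled and its rightsinfo block read, using only the standard library. The helper names `build_query_url` and `read_rightsinfo` are hypothetical, and the response shape (a `rightsinfo` object with `text` and `url` keys under `query`) is an assumption about the `format=json` output, so treat this as an illustration rather than a reference implementation.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(file_title, api_url=API_URL):
    """Build a query URL asking for image info plus the site's rightsinfo."""
    params = {
        "action": "query",
        "format": "json",  # machine-readable variant of the jsonfm used above
        "titles": file_title,
        "prop": "imageinfo",
        "iiprop": "url",
        "meta": "siteinfo",
        "siprop": "rightsinfo",
    }
    # urlencode percent-encodes the title so it is safe to put in a URL.
    return api_url + "?" + urlencode(params)


def read_rightsinfo(response):
    """Return (license text, license URL) from a decoded JSON response.

    Assumed shape: response["query"]["rightsinfo"] == {"text": ..., "url": ...}.
    Missing keys yield None instead of raising.
    """
    info = response.get("query", {}).get("rightsinfo", {})
    return info.get("text"), info.get("url")
```

Calling `build_query_url("File:Brad_Pitt_at_Incirlik2.jpg")` yields a URL equivalent to the one above, and the decoded JSON response can then be passed to `read_rightsinfo`.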

  • 2021-02-19 07:26

    Late answer but you can request the "extmetadata" data with the following query:

    http://en.wikipedia.org/w/api.php?action=query&prop=imageinfo&iiprop=extmetadata&titles=File%3aBrad_Pitt_at_Incirlik2.jpg&format=json

    Look under imageinfo.extmetadata.UsageTerms, Artist, Credit, etc.
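
As a rough illustration of reading those fields, the sketch below fetches the response and pulls out UsageTerms, Artist and Credit. The function names, the User-Agent string, and the assumption that each extmetadata entry is an object with a `value` key are mine, not part of the answer above. Values such as Artist often contain HTML, so a crude tag stripper is included.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://commons.wikimedia.org/w/api.php"


def fetch_extmetadata_response(file_title, api_url=API_URL):
    """Fetch the raw JSON response for one file (performs a network request)."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "extmetadata",
        "titles": file_title,
    }
    # Hypothetical User-Agent; replace it with one identifying your own tool.
    request = Request(
        api_url + "?" + urlencode(params),
        headers={"User-Agent": "license-lookup-sketch/0.1"},
    )
    with urlopen(request) as response:
        return json.load(response)


def strip_tags(text):
    """Crudely remove HTML tags; not a full HTML parser."""
    return re.sub(r"<[^>]+>", "", str(text)).strip()


def summarize_license(response, fields=("UsageTerms", "Artist", "Credit")):
    """Return {field: plain-text value} for the first image in the response.

    Assumes each extmetadata entry looks like {"value": ...}. Returns an
    empty dict when the file is missing or carries no extmetadata.
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for info in page.get("imageinfo", []):
            meta = info.get("extmetadata", {})
            return {
                name: strip_tags(meta[name].get("value", ""))
                for name in fields
                if name in meta
            }
    return {}
```

Usage would be along the lines of `summarize_license(fetch_extmetadata_response("File:Brad_Pitt_at_Incirlik2.jpg"))`.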

  • 2021-02-19 07:26

    I've used Magnus' Commons API tool. It's not designed to be just dropped into a project, but if you copy the source of the wiki page it calls and cache it locally, then move the logic into a class you can make it more easily callable. Here's the source for Magnus' version. If you want the class I created from it let me know and I'll dig it out.

  • 2021-02-19 07:30

    From http://www.mediawiki.org/wiki/API_talk:Main_page#Image_license_information:

    "Is there a way to get the license of an image through the api?"

    "By category is probably easiest, assuming the site categorizes by license. There is no built in module though for license information." Splarka 08:45, 22 January 2010 (UTC)

    However, I find that querying categories returns nothing for many images even though they have a license specified. Maybe the best way is to parse the rendered HTML of the image page.

  • 2021-02-19 07:31

    Have a look at the MediaWiki API and try this function:

    import requests

    def extract_image_license(image_name):
        """Return the imageinfo list (uploader, URL, extmetadata) for a Commons file."""
        params = {
            'action': 'query',
            'titles': 'File:' + image_name,
            'prop': 'imageinfo',
            'iiprop': 'user|userid|canonicaltitle|url|extmetadata',
            'format': 'json',
        }
        # Passing params lets requests URL-encode the title and the '|' separators.
        result = requests.get('https://commons.wikimedia.org/w/api.php', params=params)
        result.raise_for_status()
        pages = result.json()['query']['pages']
        # The response is keyed by page ID; a single title yields a single entry.
        page = next(iter(pages.values()))
        # A missing file has no 'imageinfo' key, so fall back to an empty list.
        return page.get('imageinfo', [])
    

    Then call the function with the name of the image you want to query, for example:

    extract_image_license('Albert_Einstein_Head.jpg')
    
  • 2021-02-19 07:36

    You could try using Magnus Manske's Commons API tool on the Wikimedia Toolserver. It's not an official service, and the documentation seems to be rather sparse (that is to say, almost nonexistent), but the XML output seems pretty self-explanatory.

    I can't seem to find the source for Magnus's script anywhere, but I assume it extracts the licensing information from the categories the file belongs to. If you wanted, you could do that yourself: just fetch the list of categories and, if necessary, walk up the category tree until you find a license category you recognize. Alas, the tree-walking part requires either multiple API requests or a database of Commons categories (either live access on the Toolserver, or a reconstructed copy from the database dumps).
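
The category walk described above could be sketched roughly as follows. This is not Magnus's script: the function names, the `fetch_categories` callable and the `max_depth` limit are all assumptions made for illustration. In real use, `fetch_categories` would issue one `action=query&prop=categories` request per title (which is where the multiple API requests come from) and could reuse `categories_from_response` to decode the result.

```python
from collections import deque


def categories_from_response(response):
    """Collect category titles from a decoded prop=categories response.

    Assumes the format=json shape: query.pages.<id>.categories[].title.
    """
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        for category in page.get("categories", []):
            titles.append(category["title"])
    return titles


def find_license_category(start_title, fetch_categories, known_licenses, max_depth=3):
    """Breadth-first walk up the category tree from a file or category title.

    fetch_categories(title) must return the list of parent category titles.
    Returns the first title found in known_licenses, or None if nothing
    recognizable is reached within max_depth levels.
    """
    seen = {start_title}
    queue = deque([(start_title, 0)])
    while queue:
        title, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand nodes beyond the depth limit
        for parent in fetch_categories(title):
            if parent in known_licenses:
                return parent
            if parent not in seen:  # guard against category cycles
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None
```

The `seen` set matters because category graphs can contain cycles, and the depth limit keeps the number of requests bounded when no license category is ever reached.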

    Yes, I realize that this answer may seem unsatisfactory. The fact is that Magnus's script seems to be the closest currently existing thing to what you want, and even it's marked as experimental and incomplete. Basically, this is a problem waiting for someone to implement a (better) solution.
