Content of infobox of Wikipedia

轮回少年 2020-12-09 23:18

I need to get the content of an infobox of any movie. I know the name of the movie. One way is to get the complete content of a Wikipedia page and then parse it until I find

3 answers
  •  有刺的猬
    2020-12-09 23:38

    Another great MediaWiki parser is mwparserfromhell.

    In [1]: import mwparserfromhell
    
    In [2]: import pywikibot
    
    In [3]: enwp = pywikibot.Site('en','wikipedia')
    
    In [4]: page = pywikibot.Page(enwp, 'Waking Life')            
    
    In [5]: wikitext = page.get()               
    
    In [6]: wikicode = mwparserfromhell.parse(wikitext)
    
    In [7]: templates = wikicode.filter_templates()
    
    In [8]: templates?
    Type:       list
    String Form:[u'{{Use mdy dates|date=September 2012}}', u"{{Infobox film\n| name           = Waking Life\n| im <...> critic film|waking-life|Waking Life}}', u'{{Richard Linklater}}', u'{{DEFAULTSORT:Waking Life}}']
    Length:     31
    Docstring:
    list() -> new empty list
    list(iterable) -> new list initialized from iterable's items
    
    In [10]: templates[:2]
    Out[10]: 
    [u'{{Use mdy dates|date=September 2012}}',
     u"{{Infobox film\n| name           = Waking Life\n| image          = Waking-Life-Poster.jpg\n| image_size     = 220px\n| alt            =\n| caption        = Theatrical release poster\n| director       = [[Richard Linklater]]\n| producer       = [[Tommy Pallotta]]
    [[Jonah Smith]]
    Anne Walker-McBay
    Palmer West\n| writer = Richard Linklater\n| starring = [[Wiley Wiggins]]\n| music = Glover Gill\n| cinematography = Richard Linklater
    [[Tommy Pallotta]]\n| editing = Sandra Adair\n| studio = [[Thousand Words]]\n| distributor = [[Fox Searchlight Pictures]]\n| released = {{Film date|2001|01|23|[[Sundance Film Festival|Sundance]]|2001|10|19|United States}}\n| runtime = 101 minutes{{cite web |title=''WAKING LIFE'' (15) |url=http://www.bbfc.co.uk/releases/waking-life-2002-3|work=[[British Board of Film Classification]]|date=September 19, 2001|accessdate=May 6, 2013}}\n| country = United States\n| language = English\n| budget =\n| gross = $3,176,880{{cite web|title=''Waking Life'' (2001)|work=[[Box Office Mojo]] |url=http://www.boxofficemojo.com/movies/?id=wakinglife.htm|accessdate=March 20, 2010}}\n}}"]
    
    In [11]: infobox_film = templates[1]
    
    In [12]: for param in infobox_film.params: print param.name, param.value
    name Waking Life
    image Waking-Life-Poster.jpg
    image_size 220px
    alt
    caption Theatrical release poster
    director [[Richard Linklater]]
    producer [[Tommy Pallotta]]
    [[Jonah Smith]]
    Anne Walker-McBay
    Palmer West
    writer Richard Linklater
    starring [[Wiley Wiggins]]
    music Glover Gill
    cinematography Richard Linklater
    [[Tommy Pallotta]]
    editing Sandra Adair
    studio [[Thousand Words]]
    distributor [[Fox Searchlight Pictures]]
    released {{Film date|2001|01|23|[[Sundance Film Festival|Sundance]]|2001|10|19|United States}}
    runtime 101 minutes{{cite web |title=''WAKING LIFE'' (15) |url=http://www.bbfc.co.uk/releases/waking-life-2002-3|work=[[British Board of Film Classification]]|date=September 19, 2001|accessdate=May 6, 2013}}
    country United States
    language English
    budget
    gross $3,176,880{{cite web|title=''Waking Life'' (2001)|work=[[Box Office Mojo]] |url=http://www.boxofficemojo.com/movies/?id=wakinglife.htm|accessdate=March 20, 2010}}

    Don't forget that params are mwparserfromhell objects too!
