Wikipedia disambiguation error

Submitted by 对着背影说爱祢 on 2020-02-24 11:04:45

Question


I have recently been using the wikipedia module to pick a random Wikipedia page.

I have been doing this with a very large list of words and the random.choice() function, like so:

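import random
import wikipedia

# Read the word list (whitespace-separated) and close the file afterwards
with open("words.txt", "r") as f:
    words = f.read().split()

text = random.choice(words)

# Pick one of the search results for that word at random
string = random.choice(wikipedia.search(text))

p = wikipedia.page(string)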

This works most of the time, but occasionally fails with the following error:

Traceback (most recent call last):
  File "/home/will/google4.py", line 25, in <module>
    p = wikipedia.page(string)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 393, in __load
    raise DisambiguationError(getattr(self, 'title', page['title']), may_refer_to)
DisambiguationError: "The Scarf" may refer to: 
The Scarf (film)
The Scarf (opera)
Scarf (disambiguation)
Arthur Stewart King Scarf  

Is there any way I can get around this?


Answer 1:


You could catch the DisambiguationError and choose one of the listed pages at random.

try:
    p = wikipedia.page(string)
except wikipedia.DisambiguationError as e:
    s = random.choice(e.options)
    p = wikipedia.page(s)

See the quickstart: http://wikipedia.readthedocs.io/en/latest/quickstart.html
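
One caveat with the snippet above: an option can itself be a disambiguation page (the traceback in the question lists "Scarf (disambiguation)"), so the second wikipedia.page() call may raise again. A minimal sketch of a bounded loop follows. The name load_page, the max_hops limit, and passing the module in as the wiki parameter are all my own assumptions for illustration, not part of the library.

```python
import random


def load_page(title, wiki, max_hops=5):
    """Follow disambiguation options at random until a page loads.

    `wiki` is expected to be the imported `wikipedia` module. It is a
    parameter only so the function can be tried with a stand-in object.
    `max_hops` is a hypothetical safety limit, not a library setting.
    """
    for _ in range(max_hops):
        try:
            return wiki.page(title)
        except wiki.DisambiguationError as e:
            # e.options holds the candidate titles; one of them may be
            # another disambiguation page, hence the loop.
            title = random.choice(e.options)
    raise RuntimeError("gave up after %d disambiguation hops" % max_hops)
```

With the real library this would be called as p = load_page(string, wikipedia).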




Answer 2:


One obvious way would be to download a complete list of Wikipedia page names and use that instead of your word list. That would also be much kinder to Wikipedia's search engine, which you don't need in order to get a random page. Besides, if you want a uniformly random page, you mustn't use the search engine at all.
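
As a rough sketch of that idea, assuming you already have the titles in a local text file with one title per line (the file layout and the function name pick_random_title are hypothetical), you can draw one title uniformly without loading the whole file into memory. This assumes Python 3.

```python
import random


def pick_random_title(path):
    """Return one line of `path` chosen uniformly at random, or None if empty.

    Uses single-item reservoir sampling so a very large title list
    never has to fit in memory.
    """
    chosen = None
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            # Replace the current pick with probability 1/n, which leaves
            # every line equally likely once the file is exhausted.
            if random.randrange(n) == 0:
                chosen = line.rstrip("\n")
    return chosen
```

The returned title can then be passed to wikipedia.page() in place of a search result.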

A less thorough but perhaps easier fix is to wrap the call in try/except, catch the DisambiguationError, and simply try again.
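
A minimal sketch of that retry approach, assuming the same word-list setup as in the question. The function name random_page_from_words, the max_tries limit, and passing the module in as wiki are my own choices for illustration.

```python
import random


def random_page_from_words(words, wiki, max_tries=10):
    """Draw words until one leads to a page that is not ambiguous.

    `wiki` is expected to be the imported `wikipedia` module.
    Returns None if no page could be loaded within `max_tries` draws.
    """
    for _ in range(max_tries):
        results = wiki.search(random.choice(words))
        if not results:
            continue  # the search found nothing for this word
        try:
            return wiki.page(random.choice(results))
        except wiki.DisambiguationError:
            continue  # ambiguous title: draw a new word and try again
    return None
```

With the real library: p = random_page_from_words(words, wikipedia).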




Answer 3:


Better yet, use the tool already at your disposal:

wikipedia.random(pages=1)

Get a list of random Wikipedia article titles.

Note

Random only gets articles from namespace 0, meaning no Category, User talk, or other meta-Wikipedia pages.

Keyword arguments:

    pages - the number of random pages returned (max of 10)

(from https://wikipedia.readthedocs.io/en/latest/code.html#api)
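
A short sketch of how this might replace the word list entirely. I am assuming that wikipedia.random(pages=1) returns a single title string rather than a list, and that a random title can still turn out to be a disambiguation page, so the call stays inside try/except. The name random_article and the max_tries limit are hypothetical.

```python
def random_article(wiki, max_tries=5):
    """Load a random article, redrawing if the title is ambiguous.

    `wiki` is expected to be the imported `wikipedia` module.
    Returns None if every draw within `max_tries` was ambiguous.
    """
    for _ in range(max_tries):
        # Assumption: with pages=1 this is one title string, not a list
        title = wiki.random(pages=1)
        try:
            return wiki.page(title)
        except wiki.DisambiguationError:
            continue  # drew a disambiguation page: ask for another title
    return None
```

With the real library: p = random_article(wikipedia).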



Source: https://stackoverflow.com/questions/25946692/wikipedia-disambiguation-error
