Question
I have recently been using the wikipedia module to pick a random Wikipedia page.
I do this with a very large list of words and the random.choice() function, like so:
import random
import wikipedia

# Read the word list and split it on whitespace
with open("words.txt", "r") as f:
    words = f.read().split()
text = random.choice(words)
string = random.choice(wikipedia.search(text))
p = wikipedia.page(string)
This works most of the time, but it occasionally fails with the following error:
Traceback (most recent call last):
  File "/home/will/google4.py", line 25, in <module>
    p = wikipedia.page(string)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 393, in __load
    raise DisambiguationError(getattr(self, 'title', page['title']), may_refer_to)
DisambiguationError: "The Scarf" may refer to:
The Scarf (film)
The Scarf (opera)
Scarf (disambiguation)
Arthur Stewart King Scarf
Is there any way I can work around this?
Answer 1:
You could catch the DisambiguationError and choose one of the suggested pages at random.
try:
    p = wikipedia.page(string)
except wikipedia.DisambiguationError as e:
    s = random.choice(e.options)
    p = wikipedia.page(s)
See the quickstart: http://wikipedia.readthedocs.io/en/latest/quickstart.html
Answer 2:
One obvious way would be to download a complete list of Wikipedia page names and use that instead of your word list. That would also be much kinder to Wikipedia's search engine, which you don't need in order to get a random page. Besides, if you want a uniformly random page, you mustn't use the search engine.
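As a rough sketch of the title-list idea, assuming the list has already been saved locally as a plain-text file with one title per line: the helper name random_title and the file name titles.txt are hypothetical, not part of the wikipedia package. It uses single-item reservoir sampling so that a very large list never has to fit in memory.

```python
import random


def random_title(path, rng=random):
    """Return one non-empty line from the file at `path`, chosen uniformly.

    Uses reservoir sampling with a reservoir of size 1, so the file is
    streamed rather than loaded whole. Returns None if no title is found.
    """
    chosen = None
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            title = line.strip()
            if not title:
                continue  # skip blank lines
            count += 1
            # Keep the new title with probability 1/count.
            if rng.randrange(count) == 0:
                chosen = title
    return chosen
```

The result of something like random_title("titles.txt") could then be passed to wikipedia.page() without going through the search engine at all.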
A less robust but perhaps easier fix would be to wrap the call in try/except, catch the DisambiguationError, and simply try again with a new word.
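The try-again approach might look roughly like this. The names fetch_with_retry and random_page_from_words are hypothetical helpers written for illustration; only wikipedia.search, wikipedia.page and wikipedia.DisambiguationError come from the question and the answer above. The attempt limit is an added assumption so that a bad word list cannot loop forever.

```python
import random


def fetch_with_retry(pick_title, fetch_page, retry_on, max_attempts=5):
    """Call fetch_page(pick_title()) until it succeeds or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return fetch_page(pick_title())
        except retry_on as err:
            last_error = err  # remember the failure, then draw a new title
    raise RuntimeError("no page after %d attempts" % max_attempts) from last_error


def random_page_from_words(words, max_attempts=5):
    """Wire the helper to the wikipedia package (needs network access)."""
    import wikipedia  # third-party package, imported lazily

    def pick_title():
        # random.choice raises IndexError if the search returned nothing;
        # that case is retried as well.
        return random.choice(wikipedia.search(random.choice(words)))

    return fetch_with_retry(
        pick_title,
        wikipedia.page,
        retry_on=(wikipedia.DisambiguationError, IndexError),
        max_attempts=max_attempts,
    )
```

Passing the picker and fetcher in as callables keeps the retry logic separate from the network calls, so it can be tried out with stand-in functions first.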
Answer 3:
Better yet, use the tool already at your disposal:
wikipedia.random(pages=1)
Get a list of random Wikipedia article titles.
Note
Random only gets articles from namespace 0, meaning no Category, User talk, or other meta-Wikipedia pages.
Keyword arguments:
pages - the number of random pages returned (max of 10)
(from https://wikipedia.readthedocs.io/en/latest/code.html#api)
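A small wrapper around that call might look like the sketch below. The helper name random_titles and its get_random parameter are hypothetical and exist only to make the example easy to try without network access. It also assumes that the call may hand back a single bare string rather than a list when pages=1, so the result is normalised to a list either way.

```python
def random_titles(n=1, get_random=None):
    """Return a list of n random article titles.

    `get_random` defaults to wikipedia.random; passing another callable
    is only meant for trying the helper out offline.
    """
    if get_random is None:
        import wikipedia  # third-party package, needs network access
        get_random = wikipedia.random
    result = get_random(pages=n)
    # Assumption: a single title may come back as a plain string.
    if isinstance(result, str):
        return [result]
    return list(result)
```

Since namespace 0 can still contain disambiguation pages, a title obtained this way may still need the try/except handling shown in the earlier answers when it is passed to wikipedia.page().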
Source: https://stackoverflow.com/questions/25946692/wikipedia-disambiguation-error