I'm trying to download an image using this code:
from urllib import urlretrieve
urlretrieve('http://gdimitriou.eu/wp-content/uploads/2008/04/google-image-
With wget, the download works:
wget http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg
But if you do the following:
from urllib import urlretrieve
urlretrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg',
'Zindagi1976.jpg')
You may not be able to download the image. This may be because Wikipedia refuses requests from robots or other unknown clients, which it identifies by the User-Agent header they send (robots.txt itself is only advisory). Try emulating a browser.
To do that, send the following as part of the request headers:
('User-agent',
 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')
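On Python 3, where urllib was split into submodules, a tuple of this shape can go into an opener's addheaders list. The following is only a minimal sketch under that assumption; the helper name make_opener is hypothetical, and the User-agent string is the one quoted above.

```python
import urllib.request

# The User-agent string quoted above, joined onto one line.
USER_AGENT = (
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) "
    "Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1"
)


def make_opener(user_agent=USER_AGENT):
    """Return an opener that sends the given User-agent (hypothetical helper)."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples; replacing it overrides
    # the default "Python-urllib/x.y" User-agent.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


# Usage (performs a network request, so it is left commented out):
# urllib.request.install_opener(make_opener())
# urllib.request.urlretrieve(url, filename)
```

Installing the opener makes later urlretrieve calls in the same process send the browser-like header.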
You can do something like this:
>>> from urllib import FancyURLopener
>>> class MyOpener(FancyURLopener):
...     version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
...
>>> myopener = MyOpener()
>>> myopener.retrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 'Zindagi1976.jpg')
('Zindagi1976.jpg', <httplib.HTTPMessage instance at 0x1007bfe18>)
This retrieves the file and saves it as Zindagi1976.jpg.
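The session above is Python 2. On Python 3, FancyURLopener lives in urllib.request and has been deprecated since Python 3.3, so a rough equivalent might pass the User-Agent header through a Request object instead. This is a sketch, not the approach from the answer itself; the function names build_request and download are made up for the example.

```python
import shutil
import urllib.request

# Same browser-like string as in the MyOpener class above.
USER_AGENT = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) "
    "Gecko/20071127 Firefox/2.0.0.11"
)


def build_request(url, user_agent=USER_AGENT):
    """Build a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, filename, user_agent=USER_AGENT):
    """Fetch url and write the response body to filename (hypothetical helper)."""
    request = build_request(url, user_agent)
    # Stream the response to disk instead of reading it all into memory.
    with urllib.request.urlopen(request) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return filename


# Usage (performs a network request, so it is left commented out):
# download('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg',
#          'Zindagi1976.jpg')
```

Setting the header per request avoids changing global state, which may matter if other code in the same process also uses urllib.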