How to use urllib to download an image from the web

一向 2020-12-03 13:12

I'm trying to download an image using this code:

from urllib import urlretrieve
urlretrieve('http://gdimitriou.eu/wp-content/uploads/2008/04/google-image-


        
1 answer
  • 2020-12-03 13:57

    You can download the image with wget:

    wget http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg
    

    But if you do the following in Python:

    from urllib import urlretrieve
    urlretrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 
                'Zindagi1976.jpg')
    

    You may not be able to download the image. Wikipedia may have rules (robots.txt) that deny robots, bots, and other unknown clients, and urllib identifies itself with its own default User-Agent. Try emulating a browser.

    To do that, send the following as a request header:

    ('User-agent',
     'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) '
     'Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')
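
A header tuple of this shape can go into an opener's `addheaders` list. The following is a minimal sketch, assuming Python 3's `urllib.request` (the snippets in this answer use the Python 2 module layout). `BROWSER_UA` and `make_browser_opener` are names made up for illustration.

```python
import urllib.request

# Browser-like User-Agent string, copied from the header tuple in the answer.
BROWSER_UA = (
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) "
    "Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1"
)


def make_browser_opener(user_agent=BROWSER_UA):
    """Return an opener that sends `user_agent` instead of urllib's default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples applied to every request
    # made through this opener; assigning replaces the default entry.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


# Usage (performs a network request, so it is left commented out):
# urllib.request.install_opener(make_browser_opener())
# urllib.request.urlretrieve(
#     "http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg",
#     "Zindagi1976.jpg",
# )
```

Installing the opener globally means later `urlretrieve` and `urlopen` calls in the same process also send the browser-like header, which may or may not be what you want.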
    

    You can do something like this:

    >>> from urllib import FancyURLopener
    >>> class MyOpener(FancyURLopener):
    ...     version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
    ... 
    >>> myopener = MyOpener()
    >>> myopener.retrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 'Zindagi1976.jpg')
    ('Zindagi1976.jpg', <httplib.HTTPMessage instance at 0x1007bfe18>)
    

    This retrieves the file.
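
`FancyURLopener` comes from the Python 2 `urllib` module and is deprecated in Python 3. A rough Python 3 equivalent, sketched under the assumption that setting the `User-Agent` header is all the server requires, builds a `Request` and streams the response to disk. The function name `download`, the constant `DEFAULT_UA`, and the 30-second timeout are illustrative choices, not part of the original answer.

```python
import shutil
import urllib.request

# Illustrative default: the browser-like string used in the answer's opener.
DEFAULT_UA = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) "
    "Gecko/20071127 Firefox/2.0.0.11"
)


def download(url, filename, user_agent=DEFAULT_UA, timeout=30):
    """Save the resource at `url` to `filename`, sending a custom User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(filename, "wb") as out:
        # Stream the body to disk instead of reading it all into memory.
        shutil.copyfileobj(response, out)
    return filename


# Usage (performs a network request, so it is left commented out):
# download(
#     "http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg",
#     "Zindagi1976.jpg",
# )
```

Passing the header on the `Request` keeps the change local to this one call, rather than altering a process-wide opener.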
