Unable to save image from web using urllib2

Submitted by 蓝咒 on 2019-12-23 19:59:16

Question


I want to save some images from a website using Python's urllib2, but when I run the code it saves something other than the images I expect.

This is my code:

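import urllib2
from bs4 import BeautifulSoup  # assumption: bs4; with BeautifulSoup 3 use "from BeautifulSoup import BeautifulSoup"

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'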
headers = { 'User-Agent' : user_agent }
url = "http://m.jaaar.com/"
r = urllib2.Request(url, headers=headers)
page = urllib2.urlopen(r).read()

soup = BeautifulSoup(page)
imgTags = soup.findAll('img')
imgTags = imgTags[1:]


for imgTag in imgTags:
    imgUrl = "http://www.jaaar.com" + imgTag['src']
    imgUrl = imgUrl[0:-10] + imgUrl[-4:]
    fileName = "khabarnak-" + imgUrl[-12:]
    print fileName

    imgData = urllib2.urlopen(imgUrl).read()
    print imgUrl

    output = open("C:\wamp\www\py\pishkhan\\" + fileName,'wb')
    output.write(imgData)
    output.close()

Any suggestions?


Answer 1:


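The site is returning a standard placeholder image because it detects that you are scraping it. Your page request sends a browser-like User-Agent header, but the image requests inside the loop do not. Use the same 'trick' of setting the headers when retrieving each image: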

imgRequest = urllib2.Request(imgUrl, headers=headers)
imgData = urllib2.urlopen(imgRequest).read()
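
For readers on Python 3, where urllib2 was merged into urllib.request, the same idea might look like the sketch below. The helper names build_request and save_image are hypothetical and not part of the original answer; the User-Agent string is the one from the question. This is only an illustration of attaching the header to the image request, not a guaranteed fix for every site.

```python
import urllib.request

# Same User-Agent string the question uses for the page request.
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request carrying the browser-like User-Agent header.

    Hypothetical helper: it only constructs the request, no network I/O.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def save_image(img_url, path):
    """Download img_url with the custom header and write the bytes to path.

    Hypothetical helper: returns the number of bytes written.
    """
    with urllib.request.urlopen(build_request(img_url)) as response:
        data = response.read()
    with open(path, "wb") as output:
        output.write(data)
    return len(data)
```

Inside the question's loop, the call would then be something like save_image(imgUrl, fileName), so that every image request carries the same headers as the page request.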


Source: https://stackoverflow.com/questions/14439809/unable-to-save-image-from-web-using-urllib2
