Python download images with alternating variables

Posted by 邮差的信 on 2020-01-05 05:36:22

Question


I was trying to download images from URLs that change, but got an error.

import urllib.request

# title_2 is set earlier in the script (this is only a snippet).
url_image = "http://www.joblo.com/timthumb.php?src=/posters/images/full/" + str(title_2) + "-poster1.jpg&h=333&w=225"

# Send a browser-like User-Agent header with the request.
user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'
headers = {'User-Agent': user_agent}
req = urllib.request.Request(url_image, None, headers)

print(url_image)
#image, h = urllib.request.urlretrieve(url_image)
with urllib.request.urlopen(req) as response:
    the_page = response.read()

#print (the_page)

# Write the raw bytes to disk in binary mode.
with open('poster.jpg', 'wb') as f:
    f.write(the_page)

Traceback (most recent call last):
  File "C:\Users\luke\Desktop\scraper\imager finder.py", line 97, in <module>
    with urllib.request.urlopen(req) as response:
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 465, in open
    response = self._open(req, data)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 483, in _open
    '_open', req)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 443, in _call_chain
    result = func(*args)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 1268, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 1243, in do_open
    r = h.getresponse()
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 1174, in getresponse
    response.begin()
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 282, in begin
    version, status, reason = self._read_status()
  File "C:\Users\luke\AppData\Local\Programs\Python\Python35-32\lib\http\client.py", line 264, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine:


Answer 1:


My advice is to use urllib2. In addition, I've written a function that requests gzip encoding (to reduce bandwidth) and decompresses the response if the server supports it. I use it for downloading social media files, but it should work for anything.

I would try to debug your code, but since it's just a snippet (and the error messages are formatted badly), it's hard to know exactly where your error is occurring (it's certainly not line 97 in your code snippet).
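One way to narrow down where the failure happens is to wrap the request and report the exact URL whenever it fails. The following is only a sketch for Python 3, not a diagnosis of the error above: the `fetch` helper and its `opener` parameter are hypothetical names introduced here so the function can be exercised without a network connection.

```python
import http.client
import urllib.error
import urllib.request

USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"

def fetch(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the request fails.

    `opener` is a hypothetical hook (defaults to urllib.request.urlopen)
    that makes it possible to substitute a fake opener when testing.
    """
    req = urllib.request.Request(url, None, {"User-Agent": USER_AGENT})
    try:
        with opener(req) as response:
            return response.read()
    except (http.client.HTTPException, urllib.error.URLError) as exc:
        # BadStatusLine is a subclass of http.client.HTTPException.
        # Printing repr(url) makes stray spaces or odd characters visible.
        print("failed: %r (%s: %s)" % (url, type(exc).__name__, exc))
        return None
```

Calling `fetch(url_image)` for each generated URL and checking for `None` shows which title produces the failing request.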

This isn't as short as it could be, but it's clear and reusable. This is Python 2.7; it looks like you're using Python 3, in which case you can search for other questions that address how to port urllib2 code to Python 3 (where its functionality lives in urllib.request and urllib.error).

import urllib2
import gzip
from StringIO import StringIO

def download(url):
    """
    Download and return the file specified in the URL; attempt to use
    gzip encoding if possible.
    """
    request = urllib2.Request(url)
    # Ask the server for a gzip-compressed response to save bandwidth.
    request.add_header('Accept-Encoding', 'gzip')
    try:
        response = urllib2.urlopen(request)
    except Exception as e:
        # Re-raise with the URL included so failures are easy to trace.
        raise IOError("Download failed (%s): %s" % (url, e))
    payload = response.read()
    # Decompress only if the server actually sent gzip data.
    if response.info().get('Content-Encoding') == 'gzip':
        buf = StringIO(payload)
        f = gzip.GzipFile(fileobj=buf)
        payload = f.read()
    return payload

def save_media(filename, media):
    # Binary mode keeps the image bytes intact.
    with open(filename, "wb") as file_handle:
        file_handle.write(media)

title_2 = "10-cloverfield-lane"
media = download("http://www.joblo.com/timthumb.php?src=/posters/images/full/{}-poster1.jpg&h=333&w=225".format(title_2))
save_media("poster.jpg", media)
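Since the question uses Python 3, here is a rough port of the functions above, offered as a sketch rather than a tested fix for the original error. The helper names `poster_url`, `decode_payload`, and `main` are assumptions added for illustration; the URL pattern is the one from the question.

```python
import gzip
import urllib.request

# URL pattern taken from the question; {} is replaced by the title slug.
POSTER_URL = ("http://www.joblo.com/timthumb.php"
              "?src=/posters/images/full/{}-poster1.jpg&h=333&w=225")

def poster_url(title):
    """Build the poster URL for one title (hypothetical helper)."""
    return POSTER_URL.format(title)

def decode_payload(payload, content_encoding):
    """Decompress the body only if the server reported gzip encoding."""
    if content_encoding == "gzip":
        return gzip.decompress(payload)
    return payload

def download(url):
    """Download and return the bytes at `url`, requesting gzip if available."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    try:
        with urllib.request.urlopen(request) as response:
            payload = response.read()
            encoding = response.headers.get("Content-Encoding")
    except Exception as e:
        # Include the URL so a failing title is easy to identify.
        raise IOError("Download failed (%s): %s" % (url, e))
    return decode_payload(payload, encoding)

def save_media(filename, media):
    """Write the downloaded bytes to disk in binary mode."""
    with open(filename, "wb") as file_handle:
        file_handle.write(media)

def main(titles):
    """Save one poster per title, e.g. main(["10-cloverfield-lane"])."""
    for title in titles:
        save_media(title + "-poster.jpg", download(poster_url(title)))
```

Because the title is the only part of the URL that changes, passing a list of titles to `main` covers the "alternating variables" case from the question, with each poster written to its own file.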


Source: https://stackoverflow.com/questions/38508715/python-download-images-with-alernating-variables
