Question
I am trying to download the image at the following URL:
http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg
In the browser I can right-click the image and choose Save As, but it fails when I use urlretrieve like this:
import urllib
img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg'
urllib.urlretrieve( img_url, 'cover.jpg')
The saved file turns out to be HTML rather than a .jpg image, and I don't know why. Why doesn't my method work? Is there an option that mimics the browser's right-click Save As?
Answer 1:
You can use Requests. If you haven't installed it yet, run pip install requests.
Your method fails because the server redirects img_url to an HTML page (the page you just downloaded) when the request carries no Referer header.
The following code first captures the redirect URL, then sends it as the HTTP Referer header in a second request.
import requests
img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg'
r = requests.get(img_url, allow_redirects=False)  # do not follow the 302; just capture the redirect URL
headers = {}
headers['Referer'] = r.headers['location']  # the redirect target, e.g. 'http://upic.me/show/55132055'
r = requests.get(img_url, headers=headers)
filename = img_url.split('/')[-1]  # take the file name from `img_url`
with open(filename, 'wb') as fh:  # 'wb' writes in binary mode
    fh.write(r.content)
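To confirm that the second request really returned an image and not the HTML page again, the response can be checked before it is saved. This is a minimal sketch, not part of the original answer: looks_like_jpeg is a hypothetical helper name, and it assumes the file is a JPEG, whose data always starts with the bytes FF D8 FF.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # every JPEG file starts with these three bytes


def looks_like_jpeg(content_type, body):
    """Return True if a response looks like a JPEG rather than an HTML page.

    content_type is the value of the Content-Type header (may be empty),
    body is the raw response bytes. Hypothetical helper for illustration.
    """
    # The header may carry parameters, e.g. "text/html; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type.startswith("text/"):
        return False
    return body.startswith(JPEG_MAGIC)
```

With the response r from the code above, the call would be looks_like_jpeg(r.headers.get('Content-Type', ''), r.content); writing the file only when it returns True avoids saving an HTML page under a .jpg name.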
Answer 2:
Try this:
import urllib2
image = urllib2.urlopen('http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg').read()
f = open('some_name.jpg', 'wb')  # binary mode, so the image bytes are written unchanged
f.write(image)
f.close()
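Note that urllib2 exists only in Python 2, and the snippet above sends no Referer header, so it may still receive the HTML page described in Answer 1. Below is a minimal Python 3 sketch using only the standard library, assuming the server checks nothing beyond the Referer header. build_image_request and download_image are hypothetical helper names, and the referer value in the example call is the one quoted in Answer 1's comment, which may no longer be valid.

```python
import urllib.request


def build_image_request(img_url, referer):
    """Build a GET request for img_url that carries a Referer header."""
    return urllib.request.Request(img_url, headers={"Referer": referer})


def download_image(img_url, referer, filename):
    """Fetch img_url with the Referer header set and save the bytes to filename."""
    request = build_image_request(img_url, referer)
    with urllib.request.urlopen(request) as response:
        data = response.read()
    with open(filename, "wb") as fh:  # binary mode keeps the image bytes intact
        fh.write(data)
    return len(data)  # number of bytes written


# Example call (needs network access, so it is left commented out):
# download_image(
#     "http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg",
#     "http://upic.me/show/55132055",  # referer value taken from Answer 1's comment
#     "cover.jpg",
# )
```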
Source: https://stackoverflow.com/questions/29433699/try-to-scrape-image-from-image-url-using-python-urllib-but-get-html-instead