So I'm trying to make a Python script that downloads webcomics and puts them in a folder on my desktop. I've found a few similar programs on here that do something similar
It's easiest to just use .read() to read the partial or entire response, then write it into a file you've opened in a known good location.
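As a rough Python 3 sketch of that idea: the helper below reads a response in pieces and writes each piece to a file you opened yourself. The names `save_response`, `download` and `chunk_size` are made up for illustration; any object with a `.read(n)` method, such as the one returned by `urllib.request.urlopen()`, should work.

```python
import urllib.request


def save_response(response, file_path, chunk_size=64 * 1024):
    """Copy a file-like response into file_path chunk by chunk.

    Returns the number of bytes written.
    """
    total = 0
    with open(file_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)  # partial read
            if not chunk:                      # empty bytes means end of data
                break
            out.write(chunk)
            total += len(chunk)
    return total


def download(url, file_path):
    """Hypothetical wrapper: open the URL and stream it to file_path."""
    with urllib.request.urlopen(url) as response:
        return save_response(response, file_path)
```

Reading in chunks keeps memory use flat for large files; for small images a single `response.read()` does the same job.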
I found this answer and edited it to be more reliable:
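# Python 2 code: urllib.urlopen() and headers.maintype do not exist in Python 3.
import os
import urllib

def download_photo(self, img_url, filename):
    try:
        image_on_web = urllib.urlopen(img_url)
        # Only save responses whose Content-Type is image/*.
        if image_on_web.headers.maintype == 'image':
            buf = image_on_web.read()
            # DOWNLOADED_IMAGE_PATH is assumed to be a constant defined elsewhere.
            path = os.getcwd() + DOWNLOADED_IMAGE_PATH
            file_path = "%s%s" % (path, filename)
            downloaded_image = open(file_path, "wb")
            downloaded_image.write(buf)
            downloaded_image.close()
            image_on_web.close()
        else:
            return False
    except Exception:
        return False
    return True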
With this, you never save anything that is not an image, and no exception escapes while downloading.
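The snippet above relies on the Python 2 `urllib.urlopen` API. A rough Python 3 sketch of the same content-type check might look like the following. The names `is_image` and `download_image`, and the `opener` parameter (added only so the function can be tried without a network), are assumptions for illustration.

```python
import urllib.request


def is_image(headers):
    """True if the response headers declare a Content-Type of image/*."""
    return headers.get_content_maintype() == "image"


def download_image(img_url, file_path, opener=urllib.request.urlopen):
    """Save img_url to file_path only if the server says it is an image.

    Returns True on success, False for non-image responses or I/O errors.
    """
    try:
        with opener(img_url) as response:
            if not is_image(response.headers):
                return False
            data = response.read()
        with open(file_path, "wb") as out:
            out.write(data)
    except OSError:  # urllib.error.URLError is a subclass of OSError
        return False
    return True
```

Catching `OSError` instead of using a bare `except` still covers network and file errors, but lets programming mistakes such as a `NameError` surface.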
Aside from suggesting you read the docs for retrieve() carefully (http://docs.python.org/library/urllib.html#urllib.URLopener.retrieve), I would suggest actually calling read() on the content of the response, and then saving it into a file of your choosing rather than leaving it in the temporary file that retrieve() creates.
Python 3 version of @DiGMi's answer:
from urllib import request
f = open('00000001.jpg', 'wb')
f.write(request.urlopen("http://www.gunnerkrigg.com/comics/00000001.jpg").read())
f.close()
What about this:
import os
import urllib.error
import urllib.request

def from_url(url, filename=None):
    '''Store the url content to filename'''
    if not filename:
        # Default to the last path component of the URL.
        filename = os.path.basename(os.path.realpath(url))
    req = urllib.request.Request(url)
    try:
        response = urllib.request.urlopen(req)
    except urllib.error.URLError as e:
        if hasattr(e, 'reason'):
            print('Fail in reaching the server -> ', e.reason)
            return False
        elif hasattr(e, 'code'):
            print('The server couldn\'t fulfill the request -> ', e.code)
            return False
    else:
        # Runs only when urlopen() raised no exception.
        with open(filename, 'wb') as fo:
            fo.write(response.read())
            print('Url saved as %s' % filename)
        return True

def main():
    test_url = 'http://cdn.sstatic.net/stackoverflow/img/favicon.ico'
    from_url(test_url)

if __name__ == '__main__':
    main()
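The default file name above comes from `os.path.realpath(url)`, which treats the URL as if it were a local path. One possible alternative, sketched here with the hypothetical helper `filename_from_url` and a made-up `default` fallback, is to parse the URL first so that query strings and trailing slashes do not end up in the name:

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download"):
    """Return the last path component of url, or default if there is none."""
    path = urlparse(url).path                 # drops scheme, host, query, fragment
    name = posixpath.basename(unquote(path))  # URL paths always use "/"
    return name or default


print(filename_from_url("http://cdn.sstatic.net/stackoverflow/img/favicon.ico"))
```

The result could then be passed as the `filename` argument of `from_url`.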
A simpler solution might be (Python 3):
import urllib.request
import os

os.chdir("D:\\comic")  # your path
i = 1
s = "00000000"
while i < 1000:
    try:
        # Left-pad the comic number with zeros to eight digits, e.g. 00000001.jpg
        urllib.request.urlretrieve("http://www.gunnerkrigg.com//comics/" + s[:8-len(str(i))] + str(i) + ".jpg", str(i) + ".jpg")
    except Exception:
        print("not possible" + str(i))
    i += 1
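The zero-padded numbering used in the loop above can also be written with `str.zfill`, which may be easier to read. This is only a sketch: `BASE_URL`, `comic_filename` and `comic_url` are illustrative names, and the base URL is copied as written in the answer above.

```python
BASE_URL = "http://www.gunnerkrigg.com//comics/"  # copied from the loop above


def comic_filename(number, width=8):
    """Zero-pad number to width digits and add the .jpg extension."""
    return str(number).zfill(width) + ".jpg"


def comic_url(number):
    """Build the full URL for a given comic number."""
    return BASE_URL + comic_filename(number)


print(comic_url(1))
```

Inside the loop, `urllib.request.urlretrieve(comic_url(i), str(i) + ".jpg")` would then fetch the same files.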