Check for `urllib.urlretrieve(url, file_name)` Completion Status

Submitted by 三世轮回 on 2019-12-05 03:05:08

Question


How do I check to see if urllib.urlretrieve(url, file_name) has completed before allowing my program to advance to the next statement?

Take for example the following code snippet:

import sys
import time
import traceback

import Image
from urllib import urlretrieve

# imgUrl is assumed to be defined earlier in the program.
try:
    print "Downloading gif....."
    urlretrieve(imgUrl, "tides.gif")
    # Allow time for image to download/save:
    time.sleep(5)
    print "Gif Downloaded."
except:
    print "Failed to Download new GIF"
    raw_input('Press Enter to exit...')
    sys.exit()

try:
    print "Converting GIF to JPG...."
    Image.open("tides.gif").convert('RGB').save("tides.jpg")
    print "Image Converted"
except Exception:
    print "Conversion FAIL:", sys.exc_info()[0]
    traceback.print_exc()

When the download of 'tides.gif' via urlretrieve(imgUrl, "tides.gif") takes longer than the time.sleep(seconds) delay, the file is left empty or incomplete, and Image.open("tides.gif") raises an IOError (because tides.gif has a size of 0 kB).

How can I check the status of urlretrieve(imgUrl, "tides.gif"), allowing my program to advance only after the statement has been successfully completed?


Answer 1:


Requests is nicer than urllib, but you should be able to do this to download the file synchronously:

import urllib
f = urllib.urlopen(imgUrl)
with open("tides.gif", "wb") as imgFile:
    imgFile.write(f.read())
# you won't get to this print until you've downloaded
# all of the image at imgUrl or an exception is raised
print "Got it!"

The downside is that this buffers the whole file in memory, so if you're downloading a lot of images at once you may end up using a lot of RAM. It's unlikely, but still worth knowing.
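
The memory concern can be avoided by copying the response to disk in chunks instead of calling read() once. The following is a minimal sketch for Python 3 (not the Python 2 used in this thread); the function name download and the 64 KiB chunk size are assumptions chosen for illustration.

```python
import shutil
import urllib.request


def download(url, filename, chunk_size=64 * 1024):
    """Copy the response body to `filename` in fixed-size chunks."""
    # urlopen() blocks until the response headers arrive; copyfileobj()
    # then blocks until the body is exhausted or an exception is raised,
    # so at most one chunk is held in memory at a time.
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return filename
```

As with the snippet above, the statement after download(imgUrl, "tides.gif") is reached only once the whole file is on disk or an exception has propagated.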




Answer 2:


I would use Python Requests (http://docs.python-requests.org/en/latest/index.html) instead of plain urllib2. Requests is synchronous by default, so it won't progress to the next line of code without getting your image first.
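
A rough sketch of what that could look like, written for Python 3 and assuming the third-party requests package is installed. The function name download, the optional session parameter, and the 30-second timeout are illustrative assumptions, not part of the original answer.

```python
def download(url, filename, session=None, chunk_size=64 * 1024):
    """Stream `url` to `filename`; returns only after the body is written."""
    if session is None:
        # Imported lazily so a caller-supplied session works without it.
        import requests  # third-party: pip install requests
        session = requests.Session()
    # stream=True defers reading the body until iter_content() is consumed.
    response = session.get(url, stream=True, timeout=30)
    try:
        response.raise_for_status()  # raise on HTTP 4xx/5xx responses
        with open(filename, "wb") as out:
            for chunk in response.iter_content(chunk_size=chunk_size):
                out.write(chunk)
    finally:
        response.close()
    return filename
```

Calling download(imgUrl, "tides.gif") before Image.open("tides.gif") would then need no time.sleep() in between, because the call either finishes writing the file or raises.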




Answer 3:


I found a similar question here: Why is "raise IOError("cannot identify image file")" showing up only part of the time?

To be more specific, look at the answer to the question. The user points to a couple of other threads that explain exactly how to solve the problem in multiple ways. The first one, which you may be interested in, includes a progress bar display.




Answer 4:


The selected answer doesn't work with big files. Here is the correct solution:

import urllib


def reporthook(count, block_size, total_size):
    # total_size is -1 when no Content-Length header is available, and
    # count * block_size can overshoot total_size on the last block.
    if total_size > 0 and count * block_size >= total_size:
        print 'Download completed!'

def save(url, filename):
    urllib.urlretrieve(url, filename, reporthook)
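
For reference, a Python 3 sketch of the same idea, using urllib.request.urlretrieve in place of the Python 2 urllib.urlretrieve. The progress list returned by save is an illustrative addition, not part of the original answer. The hook is only informational here: urlretrieve itself returns only after the file has been written.

```python
import urllib.request


def save(url, filename):
    """Download `url` to `filename` and collect the reported percentages."""
    progress = []  # percentages seen by the hook, kept for inspection

    def reporthook(block_count, block_size, total_size):
        if total_size <= 0:
            return  # size unknown, e.g. no Content-Length header
        # Clamp, because the last block usually overshoots the total size.
        done = min(block_count * block_size, total_size)
        progress.append(done * 100 // total_size)

    # urlretrieve() blocks until the body is written, and raises
    # urllib.error.ContentTooShortError if fewer bytes arrive than the
    # Content-Length header announced.
    path, _headers = urllib.request.urlretrieve(url, filename, reporthook)
    return path, progress
```

When the size is known, the last entry of progress is 100 once save returns, so the caller can convert the image immediately afterwards.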



Answer 5:


You can try the following:

import time

# Wait until the end of the download
valid = 0
while valid == 0:
    try:
        with open("tides.gif"):
            valid = 1
    except IOError:
        time.sleep(1)

print "Got it !"
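
Checking that the file can be opened only shows that it exists, and a file exists as soon as the download starts writing to it. One way to make existence a reliable completion signal is to download to a temporary name and rename it when finished. The sketch below assumes Python 3; the function name download_atomically and the ".part" suffix are hypothetical choices for illustration.

```python
import os
import urllib.request


def download_atomically(url, filename):
    """Download to a temporary name, then rename it onto `filename`."""
    partial = filename + ".part"  # hypothetical naming convention
    try:
        urllib.request.urlretrieve(url, partial)
        # The rename is atomic when both paths are on the same filesystem,
        # so `filename` never exists in a half-written state.
        os.replace(partial, filename)
    finally:
        # Clean up a leftover partial file if the download failed midway.
        if os.path.exists(partial):
            os.remove(partial)
    return filename
```

With this arrangement, a loop that waits for "tides.gif" to appear would only see the file after the download has finished.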


Source: https://stackoverflow.com/questions/11595054/check-for-urllib-urlretrieveurl-file-name-completion-status
