Why is Python print delayed?

Posted by 巧了我就是萌 on 2020-01-17 02:53:10

Question


I am trying to download a file using requests and print a dot every time 100 KB of the file is retrieved, but all the dots are printed at the end. See the code:

with open(file_name,'wb') as file:
    print("begin downloading, please wait...")
    respond_file = requests.get(file_url,stream=True)
    size = len(respond_file.content)//1000000

    #the next line will not be printed until file is downloaded
    print("the file size is "+ str(size) +"MB")
    for chunk in respond_file.iter_content(102400):
        file.write(chunk)
        #print('',end='.')
        sys.stdout.write('.')
        sys.stdout.flush()
    print("")

Answer 1:


You are accessing respond_file.content here:

size = len(respond_file.content)//1000000

Accessing that property forces the whole response to be downloaded, and for large responses this takes some time. Use int(respond_file.headers['content-length']) instead:

size = int(respond_file.headers['content-length']) // 1000000

The Content-Length header is provided by the server and since it is part of the headers you have access to that information without downloading all of the content first.

If the server chooses to use Transfer-Encoding: chunked to stream the response, no Content-Length header has to be set; you may need to take that into account:

content_length = respond_file.headers.get('content-length', None)
size_in_kb = '{}KB'.format(int(content_length) // 1024) if content_length else 'Unknown'
print("the file size is", size_in_kb)

where the size in kilobytes is calculated by dividing the length by 1024, not 1 million.
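
As a rough sketch of that fallback, the helper below turns a raw Content-Length header value into a display string. The name describe_size is hypothetical (it is not part of requests), and it assumes you pass it the result of respond_file.headers.get('content-length'), which is either a string or None.

```python
def describe_size(content_length):
    """Return a human-readable size for a raw Content-Length header value.

    `content_length` is the header string, or None when the server
    omitted it (for example with Transfer-Encoding: chunked).
    """
    if content_length is None:
        return "unknown"
    try:
        size = int(content_length)
    except ValueError:
        # A malformed header is treated the same as a missing one.
        return "unknown"
    if size < 0:
        return "unknown"
    if size < 1024:
        return "{} B".format(size)
    if size < 1024 * 1024:
        return "{:.1f} KB".format(size / 1024)
    return "{:.1f} MB".format(size / (1024 * 1024))
```

With the code from the question, this could be called as print("the file size is", describe_size(respond_file.headers.get('content-length'))).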

Alternatively, ask for the size in a separate HEAD request (only fetching the headers):

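# HEAD fetches only the headers; requests.head() does not follow redirects by default
head_response = requests.head(file_url, allow_redirects=True)
size = int(head_response.headers.get('content-length', 0))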



Answer 2:


This should work the way you expect. Taking the length of respond_file.content is not what you want, because it downloads the whole body first. Check the content-length header instead.

Note: I changed the code to display KB instead (for the purposes of testing).

import requests
import sys

file_url = "https://github.com/kennethreitz/requests/archive/master.zip"
file_name = "out.zip"

with open(file_name,'wb') as file:
    print("begin downloading, please wait...")
    respond_file = requests.get(file_url,stream=True)
    size = int(respond_file.headers['content-length'])//1024

    # the next line now prints right away: with stream=True only the headers have been fetched so far
    print("the file size is "+ str(size) +"KB")
    for chunk in respond_file.iter_content(1024):
        file.write(chunk)
        #print('',end='.')
        sys.stdout.write('.')
        sys.stdout.flush()
    print("")
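
If you want the dots to track bytes received rather than the number of chunks, a sketch along these lines may help. save_with_progress is a hypothetical helper, not something requests provides. It assumes chunks is any iterable of bytes objects (such as the result of respond_file.iter_content(1024)) and prints one dot for every 100 KB written.

```python
import sys


def save_with_progress(chunks, out_file, dot_every=102400, stream=sys.stdout):
    """Write each chunk to out_file, printing one dot per `dot_every` bytes.

    Returns the total number of bytes written.
    """
    total = 0
    dots = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty chunks
        out_file.write(chunk)
        total += len(chunk)
        # Print however many dots have become due since the last chunk.
        due = total // dot_every
        if due > dots:
            stream.write("." * (due - dots))
            stream.flush()  # flush so each dot appears as soon as it is earned
            dots = due
    stream.write("\n")
    stream.flush()
    return total
```

Inside the with block above, the loop could then be replaced by save_with_progress(respond_file.iter_content(1024), file). Decoupling the dot interval from the chunk size means a small chunk size no longer floods the terminal.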



Answer 3:


As @kevin writes in a comment, respond_file.content blocks execution until the whole content is downloaded. The only difference between my answer and his comment is that I'm not guessing ;)



Source: https://stackoverflow.com/questions/30056960/why-python-print-is-delayed
