Python stream from FTP server to Flask server for downloading

Posted by £可爱£侵袭症+ on 2019-12-25 02:22:10

Question


I have a Python Flask app that receives a request to download a file from a remote FTP server. I use a BytesIO object to hold the contents of the file, which is fetched from the FTP server with retrbinary:

import os

from flask import Flask, request, send_file
from ftplib import FTP
from io import BytesIO

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

@app.route('/download_content', methods=['GET'])
def download_content():
    filepath = request.args.get("filepath").strip()
    f = FTP(my_server)
    f.login(my_username, my_password)
    b = BytesIO()
    f.retrbinary("RETR " + filepath, b.write)
    b.seek(0)
    return send_file(b, attachment_filename=os.path.basename(filepath))

app.run("localhost", port=8080)

The issue is that when the download_content route is hit, the entire file is first read into the BytesIO object, and only then is it sent to the frontend for downloading.

How can I stream the file to the frontend while it is still being downloaded from the FTP server? I can't wait for the whole file to land in the BytesIO object before calling send_file, as that would be both memory inefficient and slower.

I have read that Flask's send_file accepts a generator object, but how can I make the BytesIO object yield its contents to send_file in chunks?


Answer 1:


It looks like you will need to set up a worker thread to manage the download from retrbinary.

I have made a quick Gist for this, as we came across the same problem. This method seems to work.

https://gist.github.com/Richard-Mathie/ffecf414553f8ca4c56eb5b06e791b6f

from ftplib import FTP
from queue import Queue, Empty
from threading import Thread, Event

class FTPDownloader(object):
  def __init__(self, host, user, password, timeout=0.01):
    self.ftp = FTP(host)
    self.ftp.login(user, password)
    self.timeout = timeout  # seconds to wait on the queue before re-checking

  def getBytes(self, filename):
    # Runs in the worker thread: retrbinary calls self.bytes.put once per block.
    self.ftp.retrbinary("RETR {}".format(filename), self.bytes.put)
    self.bytes.join()   # wait for all blocks in the queue to be processed
    self.finished.set() # mark streaming as finished

  def sendBytes(self):
    # Generator consumed by the web framework: yields blocks as they arrive.
    while not self.finished.is_set():
      try:
        yield self.bytes.get(timeout=self.timeout)
        self.bytes.task_done()
      except Empty:
        # Nothing queued yet: wait briefly, then check again.
        self.finished.wait(self.timeout)
    self.worker.join()

  def download(self, filename):
    self.bytes = Queue()
    self.finished = Event()
    self.worker = Thread(target=self.getBytes, args=(filename,))
    self.worker.start()
    return self.sendBytes()

You should probably add some timeouts and logic to handle connections timing out, etc., but this is the basic form.
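
For completeness, here is a rough sketch of how a generator like the one returned by download() might be wired into the Flask route from the question. It is not part of the Gist. It assumes a Flask Response built from an iterable is streamed chunk by chunk, and it uses a hypothetical create_app helper with a hypothetical chunk_source parameter so the route can be exercised without a real FTP server. The Content-Disposition header is built naively and does not escape unusual filenames.

```python
import os

from flask import Flask, Response, abort, request, stream_with_context


def create_app(chunk_source):
    """Build an app whose download route streams from `chunk_source`.

    `chunk_source` is a hypothetical callable that maps a remote path to
    an iterable of bytes objects.
    """
    app = Flask(__name__)

    @app.route("/download_content", methods=["GET"])
    def download_content():
        filepath = request.args.get("filepath", "").strip()
        if not filepath:
            abort(400)  # no path was supplied
        filename = os.path.basename(filepath)
        # Passing an iterable to Response streams it instead of buffering it.
        return Response(
            stream_with_context(chunk_source(filepath)),
            mimetype="application/octet-stream",
            headers={
                "Content-Disposition": 'attachment; filename="{}"'.format(filename)
            },
        )

    return app
```

With the class above, chunk_source could be something like lambda path: FTPDownloader(host, user, password).download(path), where host, user and password stand in for your own credentials. Creating one FTPDownloader per request keeps concurrent downloads from sharing a queue.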

Explanation

An empty Queue does not guarantee that the worker thread running getBytes has finished, so you need a semaphore/Event to tell the generator sendBytes when the worker is done. However, the worker has to wait for all the blocks in the queue to be processed first, hence the self.bytes.join() before finished is set.

I would be interested to hear if anyone can think of a more elegant way of doing this.
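
One possible simplification, offered as a sketch rather than a tested replacement: have the worker put a sentinel object on the queue when it is done, so the generator needs neither the Event nor the polling timeout. The names stream_from_callback, producer and maxsize are made up for this example. The queue is bounded so that a slow client throttles the download instead of letting blocks pile up in memory. One known gap: if the consumer abandons the generator early, the worker stays blocked on put() (it is a daemon thread, so it will not keep the process alive).

```python
from queue import Queue
from threading import Thread

_DONE = object()  # sentinel marking the end of the stream


def stream_from_callback(producer, maxsize=16):
    """Turn a callback-style producer into a generator of chunks.

    `producer` is any callable that takes a one-argument callback and
    calls it once per chunk, for example:
        lambda cb: ftp.retrbinary("RETR " + filename, cb)
    """
    chunks = Queue(maxsize=maxsize)  # bounded: put() blocks while full
    errors = []

    def work():
        try:
            producer(chunks.put)
        except Exception as exc:
            errors.append(exc)  # hand the failure over to the consumer
        finally:
            chunks.put(_DONE)   # always signal the end of the stream

    worker = Thread(target=work, daemon=True)
    worker.start()

    while True:
        chunk = chunks.get()
        if chunk is _DONE:
            break
        yield chunk

    worker.join()
    if errors:
        raise errors[0]  # re-raise the worker's error in the consumer
```

In a Flask view, the resulting generator could be returned as Response(stream_from_callback(...), mimetype="application/octet-stream").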



Source: https://stackoverflow.com/questions/51024276/python-stream-from-ftp-server-to-flask-server-for-downloading
