I'm building a web service for iterative batch processing of data using CherryPy. The ideal workflow is as follows:
Don't run the background task with CherryPy's BackgroundTask. It runs in a thread of the same process and, because of the GIL, a CPU-bound task competes with the threads that serve requests, so CherryPy may stop answering new requests. Use a queue solution that runs your background tasks in a different process, like Celery or RQ.
I'll work through an example using RQ in detail. RQ uses Redis as its message broker, so first of all you need to install and start Redis.
Then create a module (mytask in my example) containing the long-running background functions:
import time


def long_running_task(value):
    # Simulate slow work, then return something derived from the input.
    time.sleep(15)
    return len(value)
Start one RQ worker (or several, if you want to run tasks in parallel). The Python interpreter that runs the workers must be able to import your mytask module, so export PYTHONPATH before starting the worker if the module is not already on the path:
# rq worker
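Before starting the worker, it can help to confirm that the interpreter it will use can actually find the task module. This is a minimal sketch using only the standard library; `module_is_importable` is a hypothetical helper name, and the check is only meaningful when run with the same Python and the same PYTHONPATH as the worker:

```python
import importlib.util


def module_is_importable(name):
    """Return True if the top-level module `name` is on the import path."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "mytask" is the module name used in this answer.
    print("mytask importable:", module_is_importable("mytask"))
```

If this reports False, the worker will most likely fail to import the task function when it picks up a job.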
Below is a very simple CherryPy webapp that shows how to use the RQ queue:
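import cherrypy
from redis import Redis
from rq import Queue

from mytask import long_running_task


class BackgroundTasksWeb(object):

    def __init__(self):
        # Default Redis connection and RQ's default queue.
        self.queue = Queue(connection=Redis())
        # Jobs are tracked in memory only, so the list is lost on restart.
        self.jobs = []

    @cherrypy.expose
    def index(self):
        # NOTE: the HTML inside these strings was lost from the post. The
        # markup here is a reconstruction (an assumption): a form that
        # submits "q" to /job and an iframe that shows /results.
        html = ['<html>', '<body>']
        html += ['<form action="job" method="post">']
        html += ['<input name="q" type="text" /><input type="submit" /></form>']
        html += ['<iframe width="100%" src="/results"></iframe>']
        html += ['</body>', '</html>']
        return '\n'.join(html)

    @cherrypy.expose
    def results(self):
        # Reconstructed markup again; the 2-second auto-refresh is an assumption.
        html = ['<html>', '<head>',
                '<meta http-equiv="refresh" content="2" />', '</head>', '<body>']
        html += ['<ul>']
        html += ['<li>job:{} status:{} result:{} input:{}</li>'.format(
            j.get_id(), j.get_status(), j.result, j.args[0]) for j in self.jobs]
        html += ['</ul>']
        html += ['</body>', '</html>']
        return '\n'.join(html)

    @cherrypy.expose
    def job(self, q):
        # Hand the work to an RQ worker process and return immediately.
        job = self.queue.enqueue(long_running_task, q)
        self.jobs.append(job)
        raise cherrypy.HTTPRedirect("/")


cherrypy.quickstart(BackgroundTasksWeb())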
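One thing the results page above does not do is escape the submitted text before writing it back into the page, so input that contains markup would be interpreted by the browser. Here is a minimal sketch of an escaped list item using the standard library's `html.escape`; `render_job_item` is a hypothetical helper, not part of CherryPy or RQ, and it takes plain values rather than a job object so it can be tried without Redis:

```python
from html import escape


def render_job_item(job_id, status, result, user_input):
    """Build one list item for the results page with every value escaped."""
    fields = (job_id, status, result, user_input)
    # str() first: result is None until the job has finished.
    safe = [escape(str(f)) for f in fields]
    return '<li>job:{} status:{} result:{} input:{}</li>'.format(*safe)
```

In the results handler this could be called as `render_job_item(j.get_id(), j.get_status(), j.result, j.args[0])` for each job.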
In a production webapp I would use the Jinja2 template engine to generate the HTML, and most likely websockets to update the job status in the web browser.
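Whether the browser polls or listens on a websocket, the server needs a compact description of each job to send. Below is a rough sketch of such a payload, built from the same job accessors the example already uses (`get_id`, `get_status`, `result`, `args`). The names `job_snapshot` and `jobs_as_json` are hypothetical, and the result is assumed to be JSON-serializable (it is an integer for long_running_task):

```python
import json


def job_snapshot(job):
    """Collect the fields shown on the results page into a plain dict."""
    return {
        "id": job.get_id(),
        "status": job.get_status(),
        "result": job.result,  # None until the job has finished
        "input": job.args[0],
    }


def jobs_as_json(jobs):
    """Serialize a list of jobs for a polling endpoint or websocket message."""
    return json.dumps([job_snapshot(j) for j in jobs])
```

An exposed CherryPy method could return `jobs_as_json(self.jobs)` and leave the rendering to the page's JavaScript.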