I have a django application running with uwsgi (with 10 workers) + nginx. I am using apscheduler for scheduling purposes. Whenever I schedule a job, it is executed multiple times.
Let's consider the following facts:
(1) UWSGI, by default, pre-loads your Django App into the UWSGI Master process' memory BEFORE forking its workers.
(2) UWSGI "forks" workers from the master, meaning they are essentially copied into the memory of each worker. Because of how fork()
is implemented, a Child process (i.e. a worker) does not inherit the threads of a Parent.
(3) When you call BackgroundScheduler.start(), a thread is created which is responsible for executing jobs on whatever worker/master calls this function.
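To see fact (2) concretely, here is a minimal, self-contained sketch (plain Python with os.fork(), nothing Django- or UWSGI-specific; the function and message names are just illustrative) showing that a thread started in the parent simply isn't running in the forked child:

    import os
    import threading
    import time

    def tick():
        # Background thread: prints once per second in whichever process runs it.
        while True:
            print(f"tick from pid {os.getpid()}")
            time.sleep(1)

    threading.Thread(target=tick, daemon=True).start()
    time.sleep(2)  # the thread is clearly ticking in the parent by now

    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: only the main thread survives the fork, so the ticking stops here.
        print("child thread count:", threading.active_count())  # 1 (main thread only)
        os._exit(0)
    else:
        # Parent: the ticking thread keeps running alongside the main thread.
        print("parent thread count:", threading.active_count())  # 2
        time.sleep(3)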
All you must do is call BackgroundScheduler.start() on the Master process, before any workers are created. By doing so, when the workers are created, they WILL NOT INHERIT the BackgroundScheduler thread (#2 above), and thus will not execute any jobs (but they can still schedule/modify/delete jobs by communicating with the jobstore!).
To do this, just make sure you call BackgroundScheduler.start() in whatever function/module instantiates your app. For instance, in the following Django project structure, we'd (likely) want to execute this code in wsgi.py, which is the entry point for the UWSGI server:
mysite/
    manage.py
    mysite/
        __init__.py
        settings.py
        urls.py
        wsgi.py
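A wsgi.py along these lines is a reasonable sketch; the MongoDB host/port, database and collection names are placeholders for whatever your project actually uses. The point is simply that start() runs exactly once, in the master, when the module is imported:

    # mysite/wsgi.py -- imported once by the UWSGI master (as long as lazy-apps is off)
    import os

    from django.core.wsgi import get_wsgi_application
    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.jobstores.mongodb import MongoDBJobStore

    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

    application = get_wsgi_application()

    # One shared, persistent jobstore (MongoDB) -- see the pitfalls below.
    scheduler = BackgroundScheduler(jobstores={
        'default': MongoDBJobStore(database='mysite', collection='jobs',
                                   host='localhost', port=27017),
    })
    scheduler.start()  # the scheduler thread lives in the master only

Because the workers are forked after this module is imported, each one carries a copy of the scheduler object but not its thread.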
Pitfalls:
Don't "initializ[e] apscheduler in urls of the django application.... This will start the scheduler when application starts." These may be loaded by each worker, and thus start()
is executed multiple times.
Don't start the UWSGI server in "lazy-apps" mode; this will load the app in each of the workers, after they are created.
Don't run the BackgroundScheduler with the default (memory) jobstore. This will create split-brain syndrome between all workers. You want to enforce a single-point-of-truth, like you are with MongoDB, for all CRUD operations performed on jobs.
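As a rough sketch of how a worker can still talk to that shared jobstore (the view name, job id, and send_report task below are made up, and the connection details are assumed to match wsgi.py), one option is to start a second scheduler in a paused state, so it can read/write jobs in MongoDB without ever executing them:

    # myapp/views.py -- runs inside a worker (hypothetical example)
    from django.http import JsonResponse
    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.jobstores.mongodb import MongoDBJobStore

    from myapp.tasks import send_report  # placeholder for the function the job runs

    # Same jobstore configuration as in wsgi.py, but started paused:
    # this scheduler can add/modify/remove jobs yet never runs any of them.
    worker_scheduler = BackgroundScheduler(jobstores={
        'default': MongoDBJobStore(database='mysite', collection='jobs',
                                   host='localhost', port=27017),
    })
    worker_scheduler.start(paused=True)

    def schedule_report(request):
        # Persist the job in MongoDB; the master's running scheduler executes it.
        worker_scheduler.add_job(send_report, 'interval', minutes=30,
                                 id='report', replace_existing=True)
        return JsonResponse({'scheduled': True})

Bear in mind the master's scheduler only re-reads the store when it next wakes up, so jobs added from a worker may not be noticed immediately.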
This post may give you more detail, albeit in a Gunicorn (WSGI server) environment.