This sounds like it could be due to the way you've implemented in-memory storage. If it's not thread-safe, the app might work fine in development, but when deployed with a WSGI server like gunicorn running several worker processes/threads, each with its own memory, it can lead to the strange behaviour you describe.
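To see why per-process memory causes this, here's a minimal sketch (plain stdlib, no Flask) of module-level state like a view function might use. Each worker process gets its own copy, so writes in one worker are invisible to the others and to the parent:

```python
import multiprocessing

# Module-level "in-memory storage", as a Flask view might use.
visits = {"count": 0}

def handle_request(_):
    # Each worker process increments its OWN copy of `visits`;
    # the other workers never see this change.
    visits["count"] += 1
    return visits["count"]

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        results = pool.map(handle_request, range(4))
    print(results)           # counts restart per worker process
    print(visits["count"])   # still 0: the parent's copy was never touched
```

With two gunicorn workers the same thing happens: requests land on whichever worker is free, each sees its own stale copy of the data, and the app appears to randomly "forget" state.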
What's more, Heroku is quirky.
Here's the output of `gunicorn --help` when installed on any old system through pip, which defaults to 1 worker if the `-w` flag is not provided:

```
-w INT, --workers INT
                      The number of worker processes for handling requests. [1]
```
However, when executed via the Heroku console, notice that it defaults to 2:

```
-w INT, --workers INT
                      The number of worker processes for handling requests. [2]
```
Heroku appear to have customised their gunicorn build for some reason (edit: figured out how), so the following Procfile launches with 2 workers:

```
web: gunicorn some:app
```

Whereas on a non-Heroku system this would launch with a single worker.
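If I recall correctly, gunicorn's default worker count is taken from the `WEB_CONCURRENCY` environment variable (falling back to 1), and Heroku sets that variable on dynos, which would explain the differing `[1]` vs `[2]` defaults. A rough sketch of that lookup logic (my approximation, not gunicorn's actual code):

```python
import os

# Approximates how gunicorn picks its default --workers value:
# use WEB_CONCURRENCY from the environment if set, else 1.
def default_workers(environ=os.environ):
    return int(environ.get("WEB_CONCURRENCY", 1))

print(default_workers({}))                        # 1, like a plain pip install
print(default_workers({"WEB_CONCURRENCY": "2"}))  # 2, like on a Heroku dyno
```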
You'll probably find the following Procfile will solve your issue:

```
web: gunicorn --workers 1 some:app
```
This is, of course, only suitable if it's a small project which doesn't need to scale to several workers. To mitigate the issue properly and scale the application, you'll need to make code changes that move the state into a separate storage backend (e.g. Redis) shared by all workers.
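One low-effort way to prepare for that move: hide the storage behind a tiny interface so the dict-backed version can later be swapped for a shared backend without touching view code. A minimal sketch (the Redis variant is hypothetical and assumes the `redis` package):

```python
class DictStore:
    """Process-local storage; only safe with a single worker."""
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

# Hypothetical shared-backend drop-in, assuming the redis package:
#
# class RedisStore:
#     def __init__(self, url):
#         import redis
#         self._r = redis.Redis.from_url(url)
#     def get(self, key, default=None):
#         value = self._r.get(key)
#         return value if value is not None else default
#     def set(self, key, value):
#         self._r.set(key, value)

store = DictStore()
store.set("count", 1)
print(store.get("count"))  # 1
```

Because every worker talks to the same Redis instance, state survives both multiple workers and dyno restarts, which the in-memory dict never will.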