I have a simple function that goes over a list of URLs, using GET to retrieve some information and update the DB (PostgreSQL) accordingly. The function works, but going over each URL one at a time takes too much time.
Though using Celery may seem like overkill, it is a well-known way of doing asynchronous tasks. Essentially, Django serves the WSGI request-response cycle, which knows nothing of multiprocessing or background tasks.
Here are alternative options:
Currently I have a function (view) that goes over each URL to get the information and updates the DB.
This means response time does not matter to you, and instead of doing it in the background (asynchronously), you are OK with doing it in the foreground as long as the response time is cut by roughly 4 (using 4 sub-processes/threads). If that is the case, you can simply put your sample code in your view, like this:
from multiprocessing import Pool
from django.http import HttpResponse

def updateDB(ip):
    # GET the URL and update the corresponding DB row
    # code goes here...
    pass

def my_view(request):
    pool = Pool(processes=4)   # one process per core
    pool.map(updateDB, ip)     # 'ip' is the list of URLs to process
    return HttpResponse("SUCCESS")
But if you want to do it asynchronously in the background, then you should use Celery or follow one of @BasicWolf's suggestions.
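For reference, a minimal Celery setup for this kind of job could look something like the following sketch. The module name, task name, and view name are hypothetical, and it assumes a Celery app and broker are already configured:

# tasks.py (hypothetical module) - assumes a configured Celery app and broker
import requests
from celery import shared_task
from django.http import HttpResponse

@shared_task
def update_db_for_url(url):
    # fetch the URL and update the PostgreSQL row from the response
    response = requests.get(url)
    ...

# in the view: queue one task per URL and return immediately
def my_async_view(request):
    for url in urls:                  # 'urls' is the list of URLs to process
        update_db_for_url.delay(url)
    return HttpResponse("QUEUED")

With this, the view only enqueues work; the actual GETs and DB updates happen in the Celery worker process, so the response returns right away.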
I would recommend using gevent for a lightweight concurrency solution instead of multiprocessing. Multiprocessing can cause problems in production environments where spawning new processes is restricted.
Example code:
from gevent.pool import Pool
from django.http import HttpResponse

def square(number):
    return number * number

def home(request):
    pool = Pool(50)                      # up to 50 concurrent greenlets
    numbers = [1, 3, 5]
    results = pool.map(square, numbers)  # returns [1, 9, 25]
    return HttpResponse(results)
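Applied to the actual use case (fetching the URLs concurrently and updating the DB), a rough sketch could look like this. updateDB, the urls list, and the view name are assumptions, and the requests calls will only run concurrently if gevent's monkey patching is applied early (for example by running under a gevent-based worker such as gunicorn's gevent worker class):

import requests
from gevent.pool import Pool
from django.http import HttpResponse

def updateDB(url):
    # GET the URL and update the PostgreSQL row from the response
    response = requests.get(url)
    ...

def update_view(request):
    pool = Pool(50)            # limit concurrency to 50 greenlets
    pool.map(updateDB, urls)   # 'urls' is the list of URLs to process
    return HttpResponse("SUCCESS")

Unlike the multiprocessing version, this stays inside one process, so it avoids the restrictions on spawning new processes mentioned above.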