Django related objects are missing from celery task (race condition?)

囚心锁ツ 2021-01-24 22:03

Strange behavior that I don't know how to explain. I've got a model, Track, with some related points. I call a Celery task to perform some calculat

3 Answers
  • 2021-01-24 22:20

    You should NEVER pass model objects to Celery tasks. The object is serialized when the task is queued, so the worker receives a detached copy whose state may be stale or different from what your Django application sees, and it may be unavailable or behave badly. Send the id instead: pass something like track_id and fetch the object from the database inside the task. That should most likely solve your problem.

    from celery import shared_task


    @shared_task
    def my_task(track_id):
        # Re-fetch the row inside the worker instead of receiving the object
        track = Track.objects.get(pk=track_id)
        print('in the task, %s, %s' % (track.id, track.points.all().count()))


    def some_method():
        t = Track()
        t.save()
        t = fill_with_points(t)  # creating points, attaching them to a Track
        t.save()
        print('before the task, %s, %s' % (t.id, t.points.all().count()))
        my_task.delay(t.id)  # Pass the id here, not the object
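
To illustrate the point about ids, here is a minimal sketch in plain Python with no Django or Celery involved. The `database` dict, the helper names `enqueue` and `snapshot_vs_id_demo`, and the use of `json` as the message format are all assumptions made for illustration. It only shows that a serialized copy is frozen at enqueue time, while an id lets the task read the current row.

```python
import json


def enqueue(payload):
    """Pretend to publish a task message: arguments are serialized to text."""
    return json.dumps(payload)


def snapshot_vs_id_demo():
    # Hypothetical stand-in for the tracks table: id -> row.
    database = {1: {"id": 1, "point_count": 0}}

    # Option 1: put the whole row in the message (a frozen snapshot).
    message_with_row = enqueue(database[1])
    # Option 2: put only the primary key in the message.
    message_with_id = enqueue({"track_id": 1})

    # The row changes after the task was queued but before it runs.
    database[1]["point_count"] = 2971

    # Worker side: the snapshot is stale, the id leads to current data.
    stale_row = json.loads(message_with_row)
    fresh_row = database[json.loads(message_with_id)["track_id"]]
    return stale_row["point_count"], fresh_row["point_count"]


print(snapshot_vs_id_demo())  # (0, 2971)
```

Passing the id keeps the message small and makes the worker responsible for loading current data, but the row must already be committed by the time the task runs.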
    
  • 2021-01-24 22:33

    So, I've solved it using django-transaction-hooks. It still looks kinda scary to replace my DB backend, but django-celery-transactions seems to be broken in Django 1.6. Now my setup looks like this:

    settings.py:

    DATABASES = {
        'default': {
            'ENGINE': 'transaction_hooks.backends.postgresql_psycopg2',
            'NAME': 'foo',
        },
    }
    SOUTH_DATABASE_ADAPTERS = {'default': 'south.db.postgresql_psycopg2'}  # this is required, or South breaks
    

    models.py:

    from celery import shared_task
    from django.db import connection


    @shared_task
    def my_task(track):
        print('in the task, %s, %s' % (track.id, track.points.all().count()))


    def some_method():
        t = Track()
        t.save()
        t = fill_with_points(t)  # creating points, attaching them to a Track
        t.save()
        print('before the task, %s, %s' % (t.id, t.points.all().count()))
        # Queue the task only once the surrounding transaction has committed
        connection.on_commit(lambda: my_task.delay(t))
    

    Results:

    before the task, 21346, 2971
    in the task, 21346, 2971
    

    It still seems strange that such a common use case has no native celery or Django solution.
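
The idea behind an on-commit hook can be sketched in plain Python. This is a toy model, not the django-transaction-hooks implementation: `ToyTransaction`, `hook_demo`, and the event strings are hypothetical names. Later Django releases (1.9 onward) ship a built-in `transaction.on_commit` that serves the same purpose.

```python
class ToyTransaction:
    """Toy transaction that runs queued callbacks only after a commit."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, func):
        # Remember the callback; do not run it yet.
        self._callbacks.append(func)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        callbacks, self._callbacks = self._callbacks, []
        if exc_type is None:
            # "Commit": the data is now visible, so run the hooks.
            for func in callbacks:
                func()
        # On an exception ("rollback") the hooks are simply dropped.
        return False  # never swallow the exception


def hook_demo():
    events = []

    with ToyTransaction() as txn:
        events.append("save track")
        txn.on_commit(lambda: events.append("enqueue task"))
        events.append("save points")

    try:
        with ToyTransaction() as txn:
            txn.on_commit(lambda: events.append("never enqueued"))
            raise ValueError("simulated failure")
    except ValueError:
        events.append("rolled back")

    return events


print(hook_demo())
# ['save track', 'save points', 'enqueue task', 'rolled back']
```

The task is registered in the middle of the work but only queued after everything is saved, and a failed transaction never queues it at all.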

  • 2021-01-24 22:37

    I'm going to assume this is due to transaction isolation.

    Django transactions are commonly tied to requests (for example, with ATOMIC_REQUESTS enabled), and while a transaction is open, no other process sees its changes until it is committed. If you're in the middle of a save method and quite a lot of other work happens before the request finishes, it seems likely that Celery starts processing the task before the transaction is committed. You could fix this by committing manually or by delaying the task.
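
The visibility effect described here can be reproduced with two connections using only the standard library. This is a sketch under stated assumptions: SQLite stands in for the real database, the `point` table and the names `count_points` and `isolation_demo` are hypothetical, and locking details differ between database engines. One connection plays the web request, the other plays the Celery worker.

```python
import os
import sqlite3
import tempfile


def count_points(conn, track_id):
    """Count the points this connection can currently see for a track."""
    rows = conn.execute(
        "SELECT COUNT(*) FROM point WHERE track_id = ?", (track_id,)
    ).fetchall()
    return rows[0][0]


def isolation_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        # isolation_level=None: BEGIN and COMMIT are issued explicitly.
        web = sqlite3.connect(path, isolation_level=None)
        worker = sqlite3.connect(path, isolation_level=None)
        try:
            web.execute(
                "CREATE TABLE point (id INTEGER PRIMARY KEY, track_id INTEGER)"
            )

            # The "request" opens a transaction and writes three points.
            web.execute("BEGIN")
            web.executemany(
                "INSERT INTO point (track_id) VALUES (?)", [(1,), (1,), (1,)]
            )

            seen_by_web = count_points(web, 1)
            # The "worker" runs before the commit and sees nothing yet.
            seen_before_commit = count_points(worker, 1)

            web.execute("COMMIT")
            seen_after_commit = count_points(worker, 1)
        finally:
            web.close()
            worker.close()
    return seen_by_web, seen_before_commit, seen_after_commit


print(isolation_demo())  # (3, 0, 3)
```

The writer sees its own rows immediately, while the second connection only sees them after the commit, which matches a task that starts before the request's transaction finishes.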
