Celery and transaction.atomic

Solution 1

"Separate task" = something that is ran by a worker.

"Celery worker" = another process.

I am not aware of any method that would let you share a single database transaction between two or more processes. What you want is to run the task synchronously, inside that transaction, and wait for the result... but if that's what you want, why do you need a task queue at all?

Solution 2

As @dotz mentioned, it is hardly useful to spawn an asynchronous task and then immediately block, waiting for it to finish.

Moreover, if you block on the task this way (the .get() at the end), you can be sure that the changes just made to the mymodel instance won't be seen by your worker, because they won't have been committed yet - remember, you are still inside the atomic block.
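
To make the failure mode concrete, here is a sketch of the two sides of the deadlock. The worker-side task body, the app instance, and the MyModel name are assumptions - the question doesn't show them - but any task that waits for the new row to appear behaves this way:

# Caller side (inside the view):
with transaction.atomic():
    mymodel.save()                    # the row exists only inside this transaction
    mytask.delay(mymodel.id).get()    # blocks; the transaction cannot commit

# Worker side (another process, with its own database connection):
@app.task(bind=True, max_retries=None)
def mytask(self, pk):
    try:
        obj = MyModel.objects.get(pk=pk)   # cannot see the uncommitted row
    except MyModel.DoesNotExist:
        # retries forever, waiting for a commit that never comes,
        # because the caller's .get() above never returns
        raise self.retry(countdown=1)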

What you could do instead (from Django 1.9) is delay the task until after the current active transaction is committed, using django.db.transaction.on_commit hook:

from django.db import transaction

with transaction.atomic():
    mymodel.save()
    # The task is queued only once the transaction commits, so the
    # worker is guaranteed to see the saved row.
    transaction.on_commit(lambda: mytask.delay(mymodel.id))

I use this pattern quite often in my post_save signal handlers that trigger some processing of new model instances. For example:

from django.db import transaction
from django.db.models.signals import post_save
from django.dispatch import receiver
from . import models   # Your models defining some Order model
from . import tasks   # Your tasks defining a routine to process new instances

@receiver(post_save, sender=models.Order)
def new_order_callback(sender, instance, created, **kwargs):
    """ Automatically triggers processing of a new Order. """
    if created:
        transaction.on_commit(lambda: tasks.process_new_order.delay(instance.pk))

This way, however, your task won't be executed at all if the database transaction fails. That is usually the desired behavior, but keep it in mind.
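
For example, if the block raises, the registered callback is discarded along with the rest of the transaction. A minimal sketch, reusing the Order model and task from above and assuming Order can be created without arguments:

from django.db import transaction
from . import models, tasks

try:
    with transaction.atomic():
        order = models.Order.objects.create()
        transaction.on_commit(lambda: tasks.process_new_order.delay(order.pk))
        raise RuntimeError("force a rollback")  # hypothetical failure
except RuntimeError:
    pass

# The transaction rolled back, so the on_commit callback was discarded
# and process_new_order was never queued.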

Edit: It's actually nicer to register the on_commit Celery task this way (without a lambda):

transaction.on_commit(tasks.process_new_order.s(instance.pk).delay)
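
The signature created by .s(instance.pk) captures the primary key immediately, whereas a lambda closes over instance and only evaluates instance.pk when the callback fires after commit; passing the bound signature's .delay avoids both the extra closure and any late-binding surprises.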

Solution 3

There are known race conditions when Celery tasks are dispatched from inside database transactions; that is most likely the explanation of your problem.

Take a look at the Celery docs on database transactions. There are also packages like django-celery-transactions that can help with this.
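
For reference, django-celery-transactions works by providing its own task decorator, so that .delay() calls made inside a transaction are held until commit and dropped on rollback. A rough sketch based on that package's README - verify the import path against the version you install; the package targets older Django/Celery releases and predates transaction.on_commit:

from djcelery_transactions import task

@task
def process_new_order(order_pk):
    # .delay() calls issued inside an open transaction are held in
    # memory and only sent to the broker after the transaction
    # commits; they are discarded if it rolls back.
    ...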

Comments

  • user85461 about 2 years

    In some Django views, I used a pattern like this to save changes to a model, and then to do some asynchronous updating (such as generating images, further altering the model) based on the new model data. mytask is a celery task:

    with transaction.atomic():
        mymodel.save()
        mytask.delay(mymodel.id).get()
    

    The problem is that the task never returns. Looking at Celery's logs, the task gets queued (I see "Received task" in the log), but it never completes. If I move the mytask.delay(...).get() call out of the transaction, it completes successfully.

    Is there some incompatibility between transaction.atomic and celery? Is it possible in Django 1.6 or 1.7 for me to have both regular model updates and updates from a separate task process under one transaction?

    My database is PostgreSQL 9.1. I'm using celery==3.1.16 / django-celery 3.1.16, amqp==1.4.6, Django==1.6.7, kombu==3.0.23. The broker backend is AMQP, with RabbitMQ as the queue.

  • user85461 over 9 years
    Thanks -- those docs do point to a potential source of the problem. django-celery-transactions looks good if I want to skip the async task if the regular model update raises an exception -- but I'm looking for a way to rollback the whole transaction if the async task fails. As far as I can tell, django-celery-transactions leaves the work in the task to a separate transaction.
  • zmbq over 8 years
    Well, on Windows the DTC supports it.
  • user85461 over 7 years
    Why I need the task queue: stackoverflow.com/a/7543682/85461. Because of an esoteric problem with OS signals from Java subprocesses interacting with WSGI, it can sometimes be necessary to use a separate process even when you need a synchronous result.
  • valex about 5 years
    Thank you so much for the Celery signature. For some reason the lambda syntax produced incorrect behaviour for me - Celery ran tasks with the wrong arguments (the same one duplicated).
  • WhyNotHugo about 4 years
    Lambdas work fine for me, but this .s pattern is a lot cleaner! Thanks!