Race conditions in Django
Solution 1
Django 1.4+ supports select_for_update(); in earlier versions you can execute a raw SQL query such as SELECT ... FOR UPDATE. Depending on the underlying database, this locks the row against updates from other transactions, so you can do whatever you want with that row until the end of the transaction. For example:
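from django.db import transaction

# commit_manually exists in Django 1.4/1.5; from Django 1.6 on, use transaction.atomic instead
@transaction.commit_manually()
def add_points(request):
    # SELECT ... FOR UPDATE: the row stays locked until commit or rollback
    user = User.objects.select_for_update().get(id=request.user.id)
    # you can go back at this point if something is not right
    if user.points > 1000:
        # too many points: end the transaction explicitly so the lock is released
        transaction.rollback()
        return
    user.points += calculate_points(user)
    user.save()
    transaction.commit()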
Solution 2
As of Django 1.1 you can use the ORM's F() expressions to solve this specific problem.
from django.db.models import F
user = request.user
user.points = F('points') + calculate_points(user)
user.save()
For more details see the documentation:
https://docs.djangoproject.com/en/1.8/ref/models/expressions/#django.db.models.F
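An F() expression makes the database do the arithmetic inside the UPDATE statement, instead of Python reading the value and writing it back. The sketch below shows the same idea in plain SQL, using an in-memory SQLite table invented for this illustration (the users table and the add_points helper are assumptions, not Django's generated code).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER)")
conn.execute("INSERT INTO users (id, points) VALUES (1, 100)")


def add_points(user_id, delta):
    # The column is read and written within one statement, so no other
    # request can slip in between the read and the write.
    conn.execute(
        "UPDATE users SET points = points + ? WHERE id = ?",
        (delta, user_id),
    )


# Two "requests" in a row; neither relies on a value read earlier in Python.
add_points(1, 10)
add_points(1, 10)

final_points = conn.execute("SELECT points FROM users WHERE id = 1").fetchone()[0]
```

Note that this only helps when the increment does not depend on the current value of the row; if calculate_points needs to read points first, a lock is still required.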
Solution 3
Database locking is the way to go here. There are plans to add "select for update" support to Django, but for now the simplest approach would be to use raw SQL to UPDATE the user object before you start to calculate the score.
Pessimistic locking is now supported by Django 1.4's ORM when the underlying DB (such as Postgres) supports it. See the Django 1.4a1 release notes.
Solution 4
You have many ways to single-thread this kind of thing.
One standard approach is Update First. You do an update which will seize an exclusive lock on the row; then do your work; and finally commit the change. For this to work, you need to bypass the ORM's caching.
Another standard approach is to have a separate, single-threaded application server that isolates the Web transactions from the complex calculation.
Your web application can create a queue of scoring requests, spawn a separate process, and then write the scoring requests to this queue. The spawn can be put in Django's urls.py so it happens on web-app startup. Or it can be put into a separate manage.py admin script. Or it can be done "as needed" when the first scoring request is attempted.
You can also create a separate WSGI-flavored web server using Werkzeug which accepts WS requests via urllib2. If you have a single port number for this server, requests are queued by TCP/IP. If your WSGI handler has one thread, then you've achieved serialized single-threading. This is slightly more scalable, since the scoring engine is a WS request and can be run anywhere.
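The queue-based variant can be sketched with the standard library alone. This is a minimal in-process illustration, assuming a single worker thread stands in for the separate scoring process; the points dictionary, the user name, and the stub calculate_points are hypothetical.

```python
import queue
import threading

points = {"alice": 100}          # stand-in for the users table
scoring_requests = queue.Queue()


def calculate_points(user):
    # Placeholder for the complicated calculation from the question.
    return 10


def scoring_worker():
    # Only this thread ever touches `points`, so updates are serialized.
    while True:
        user = scoring_requests.get()
        if user is None:         # sentinel: no more work
            break
        points[user] += calculate_points(user)


worker = threading.Thread(target=scoring_worker)
worker.start()

# Fifty concurrent "web requests" just enqueue work and return.
producers = [
    threading.Thread(target=scoring_requests.put, args=("alice",))
    for _ in range(50)
]
for producer in producers:
    producer.start()
for producer in producers:
    producer.join()

scoring_requests.put(None)       # enqueued last, so it is processed last
worker.join()

final_points = points["alice"]
```

Because the web side never reads or writes the score itself, there is nothing left to race on; the cost is that the result is not available within the request that triggered it.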
Yet another approach is to have some other resource that has to be acquired and held to do the calculation.
A Singleton object in the database. A single row in a unique table can be updated with a session ID to seize control; update with a session ID of None to release control. The essential update has to include a WHERE SESSION_ID IS NULL filter to ensure that the update fails when the lock is held by someone else. This is interesting because it's inherently race-free: it's a single update, not a SELECT-UPDATE sequence.
A garden-variety semaphore can be used outside the database. Queues (generally) are easier to work with than a low-level semaphore.
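The singleton lock row might look roughly like this. It is a sketch against an in-memory SQLite database; the scoring_lock table and the acquire/release helpers are names invented here, and in SQL the "nobody holds it" test is written IS NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scoring_lock (id INTEGER PRIMARY KEY, session_id TEXT)")
conn.execute("INSERT INTO scoring_lock (id, session_id) VALUES (1, NULL)")


def acquire(session_id):
    # One UPDATE both tests and sets the holder; it matches zero rows
    # when another session already holds the lock.
    cursor = conn.execute(
        "UPDATE scoring_lock SET session_id = ? "
        "WHERE id = 1 AND session_id IS NULL",
        (session_id,),
    )
    return cursor.rowcount == 1


def release(session_id):
    # Only the current holder can clear the lock.
    cursor = conn.execute(
        "UPDATE scoring_lock SET session_id = NULL "
        "WHERE id = 1 AND session_id = ?",
        (session_id,),
    )
    return cursor.rowcount == 1


first = acquire("session-a")       # lock is free, so this succeeds
second = acquire("session-b")      # held by session-a, so this fails
released = release("session-a")
third = acquire("session-b")       # free again, so this succeeds
```

A caller that fails to acquire has to decide whether to retry or give up, and a crashed holder leaves the row set, so a real version would likely need a timeout column as well.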
Solution 5
This may be oversimplifying your situation, but what about just a JavaScript link replacement? In other words when the user clicks the link or button wrap the request in a JavaScript function which immediately disables / "greys out" the link and replaces the text with "Loading..." or "Submitting request..." info or something similar. Would that work for you?
Fragsworth
Developer of Clicker Heroes, Cloudstone, and other games http://www.clickerheroes.com/ http://www.kongregate.com/games/nexoncls/cloudstone http://armorgames.com/cloudstone-game/15364
Updated on June 28, 2021
Comments
-
Fragsworth almost 3 years
Here is a simple example of a django view with a potential race condition:
# myapp/views.py
from django.contrib.auth.models import User
from my_libs import calculate_points

def add_points(request):
    user = request.user
    user.points += calculate_points(user)
    user.save()
The race condition should be fairly obvious: A user can make this request twice, and the application could potentially execute user = request.user simultaneously, causing one of the requests to override the other.
Suppose the function calculate_points is relatively complicated, and makes calculations based on all kinds of weird stuff that cannot be placed in a single update and would be difficult to put in a stored procedure.
So here is my question: What kind of locking mechanisms are available to Django, to deal with situations similar to this?
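The lost update can be reproduced deterministically by interleaving the two requests by hand. This sketch uses an in-memory SQLite table in place of the Django model; the helper names and the fixed increment of 10 are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER)")
conn.execute("INSERT INTO users (id, points) VALUES (1, 100)")


def read_points(user_id):
    row = conn.execute("SELECT points FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0]


def write_points(user_id, value):
    conn.execute("UPDATE users SET points = ? WHERE id = ?", (value, user_id))


# Both requests read the row before either one writes it back.
seen_by_request_a = read_points(1)
seen_by_request_b = read_points(1)

# Each adds its 10 points to the stale value it read.
write_points(1, seen_by_request_a + 10)
write_points(1, seen_by_request_b + 10)

# 20 points were awarded, but only 10 survive.
final_points = read_points(1)
```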
-
Fragsworth about 15 years
I would prefer a "database-agnostic" solution if it is at all possible.
-
orokusaki almost 12 years
@transaction.commit_on_success + QuerySet.select_for_update()
-
-
Van Gale about 15 years
Great answer. Somehow access to the database row has to be serialized and I think queues are more scalable than locks. @Fragsworth: see this project for a simple to use implementation of queues in Django that uses RabbitMQ: ask.github.com/celery/introduction.html
-
SashaN about 15 years
-1 It still does not protect the site. From time to time users use HTTP clients other than browsers, e.g. a user might use wget to fetch a given URL, and then disabling the URL with JavaScript won't save you. JavaScript should be used just to make the page user-friendly if you want to, but you should not use it to fix problems within the server-side application.
-
Wayne Koorts about 15 years
@SashaN: The poster didn't say that this wouldn't only be accessed through a web browser. We can't immediately assume all other exception cases like wget. I also prefixed the answer with "This may be oversimplifying your situation..." to cover the exception cases, as this suggestion may well be a suitable solution for many. Think also of future viewers of this question who may have a slightly different scenario in which this answer might be just the ticket. I certainly don't accept that it deserves a "not helpful" vote, but I do appreciate you at least providing a reason.
-
Jason Webb over 13 years
The F() expressions still don't allow you to add a conditional on the update. So you couldn't, say, increase the user's points only if they are still active.
-
Alex Lokk almost 11 years
Looks like there has been a patch for this feature for a long time: code.djangoproject.com/ticket/2705. I recently applied it to Django 1.3.5 (for a large project, which is hard to migrate to 1.4).
-
Ivan Virabyan over 10 years
I'm wondering how this is best implemented as a method of the User class (to be reusable in other places, not just in that view). The problem for me is that the calling code must still make the select_for_update() call, but I'd like it to be encapsulated in the user's method.
-
Nandhini over 10 years
@IvanVirabyan Either add a specific method to the User class, e.g. get_user, or, if you want to be more generic and override all objects queries, write a custom ModelManager.
-
RichVel over 10 years
Note that Django 1.4's select for update will lock rows from all tables in the query (SQL lets you specify a subset of tables) - see groups.google.com/forum/#!topic/django-users/p1qnpz-S9xA. Good article on this approach, written before select_for_update() made it into Django 1.4 - coderanger.net/2011/01/select-for-update
-
Reporter about 10 years
An explanation of your intention would improve your answer.
-
Ekevoo about 9 years
"Thou Shall Not Trust The Client Side"
-
NoobEditor almost 7 years
Nope... this would fail if you have an update inside a for loop!
-
Mark Mishyn over 4 yearsYou also can use F() with update:
User.objects.filter(id=user.id).update(points=F('points') + points)