Take a look at the concurrency problems with Django (older versions) and their basic workarounds
juejin.cn/post/684490…
The days of single-user desktop systems are over. Web applications now serve millions of users, and with many users comes a whole new class of problems: concurrency.
In this article, I’ll cover two ways to manage concurrency in Django models. To demonstrate common concurrency problems, we’ll use a bank account model:
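from django.contrib.auth.models import User  # assumed: Django's built-in User model
from django.db import models


class Account(models.Model):
    id = models.AutoField(  # the account id
        primary_key=True,
    )
    user = models.ForeignKey(  # the account owner
        User,  # Django 2.0 and later also require an on_delete argument here
    )
    balance = models.IntegerField(  # the current balance
        default=0,
    )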
To start, we provide simple deposit and withdraw methods on the account instance:
# deposit
def deposit(self, amount):
    self.balance += amount
    self.save()

# withdraw
def withdraw(self, amount):
    if amount > self.balance:
        raise errors.InsufficientFunds()
    self.balance -= amount
    self.save()
This seems simple enough, and it might even pass unit tests and integration tests on localhost. But what happens when two users perform operations on the same account at the same time?
1. User A fetches the account - the balance is $100.
2. User B fetches the account - the balance is $100.
3. User B withdraws $30 - the balance is updated to $100 - $30 = $70.
4. User A deposits $50 - the balance is updated to $100 + $50 = $150.
What’s going on here?
User B withdrew $30 and user A deposited $50. We expected a balance of $120, but ended up with $150.
Why is that?
In step 4, when user A updates the balance, the amount held in memory is out of date (user B has already withdrawn $30). To prevent this from happening, we need to make sure that the resource we are working on does not change while we are working on it.
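The four steps can be replayed without Django at all. The sketch below is only an illustration: the `database` dict and the `AccountCopy` class are hypothetical stand-ins for the table row and for a model instance, and `ValueError` stands in for the article's `errors.InsufficientFunds`.

```python
# Hypothetical in-memory stand-in for the account row.
database = {"balance": 100}


class AccountCopy:
    """A copy of the row as one request loaded it, like a model instance."""

    def __init__(self):
        # The read happens once, at fetch time.
        self.balance = database["balance"]

    def save(self):
        # Blindly overwrites whatever is stored, like a plain save().
        database["balance"] = self.balance

    def deposit(self, amount):
        self.balance += amount
        self.save()

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        self.save()


user_a = AccountCopy()  # step 1: user A fetches the account, sees 100
user_b = AccountCopy()  # step 2: user B fetches the account, sees 100
user_b.withdraw(30)     # step 3: the stored balance becomes 70
user_a.deposit(50)      # step 4: user A still holds 100, so 150 is written

final_balance = database["balance"]
print(final_balance)
```

User B's withdrawal is silently overwritten, which is the classic "lost update".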
Pessimistic approach
The pessimistic approach dictates that you should lock the resource exclusively until you are finished with it. If nobody else can acquire a lock on the object while you are working on it, you can be sure that the object has not changed. We use database locks for several reasons:
1. Databases are very good at managing locks and maintaining consistency.
2. The database is the lowest level of access to the data. Acquiring the lock at the lowest level also protects the data from other processes that modify it, such as direct updates in the DB, cron jobs and cleanup tasks.
3. A Django application can run on multiple processes (e.g. workers). Maintaining locks at the application level would require a lot of (unnecessary) work.
To lock an object in Django, we use select_for_update. Let’s use the pessimistic approach to implement a safe deposit and withdraw:
from django.db import transaction

@classmethod
def deposit(cls, id, amount):
    with transaction.atomic():
        # Lock the row until the transaction ends.
        account = (
            cls.objects
            .select_for_update()
            .get(id=id)
        )
        account.balance += amount
        account.save()
    return account

@classmethod
def withdraw(cls, id, amount):
    with transaction.atomic():
        account = (
            cls.objects
            .select_for_update()
            .get(id=id)
        )
        if account.balance < amount:
            raise errors.InsufficientFunds()
        account.balance -= amount
        account.save()
    return account
What do we have here:
1. We use select_for_update on our queryset to tell the database to lock the object until the transaction is done.
2. Locking a row in the database requires a database transaction. We use Django’s transaction.atomic(), here as a context manager, to scope the transaction.
3. We use class methods instead of instance methods. We tell the database to lock the row and it returns the locked object to us, so we need to fetch the object from the database. If we operated on self, we would be working on an object that was already fetched from the database, with no guarantee that it was locked.
4. All operations on the account are executed within the database transaction.
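Point 3 is easy to miss, so here is a small sketch of why the read has to happen after the lock is taken. It is an analogy only: `database` and `row_lock` are hypothetical stand-ins for the account row and the database row lock, and a `threading.Lock` is not what Django uses under the hood.

```python
import threading

database = {"balance": 100}   # hypothetical stand-in for the account row
row_lock = threading.Lock()   # hypothetical stand-in for the database row lock

# Like `self`: a value that was fetched before any lock was taken.
stale_balance = database["balance"]

# Meanwhile another request withdraws 30 while holding the lock.
with row_lock:
    database["balance"] -= 30

# Locking now does not help, because the calculation uses the old value.
with row_lock:
    database["balance"] = stale_balance + 50
balance_from_stale_copy = database["balance"]

# Reset to the state after the withdrawal, then read inside the lock instead.
database["balance"] = 70
with row_lock:
    fresh_balance = database["balance"]
    database["balance"] = fresh_balance + 50
balance_from_locked_read = database["balance"]

print(balance_from_stale_copy, balance_from_locked_read)
```

The lock only protects values that were read while it was held, which is why the class methods fetch the account themselves.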
Let’s see how the scenario from earlier is prevented with our new implementation:
1. User A asks to withdraw $30:
   - User A acquires a lock on the account.
   - The balance is $100.
2. User B asks to deposit $50:
   - The attempt to lock the account fails (it is locked by user A).
   - User B waits for the lock to be released.
3. User A withdraws $30 successfully:
   - The balance is $70.
   - User A’s lock on the account is released.
4. User B acquires a lock on the account:
   - The balance is $70.
   - The new balance is $70 + $50 = $120.
5. User B’s lock is released, and the balance is $120.
The bug is gone!
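The walk-through above can be replayed with two threads. This is a sketch under stated assumptions, not how Django implements it: `row_lock` is a hypothetical stand-in for the lock taken by select_for_update, leaving the `with` block plays the role of the transaction ending, and the `time.sleep` calls only simulate slow work so that the two requests really overlap.

```python
import threading
import time

database = {"balance": 100}   # hypothetical stand-in for the account row
row_lock = threading.Lock()   # hypothetical stand-in for the row lock


def withdraw(amount):
    with row_lock:                      # wait for the lock, release it on exit
        balance = database["balance"]   # read only after the lock is held
        time.sleep(0.05)                # simulate slow work while locked
        if balance < amount:
            raise ValueError("insufficient funds")
        database["balance"] = balance - amount


def deposit(amount):
    with row_lock:
        balance = database["balance"]
        time.sleep(0.05)
        database["balance"] = balance + amount


user_a = threading.Thread(target=withdraw, args=(30,))
user_b = threading.Thread(target=deposit, args=(50,))
user_a.start()
user_b.start()
user_a.join()
user_b.join()

final_balance = database["balance"]
print(final_balance)
```

Whichever thread gets the lock first, the other one waits and then works on the updated balance, so the result is 120 either way.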
What you need to know about select_for_update:
1. In our scenario, user B waited for user A to release the lock. Instead of waiting, we can tell Django not to wait for the lock to be released and to raise a DatabaseError. To do that, we set the nowait parameter of select_for_update to True: select_for_update(nowait=True).
2. Related objects are locked as well: when using select_for_update with select_related, the related objects are also locked.
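The difference between waiting and nowait described above can be sketched with the same kind of stand-in lock. Everything here is hypothetical and for illustration only: `row_lock` imitates the row lock, `lock_row` imitates the locking query, and `LockNotAvailable` plays the role that DatabaseError plays in Django.

```python
import threading

row_lock = threading.Lock()  # hypothetical stand-in for a database row lock


class LockNotAvailable(Exception):
    """Hypothetical error, standing in for Django's DatabaseError."""


def lock_row(nowait=False):
    # blocking=True waits for the lock; blocking=False returns False at once.
    acquired = row_lock.acquire(blocking=not nowait)
    if not acquired:
        raise LockNotAvailable("row is locked by another transaction")


lock_row()                     # user A takes the lock
try:
    lock_row(nowait=True)      # user B refuses to wait
    outcome = "locked"
except LockNotAvailable:
    outcome = "gave up"
row_lock.release()             # user A finishes and releases the lock

print(outcome)
```

Failing fast like this lets the caller decide what to do next, for example report that the account is busy, instead of holding a request open.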
Optimistic approach
Unlike the pessimistic approach, the optimistic approach does not require locking the object. It assumes that collisions are not very common, and dictates that you only need to make sure the object has not changed at the time it is updated.
How do we use Django to do something like this?
First, we add a column to track changes made to the object:
version = models.IntegerField(
    default=0,
)
Then, when we update an object, we make sure the version hasn’t changed:
def deposit(self, amount):
    # The update only matches if the version is still the one we loaded.
    updated = Account.objects.filter(
        id=self.id,
        version=self.version,
    ).update(
        balance=self.balance + amount,
        version=self.version + 1,
    )
    return updated > 0

def withdraw(self, amount):
    if self.balance < amount:
        raise errors.InsufficientFunds()
    updated = Account.objects.filter(
        id=self.id,
        version=self.version,
    ).update(
        balance=self.balance - amount,
        version=self.version + 1,
    )
    return updated > 0
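Under the hood, the version check boils down to a conditional UPDATE whose affected row count tells us whether we won. The sketch below shows that idea with Python's built-in sqlite3 module instead of the Django ORM. The table layout and the helper names `fetch` and `try_update` are assumptions made for this example, and the retry at the end is just one way a caller might react to a failed update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO account (id, balance, version) VALUES (1, 100, 0)")
conn.commit()


def fetch(account_id):
    """Return (balance, version) as currently stored."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def try_update(account_id, new_balance, seen_version):
    """Write only if the row still has the version we read; report success."""
    cursor = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, seen_version),
    )
    conn.commit()
    return cursor.rowcount > 0


a_balance, a_version = fetch(1)   # user A reads the account
b_balance, b_version = fetch(1)   # user B reads the same state

b_ok = try_update(1, b_balance - 30, b_version)          # B withdraws 30
a_first_try = try_update(1, a_balance + 50, a_version)   # A's version is stale

a_retry = False
if not a_first_try:
    # One possible reaction: read the fresh state and try again.
    a_balance, a_version = fetch(1)
    a_retry = try_update(1, a_balance + 50, a_version)

final_balance, final_version = fetch(1)
print(b_ok, a_first_try, a_retry, final_balance, final_version)
```

Nothing is ever locked here. The price is that the losing side has to notice the failed update and decide what to do about it.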
And so on…