In this article, we’ll focus on deadlocks and their solutions.

An overview of the

In the last article, we discussed how to use a mutex to protect multiple resources, using bank account transfers as an example, and the solution was to create a mutex based on a Class object.

This solves the synchronization problem, but can it be used in real life? And the answer is no, especially in the case of high concurrency, the reason is we use the mutex is too large, the range of transfer, for example, we will lock the account Class object, which can lead to transfer operation can only be a serial, but in the actual scene, a large number of transfer operation business of both sides is not the same, Locking directly at the Class object level is not acceptable.

So what’s the problem with fine-grained locking at the object instance level? Deadlocks can occur.

Let’s look at the causes of deadlocks and possible solutions.

Deadlock case

What is a deadlock?

A deadlock is when a group of threads competing for resources are blocked “permanently” by waiting for each other.

In general, when fine-grained locking is used, it can lead to deadlocks as well as performance improvements.

Let’s take a bank transfer as an example to see how a deadlock can occur.

First, let’s define a BankAccount object to store basic information. The code is as follows.

public class BankAccount { private int id; private double balance; private String password; public int getId() { return id; } public void setId(int id) { this.id = id; } public double getBalance() { return balance; } public void setBalance(double balance) { this.balance = balance; }}Copy the code

Next, we use fine-grained locks to try to complete the transfer operation, as shown below.

public class BankTransferDemo {
	
	public void transfer(BankAccount sourceAccount, BankAccount targetAccount, double amount) {
		synchronized(sourceAccount) {
			synchronized(targetAccount) {
				if (sourceAccount.getBalance() > amount) {
					System.out.println("Start transfer.");
					System.out.println(String.format("Before transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance()));
					sourceAccount.setBalance(sourceAccount.getBalance() - amount);
					targetAccount.setBalance(targetAccount.getBalance() + amount);
					System.out.println(String.format("After transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance()));
				}
			}
		}
	}
}
Copy the code

Let’s do a simple test with the following code.

public static void main(String[] args) throws InterruptedException { BankAccount sourceAccount = new BankAccount(); sourceAccount.setId(1); sourceAccount.setBalance(50000); BankAccount targetAccount = new BankAccount(); targetAccount.setId(2); targetAccount.setBalance(20000); BankTransferDemo obj = new BankTransferDemo(); Thread t1 = new Thread(() ->{ for (int i = 0; i < 10000; i++) { obj.transfer(sourceAccount, targetAccount, 1); }}); Thread t2 = new Thread(() ->{ for (int i = 0; i < 10000; i++) { obj.transfer(targetAccount, sourceAccount, 1); }}); t1.start(); t2.start(); t1.join(); t2.join(); System.out.println("Finished."); }Copy the code

The test code contains two threads, where the T1 thread circulates from sourceAccount to targetAccount and the T2 thread circulates from targetAccount to sourceAccount.

According to the running result, when the loop in T1 thread runs about 600 times, t2 thread is also created and begins to cycle transfer. At this time, deadlock will occur, causing t1 thread and T2 thread can not continue to execute.

Deadlocks can be more visually described using the resource allocation diagram below.

Deadlock causes and prevention

Once a concurrent program is deadlocked, there is usually no good way, many times we can only restart the application, therefore, the best way to solve the deadlock problem is to avoid the deadlock.

Let’s start by looking at the conditions under which Deadlocks can occur. In 1971, a brilliant guy named Coffman published an article in ACM Computing Surveys called System Deadlocks, which concluded that Deadlocks can only happen if all four conditions are met:

  • Mutually exclusive. Shared resources X and Y can only be occupied by one thread.
  • Thread T1 has acquired shared resource X and does not release shared resource X while waiting for shared resource Y.
  • No preemption. Other threads cannot forcibly preempt resources held by thread T1.
  • A circular wait, in which thread T1 waits for resources held by thread T2, and thread T2 waits for resources held by thread T1, is called a circular wait.

From the above description, we can deduce that deadlocks can be avoided by breaking just one of the above conditions.

But the first condition is mutually exclusive and can’t be broken, otherwise we won’t need locks, so let’s see how we can break the other three conditions.

Break the occupation and wait condition

If we want to break the hold and wait condition, we can try to apply for all resources at once so that there is no need to wait.

During the implementation, we need to create a new role that is responsible for both requesting and releasing all resources at the same time, which we can call Allocator.

Let’s look at the actual code implementation.

public class Allocator { private volatile static Allocator instance; private Allocator() {} public static Allocator getInstance() { if (instance == null) { synchronized(Allocator.class) { if (instance == null) { instance = new Allocator(); } } } return instance; } private Set<Object> lockObjs = new HashSet<Object>(); public synchronized boolean apply(Object... objs) { for (Object obj : objs) { if (lockObjs.contains(obj)) { return false; } } for (Object obj : objs) { lockObjs.add(obj); } return true; } public synchronized void free(Object... objs) { for (Object obj : objs) { if (lockObjs.contains(obj)) { lockObjs.remove(obj); }}}}Copy the code

Allocator is a singleton pattern that uses a Set object to hold all resources that need to be processed, and then uses apply() and free() to lock or release all resources at the same time, which receive unfixed arguments.

Let’s look at how the new transfer() method should be written.

	public void transfer(BankAccount sourceAccount, BankAccount targetAccount, double amount) {
		Allocator allocator = Allocator.getInstance();
		while(!allocator.apply(sourceAccount, targetAccount));
		try {
			synchronized(sourceAccount) {
				synchronized(targetAccount) {
					if (sourceAccount.getBalance() > amount) {
						System.out.println("Start transfer.");
						System.out.println(String.format("Before transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance()));
						sourceAccount.setBalance(sourceAccount.getBalance() - amount);
						targetAccount.setBalance(targetAccount.getBalance() + amount);
						System.out.println(String.format("After transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance()));
					}
				}
			}
		}
		finally {
			allocator.free(sourceAccount, targetAccount);
		}
	}
Copy the code

As you can see, the Transfer () method takes the Allocator instance, then calls apply(), passing in the sourceAccount and targetAccount instances. Note that the while loop is used, i.e. until apply() returns true, At this point, Allocator has locked the sourceAccount and targetAccount. Next, we use the synchronized keyword to lock the sourceAccount and targetAccount, and then execute the transfer logic. Synchronized is not necessary, but it prevents other operations from affecting the transfer, such as the concurrency issues that can arise if money is withdrawn from the sourceAccount instance during the transfer.

Here is the test code.

public static void main(String[] args) throws InterruptedException { BankAccount sourceAccount = new BankAccount(); sourceAccount.setId(1); sourceAccount.setBalance(50000); BankAccount targetAccount = new BankAccount(); targetAccount.setId(2); targetAccount.setBalance(20000); BankTransferDemo obj = new BankTransferDemo(); Thread t1 = new Thread(() ->{ for (int i = 0; i < 10000; i++) { obj.transfer(sourceAccount, targetAccount, 1); }}); Thread t2 = new Thread(() ->{ for (int i = 0; i < 10000; i++) { obj.transfer(targetAccount, sourceAccount, 1); }}); t1.start(); t2.start(); t1.join(); t2.join(); System.out.println("Finished."); }Copy the code

The program worked and the results were as we expected.

Here, we need to ensure the immutability of the lock object. For the BankAccount object, the ID attribute can be regarded as its primary key. From the perspective of business, BankAccount instances with the same ID point to the same account. This leads to concurrency problems.

Let’s look at the modified test code below.

public static void main(String[] args) throws InterruptedException { BankTransferDemo obj = new BankTransferDemo(); Thread t1 = new Thread(() ->{ for (int i = 0; i < 10000; I++) {// the account instance should be fetched from the back end, for demonstration purposes only. BankAccount sourceAccount = new BankAccount(); sourceAccount.setId(1); sourceAccount.setBalance(50000); BankAccount targetAccount = new BankAccount(); targetAccount.setId(2); targetAccount.setBalance(20000); obj.transfer(sourceAccount, targetAccount, 1); }}); Thread t2 = new Thread(() ->{ for (int i = 0; i < 10000; I++) {// the account instance should be fetched from the back end, for demonstration purposes only. BankAccount sourceAccount = new BankAccount(); sourceAccount.setId(1); sourceAccount.setBalance(50000); BankAccount targetAccount = new BankAccount(); targetAccount.setId(2); targetAccount.setBalance(20000); obj.transfer(targetAccount, sourceAccount, 1); }}); t1.start(); t2.start(); t1.join(); t2.join(); System.out.println("Finished."); }Copy the code

In the above code, a new instance of BankAccount is created for each transfer and passed to Allocator. This does not work because the mutex used each time applies to a different instance.

Break the non-preemption condition

Breaking the non-preemption condition is simple, and the key to the solution is to be able to actively release the resources it occupies, but synchronized cannot do this.

Synchronized When applying for resources, if the application fails, the thread is blocked and cannot do anything, and the locked resources cannot be released.

We can do this using Lock objects in the java.util.concurrent package, as shown below.

private Lock lock = new ReentrantLock(); public void transfer(BankAccount sourceAccount, BankAccount targetAccount, double amount) { try { lock.lock(); if (sourceAccount.getBalance() > amount) { System.out.println("Start transfer."); System.out.println(String.format("Before transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance())); sourceAccount.setBalance(sourceAccount.getBalance() - amount); targetAccount.setBalance(targetAccount.getBalance() + amount); System.out.println(String.format("After transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance())); } } finally { lock.unlock(); }}Copy the code

Failure cycle condition

To break a loop condition, you need to sort the resources and then apply for them in order.

Let’s look at the following code.

public void transfer(BankAccount sourceAccount, BankAccount targetAccount, double amount) { BankAccount left = sourceAccount; BankAccount right = targetAccount; if (sourceAccount.getId() > targetAccount.getId()) { left = targetAccount; right = sourceAccount; } synchronized(left) { synchronized(right) { if (sourceAccount.getBalance() > amount) { System.out.println("Start transfer."); System.out.println(String.format("Before transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance())); sourceAccount.setBalance(sourceAccount.getBalance() - amount); targetAccount.setBalance(targetAccount.getBalance() + amount); System.out.println(String.format("After transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance())); }}}}Copy the code

In this case, we assume that the ID in BankAccount is the primary key, we sort the sourceAccount and targetAccount by ID, and then apply for resources by ID from smallest to largest, so that no deadlocks occur.

There may be several ways to solve a concurrency problem, and we need to evaluate each solution and choose the one with the lowest cost.

Breaking the loop condition might be a good solution for the transfer example we’ve been talking about.

Use a wait-notification mechanism

We used the following loop above to break the occupation and wait condition:

while(! allocator.apply(sourceAccount, targetAccount));Copy the code

In low concurrency cases, this is fine, but in high concurrency cases, this may require too many cycles to get the lock, which is cpu-intensive and can be considered a brute force type.

In this case, a reasonable solution is to block itself if the condition is not met and enter the wait state. When the condition is met, the waiting thread is told to execute again. In this case, the thread blocking avoids the problem of loop CPU consumption.

This is the wait-notification mechanism we’re talking about.

Wait-notification mechanism in Java

What is the wait-notification mechanism flow in Java?

The thread first obtains the mutex. When the conditions required by the thread are not met, the mutex is released and the thread enters the waiting state. When the required condition is met, the waiting thread is notified to reacquire the mutex.

Java uses synchronized keyword and wait(), notify(), and notifyAll() to implement the wait-notification mechanism.

In concurrent programs, when a thread enters a critical section, it needs to enter a wait state because some condition is not met. The Wait () method of Java objects enables this. The notify() and notifyAll() methods of Java objects can notify the waiting thread when the desired condition is met. They tell the thread that the condition has been met, because notify() can only guarantee that the condition is met at the moment of notification. However, the execution time of the notified thread does not coincide with the notification time, so the condition may not be satisfied when the thread starts to execute.

Also note that when the notified thread executes again, it also needs to acquire the mutex, since the mutex was released when the wait() method was called earlier.

Wait (), notify(), and notifyAll() can be called only if the response mutex has been obtained, so all three methods are called inside synchronized{}.

Let’s take a look at the modified Allocator with the code for the apply() and free() methods as follows.

	public synchronized void apply(Object... objs) {
		for (Object obj : objs) {
			while (lockObjs.contains(obj)) {
				try {
					this.wait();
				} catch (InterruptedException e) {
					System.out.println(e.getMessage());
				}
			}
		}
		for (Object obj : objs) {
			lockObjs.add(obj);
		}
	}
	
	public synchronized void free(Object... objs) {
		for (Object obj : objs) {
			if (lockObjs.contains(obj)) {
				lockObjs.remove(obj);
			}
		}
		this.notifyAll();
	}
Copy the code

The code for the corresponding Transfer () method is shown below.

public void transfer(BankAccount sourceAccount, BankAccount targetAccount, double amount) { Allocator allocator = Allocator.getInstance(); allocator.apply(sourceAccount, targetAccount); try { synchronized(sourceAccount) { synchronized(targetAccount) { if (sourceAccount.getBalance() > amount) { System.out.println("Start transfer."); System.out.println(String.format("Before transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance())); sourceAccount.setBalance(sourceAccount.getBalance() - amount); targetAccount.setBalance(targetAccount.getBalance() + amount); System.out.println(String.format("After transfer, source balance:%s, target balance:%s", sourceAccount.getBalance(), targetAccount.getBalance())); } } } } finally { allocator.free(sourceAccount, targetAccount); }}Copy the code

The results are exactly what we expected.

Conditions have been met

In the code above, we can see that the criteria in the apply() method, which was previously if, are now changed to while, while (lockobjs.contains (obj)), which solves the problem that the criteria used to be satisfied.

Because by the time wait() returns, it is possible that the condition has changed so that it was satisfied but is no longer so that the condition is rechecked.

It’s a paradigm, it’s a classic approach.

notify() vs notifyAll()

What’s the difference between notify() and notifyAll()?

Notify () randomly notifies one thread in the wait queue, and notifyAll() notifies all threads in the wait queue.

We try to use notifyAll() because notify() can cause some threads to never be notified.

Suppose we have an instance that has resources A, B, C, and D, and we use the instance object to create the mutex.

  • Thread T1 has applied for A and B
  • Thread T2 applies to C and D
  • Thread T3 attempts to apply for A and B, fails, and enters the wait queue
  • Thread T4 attempts to apply for C and D, fails, and enters the waiting queue
  • At this point, thread T1 completes execution and releases the lock
  • Thread T1 calls notify() of the instance to notify the thread in the waiting queue. Thread T4 may be notified, but thread T4 applies for C and D and is still occupied by thread T2, so thread T4 can only wait
  • At this point, thread T2 completes execution and releases the lock
  • Thread T2 calls notify() of the instance to notify the threads in the waiting queue that only one of T3 or T4 can be woken up and executed properly, and the other one will never be woken up again

The difference between wait() and sleep()

The wait() method differs from the sleep() method in that the wait() method releases the object’s “lock flag.” When the wait() method of an object is called, the current thread is suspended and put into the wait pool until notify() is called, and any thread is removed from the wait pool and put into the lock flag wait pool. Only threads in the lock flag wait pool can obtain the lock flag. They are ready to fight for ownership of the lock. When the notifyAll() method of an object is called, all threads in the object’s wait pool are moved to the lock flag wait pool of that object.

The sleep() method, which requires a specified wait time, allows the currently executing thread to pause and block for a specified period of time. This method allows other threads of the same or higher priority to execute, as well as lower priority threads. But the sleep() method does not release the “lock flag,” which means that if a synchronized block is present, other threads still cannot access the shared data.

To summarize, the differences between wait() and sleep() are as follows.

  • Wait () releases resources, sleep() does not release resources
  • Wait () needs to be awakened, sleep() does not
  • Wait () is the method of object’s top-level parent, and sleep() is the method of Thread

Both wait() and sleep() allow CPU execution time to be scheduled again!