Every senior Java programmer who experiences multithreaded programming needs to ask himself a question: How is Java’s built-in lock implemented? ReentrantLock is the simplest and most commonly used lock. With ReentrantLock, if the lock is not immediately added, the current thread will block and wait for another thread to release the lock and then try again. How do other threads wake up the current thread after releasing the lock? How did the current thread conclude that it was not locked successfully? This article will answer all of the above questions at their root

The thread blocking primitive

Java thread blocking and awakening are done through the Park and unpark methods of the Unsafe class.

public class Unsafe {...public native void park(boolean isAbsolute, long time);
  public native void unpark(Thread t); . }Copy the code

These two methods are native methods, which themselves are core functions realized by C language. Park means to park and let the currently running Thread thread.currentThread () sleep, and unpark means to unpark and wake up the specified Thread. Both methods are implemented at an underlying level using semaphore mechanisms provided by the operating system. Specific implementation process to study C code, here temporarily not to specific analysis. Two parameters to the park method control how long the sleep lasts. The first parameter isAbsolute indicates whether the second parameter isAbsolute or relative time, in milliseconds.

The thread runs from the moment it starts, and in addition to the operating system’s task scheduling policy, it only suspends when park is called. The secret to a lock suspending a thread is precisely because the lock calls the park method underneath.

parkBlocker

The Thread object Thread has an important property, parkBlocker, which stores what the current Thread is park for. It’s like a lot of cars in a parking lot, and the owners are there for an auction, and they get what they want, and then they drive away. ParkBlocker is the auction. It acts as a manager coordinator for a series of conflicting threads, controlling which threads should sleep and which should wake up.

class Thread {...volatileObject parkBlocker; . }Copy the code

This property is set to null when the thread is woken up by unpark. Park. park and unpark do not set the parkBlocker attribute for us. The utility class that manages this attribute is LockSupport, which simply wraps the Unsafe methods.

class LockSupport {...public static void park(Object blocker) {
     Thread t = Thread.currentThread();
     setBlocker(t, blocker);
     U.park(false.0L);
     setBlocker(t, null); // Wake up with null
  }

  public static void unpark(Thread thread) {
     if(thread ! =null) U.unpark(thread); }}... }Copy the code

Java lock data structures sleep and wake up by calling LockSupport. The value of the parkBlocker field in the thread object is the “queue manager” we’ll talk about below.

Queue manager

When multiple threads are competing for the same lock, there must be a queuing mechanism to chain the threads that failed to acquire the lock together. When the lock is released, the lock manager picks an appropriate thread to take possession of the newly released lock. Inside each lock is a queue manager, which maintains a queue of waiting threads. Already in the queue manager is AbstractQueuedSynchronizer, its internal waiting queue is a list of two-way structure, the structure of each node in the list of below.

class AbstractQueuedSynchronizer {
  volatile Node head;  // The first queue thread will get the lock first
  volatile Node tail;  // The thread that failed to capture the lock is appended to the end of the queue
  volatile int state; / / lock count
}

class Node {
  Node prev;
  Node next;
  Thread thread; // One thread per node
  
  // The following two special fields can be omitted
  Node nextWaiter; // A shared or exclusive lock is requested
  int waitStatus; // Fine state descriptor
}
Copy the code

When the lock fails, the current thread puts itself at the end of the wait list and calls locksupport. park to put itself to sleep. When other threads unlock, they take a node from the head of the linked list and call locksupport. unpark to wake it up.

AbstractQueuedSynchronizer class is an abstract class, it is all the locks the queue manager’s parent, the JDK in various forms of lock their internal queue manager are inherited this class, it is the core of concurrent Java world foundation. Subclasses include ReentrantLock, ReadWriteLock, CountDownLatch, Semaphone, and ThreadPoolExecutor’s internal queue manager. This abstract class exposes some abstract methods, and each lock needs to be customized to the manager. All concurrent data structures built into the JDK are protected by these locks, which are the foundation of the JDK’s multithreaded edifice.

The lock manager maintains an ordinary queue in the form of a two-way list. This data structure is simple, but it is quite complex to maintain carefully because it requires careful consideration of multi-threaded concurrency and every line of code is written with extreme care.

The JDK lock manager was implemented by Douglas S. Lea, a Java developer who wrote almost all of the packages single-handedly. In the algorithmic world, the more sophisticated things are the more suitable for one person to do.

Douglas S. Lea is professor of computer Science and current Chair of the Computer Science Department at the State University of New York at Oswego, specializing in concurrent programming and the design of concurrent data structures. He is an executive committee member of the Java Community Process and chairs JSR 166, which adds concurrency utilities to the Java programming language.

Later we will AbstractQueuedSynchronizer abbreviated as AQS. I must warn readers that THE AQS are so complex that it is normal to encounter setbacks in understanding them. At present, there is no book on the market that can easily understand AQS. Too few people can understand AQS, including myself.

Fair and unfair locks

A fair lock ensures the order in which the lock is requested and acquired. If the lock is in a free state at a certain point and a thread is trying to lock it, a fair lock must also check whether other threads are currently queuing, while a non-fair lock can jump the queue directly. Think of the queue for a burger at KFC.

If a lock is free, you may ask, how can it have queued threads? Let’s say that the thread holding the lock has just released the lock, and it wakes up the first node thread in the waiting queue. The awakened thread has just returned from the park method, and then it tries to lock. The state between the return from park and the lock is the free state of the lock. During this short period of time, other threads may also attempt to lock.

The second thing to note is that a thread that executes the lock. park method does not have to wait until another thread unpark itself to wake up. It can wake up at any time for some unknown reason. Looking at the source code comments, Park returns for four reasons

  1. Other threads unpark the current thread
  2. It’s time to wake up naturally (Park has time parameters)
  3. Other threads interrupt the current thread
  4. Other “false awakenings” due to unknown causes

The documentation does not specify what unknown causes the false wake, but it does state that when the park method returns it does not mean that the lock is free and that the awakened thread will park itself again after a failed attempt to acquire the lock. So the locking process needs to be written in a loop, and there may be several attempts before the lock is successfully obtained.

Unfair locks are more efficient than fair locks in the computer world, so Java default locks use unfair locks. However, in the real world, it seems that the efficiency of unfair locks will be worse. For example, if you can keep cutting in line at KFC, you can imagine that the scene must be chaotic. The difference between the computer world and the real world is probably because in the computer world one thread cutting in line does not cause other threads to complain.

public ReentrantLock(a) {
    this.sync = new NonfairSync();
}

public ReentrantLock(boolean fair) {
    this.sync = fair ? new FairSync() : new NonfairSync();
}
Copy the code

Shared locks and exclusive locks

A ReentrantLock lock is an exclusive lock that is held by one thread while all other threads must wait. A read lock in ReadWriteLock is not an exclusive lock. It allows multiple threads to hold a read lock simultaneously. This is a shared lock. Shared and exclusive locks are distinguished by the nextWaiter field in the Node class.

class AQS {
  static final Node SHARED = new Node();
  static final Node EXCLUSIVE = null;

  boolean isShared(a) {
    return this.nextWaiter == SHARED; }}Copy the code

So why isn’t this field named mode or type or just shared? This is because nextWaiter can be used differently in other scenarios, just like C syndication fields, except that Java doesn’t have syndication types.

Condition variables,

The first question to ask about condition variables is why do you need them? Isn’t locking enough? Consider the following pseudocode to do something when a condition is met

 void doSomething(a) {
   locker.lock();
   while(! condition_is_true()) {// Let's see if we can mess things up
     locker.unlock();  // If you can't do it, take a break and see if you can do it
     sleep(1);
     locker.lock(); // If you can do something, you need to lock it
   }
   justdoit();  / / do things
   locker.unlock();
 }
Copy the code

When the condition is not met, the loop is retried (other threads modify the condition by locking), but the sleep interval is needed otherwise the CPU will skyrockets due to idling. There’s a problem with how much sleep you can’t control. If the interval is too long, the overall efficiency will be slowed down, and even the timing will be missed (the condition will be reset immediately after it is met). If the interval is too short, the CPU will idle again. With conditional variables, this problem can be solved

void doSomethingWithCondition(a) {
  cond = locker.newCondition();
  locker.lock();
  while(! condition_is_true()) { cond.await(); } justdoit(); locker.unlock(); }Copy the code

Await () blocks on the cond condition variable until another thread calls either cond.signal() or cond.signalAll(). Await () blocks automatically releases the lock held by the current thread. Await () will try again to hold the lock (possibly queueing again) after being awaked, and await() method will return successfully after the lock has been obtained.

There can be multiple threads blocking on a condition variable, and these blocking threads are cascaded into a conditional wait queue. When signalAll() is called, it wakes up all the blocking threads, allowing them to start fighting for the lock again. If signal() is called, only the threads at the head of the queue wake up, thus avoiding the “stampede problem”.

The await() method must release the lock immediately, otherwise the critical section state cannot be modified by other threads and condition_is_true() returns the same result. This is why the condition variable must be created by the lock object. The condition variable needs to hold a reference to the lock object so that the lock can be released and re-locked after being woken up by signal. The lock that creates the condition variable must be an exclusive lock. If the shared lock is await(), there is no guarantee that the state of the critical section can be modified by other threads. This is why the newCondition method of the readWritelock. ReadLock class is defined as follows

public Condition newCondition(a) {
    throw new UnsupportedOperationException();
}
Copy the code

With the conditional variable, the problem of sleep’s poor control is solved. When the condition is met, the signal() or signalAll() methods are called, and the blocked thread can be woken up immediately with almost no delay.

Conditional wait queue

When multiple threads await() on the same condition variable, a conditional wait queue is formed. Multiple condition variables can be created for the same lock, and there will be multiple conditional wait queues. This queue is similar to the AQS queue structure, except that instead of a two-way queue, it is a one-way queue. The nodes in the queue are the same class as the nodes in the AQS wait queue, but the Pointers to the nodes are not prev and NEXT, but nextWaiter.

class AQS {...class ConditionObject {
    Node firstWaiter;  // point to the first node
    Node lastWaiter;  // point to the second node
  }
  
  class Node {
    static final int CONDITION = -2;
    static final int SIGNAL = -1;
    Thread thread;  // The thread currently waiting
    Node nextWaiter;  // points to the next conditional wait node
  
    Node prev;
    Node next;
    int waitStatus;  // waitStatus = CONDITION}... }Copy the code

Queue transfer

When the conditional variable’s signal() method is called, the thread of the head node of the conditional queue is woken up, the node is removed from the conditional queue, and then transferred to the AQS waiting queue, ready to queue up to try to reacquire the lock. The state of the node changes from CONDITION to SIGNAL, indicating that the current node was awakened by the CONDITION variable.

class AQS {...boolean transferForSignal(Node node) {
    // Reset the node status
    if(! node.compareAndSetWaitStatus(Node.CONDITION,0))
      return false
    Node p = enq(node); // Enter the AQS waiting queue
    int ws = p.waitStatus;
    // Change the status to SIGNAL
    if (ws > 0| |! p.compareAndSetWaitStatus(ws, Node.SIGNAL)) LockSupport.unpark(node.thread);return true; }... }Copy the code

The meaning of the nextWaiter field of the node to be transferred has also changed. In the conditional queue, it is the pointer to the next node, and in the AQS wait queue, it is the symbol of a shared lock or a mutex.

ReentrantLock Lock process

Next, we analyze the lock process in detail and understand the lock logic control in depth. I have to make sure that the code at Dough Lea is written in a minimalist form like the following, which is hard to read.

class ReentrantLock {...public void lock(a) {
        sync.acquire(1); }... }class Sync extends AQS {...public final void acquire(int arg) {
    if (!tryAcquire(arg) &&
      acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
         selfInterrupt();
  }
  ...
}
Copy the code

The if statement of acquire is divided into three parts. The tryAcquire method indicates that the current thread is trying to acquire a lock. If the lock fails, it will need to queue. The acquireQueued method is then called to start the loop of park, wake up and retry the lock, and park continues after the lock fails. The acquire method does not return until the lock succeeds.

The acquireQueued method returns true if another thread interrupts the loop retry lock. At this point, the thread calls the selfInterrupt() method to set the current thread with an interrupt bit.

// Interrupt the current thread to set an identifier bit
static void selfInterrupt(a) {
        Thread.currentThread().interrupt();
}
Copy the code

How does a thread know that it is interrupted by another thread? This is known by calling Thread.interrupted() after Park wakes up, but this method can only be called once because it clears the interrupt flag immediately after the call. This is why the acquire method calls selfInterrupt() to reset the interrupt flag bit. This allows the upper-level logic to know if it has been interrupted by thread.interrupted ().

The acquireQueued and addWaiter methods are provided by the AQS class, and tryAcquire needs to be implemented by the subclasses themselves. Different locks have different implementations. Let’s look at the implementation of ReentrantLock’s fair lock tryAcquire method

If (c == 0) means that the current lock is free and the count is zero. This is where you need to scramble for locks, as multiple threads may be calling tryAcquire at the same time. The thread that successfully changes the lock count from 0 to 1 will acquire the lock. The current thread is added to the ‘exclusiveOwnerThread’ file.

The code also contains a HasqueuedToraise () judgment, which is very important. It means to see if there are other threads queuing up on the current AQS wait queue. The Fair lock needs to check before it is mounted, and if there is a queue, it cannot cut the queue itself. However, non-fair locks do not need check, and all the implementation differences between fair and unfair locks lie in this. This single check determines whether the lock is fair or not.

Let’s look at the implementation of the addWaiter method. The mode parameter specifies whether the lock is shared or exclusive and corresponds to the Node.nextwaiter property.

Adding a new node to the end of the queue also requires consideration of multi-threaded concurrency, so the code again uses the CAS operation compareAndSetTail to compete for the end pointer. The thread that does not compete continues the next round of contention for(;;) Continue adding new nodes to the end of the queue using the CAS operation.

Let’s look at the code implementation of the acquireQueue method, which repeats the park cycle, tries to lock again, fails to lock and continues the park cycle.

private final boolean parkAndCheckInterrupt(a) {
    LockSupport.park(this);
    return Thread.interrupted();
}
Copy the code

As soon as the park returns and wakes up, the thread checks to see if it has been interrupted by another thread. But even if an interruption occurs, it will continue to try to acquire the lock, and if it does not, it will continue to sleep until the lock is acquired. This means that interrupting the thread does not result in a deadlock exit.

CancelAcquire () cancelAcquire() : The thread is in AQS waiting queue to be locked. When would an exception be thrown that would result in unlocking? The only possibility is the tryAcquire method, which is implemented by subclasses whose behavior is not controlled by AQS. When a subclass’s tryAcquire method throws an exception, the best way to handle AQS is to unlock it. CancelAcquire removes the current node from the wait queue.

ReentrantLock Unlocking process

Unlocking is simpler, with the lock count reduced to zero and the first valid node in the wait queue woken up.

public final boolean release(int arg) {
    if (tryRelease(arg)) {
        Node h = head;
        if(h ! =null&& h.waitStatus ! =0)
            unparkSuccessor(h);
         return true;
     }
     return false;
}

protected final boolean tryRelease(int releases) {
    int c = getState() - releases;
    // The bell must be tied
    if(Thread.currentThread() ! = getExclusiveOwnerThread())throw new IllegalMonitorStateException();
    boolean free = false;
    if (c == 0) {
        free = true;
        setExclusiveOwnerThread(null);
    }
    setState(c);
    return free;
}
Copy the code

Considering reentrant locks, it is necessary to determine whether the lock count is reduced to zero to determine whether the lock is fully released. A subsequent waiting node cannot be woken up until the lock is completely released. The unparksucceeded skips invalid nodes (cancelled ones), finds the first valid one and calls unpark() to wake up the corresponding thread.

Read-write lock

Read/write locks are divided into two lock objects: ReadLock and WriteLock. These two lock objects share the same AQS. The lock count variable state of AQS will be divided into two parts. The first 16 bits are the ReadLock count of the shared lock, and the last 16 bits are the WriteLock count of the mutex lock. A mutex lock records the number of reentries of the current write lock. A shared lock records the total number of reentries of all threads that currently hold the shared read lock.

Read-write locks also need to consider fair and unfair locks. A fair locking strategy for shared and mutex locks, like ReentrantLock, is to see if there are any other threads currently queued and go to the end of the queue. An unfair lock strategy is different in that it tends to give more opportunities to write locks. If there are any read-write requests in the current AQS queue, then the write lock can directly fight, but if the head of the queue is the write lock, then the read lock needs to give the opportunity to the write lock, to queue at the end of the queue. After all, read-write locks are suitable for situations where there are many reads and few writes, and the occasional write lock request should be treated with higher priority.

Write lock lock process

A write lock on a read/write lock is logically the same as a ReentrantLock, except for the tryAcquire() method

public final void acquire(int arg) {
    if(! tryAcquire(arg) && acquireQueued(addWaiter(Node.EXCLUSIVE), arg)) selfInterrupt(); }protected final boolean tryAcquire(int acquires) {
    Thread current = Thread.currentThread();
    int c = getState();
    int w = exclusiveCount(c);
    if(c ! =0) {
         if (w == 0|| current ! = getExclusiveOwnerThread())return false;
         if (w + exclusiveCount(acquires) > MAX_COUNT)
              throw new Error("Maximum lock count exceeded");
         setState(c + acquires);
         return true;
     }
     if(writerShouldBlock() || ! compareAndSetState(c, c + acquires))return false;
     setExclusiveOwnerThread(current);
     return true;
}
Copy the code

Write locks also need to consider reentrant, if the current THREAD holding the AQS mutex is the current thread to lock, then the write lock is reentrant, reentrant only need to increment the lock count value. When c! =0, that is, the lock count is not zero, either because the current AQS has read locks or because there are write locks, to determine w == 0 is to determine whether the current count is caused by the read lock.

If the count is zero, the lock scramble begins. Depending on whether the lock is fair, call writerShouldBlock() before the scramble to see if you need to queue. If not, you can use the CAS operation to scramble, and the thread that successfully sets the count from 0 to 1 will write the lock exclusively.

Read lock lock process

The process of read locking is much more complicated than write locking. It is the same as write locking in terms of overall flow, but there is a big difference in details. In particular, it needs to log read lock counts for each thread, which takes up a lot of code.

public final void acquireShared(int arg) {
    // If attempts to lock fail, queue down and retry
    if (tryAcquireShared(arg) < 0)
        // queue, loop retry
        doAcquireShared(arg);
}
Copy the code

If the current thread already holds a write lock, it can continue to add a read lock, which is the logic that must be supported in order to achieve lock degradation. Lock degradation refers to adding read locks and unlocking write locks when a write lock is held. Compared with write unlock and then read lock, this method can save the second queuing process of lock. Because of the lock degradation, the read and write count in the lock count can not be zero at the same time.

wlock.lock();
if(whatever) {
  / / demote
  rlock.lock();
  wlock.unlock();
  doRead();
  rlock.unlock();
} else {
  / / not downgrade
  doWrite()
  wlock.unlock();
}
Copy the code

To count locks for each read-lock thread, it sets a ThreadLocal variable.

private transient ThreadLocalHoldCounter readHolds;

static final class HoldCounter {
    int count;
    final long tid = LockSupport.getThreadId(Thread.currentThread());
}

static final class ThreadLocalHoldCounter
            extends ThreadLocal<HoldCounter> {
   public HoldCounter initialValue() {
        returnnew HoldCounter(); }}Copy the code

However, ThreadLocal variables are not accessed efficiently enough, so caching is set up again. It stores the lock count of the thread that last acquired the read lock. In cases where thread contention is not particularly frequent, it is more efficient to read the cache directly.

private transient HoldCounter cachedHoldCounter;
Copy the code

Dough Lea decided that using cachedHoldCounter wasn’t efficient enough, so he added a new cache record, firstReader, to record the first thread that changed the read lock count from 0 to 1 and the lock count. When there is no thread contention, it is more efficient to read these two fields directly.

private transient Thread firstReader;
private transient int firstReaderHoldCount;

final int getReadHoldCount(a) {
    // First access the read count part of the lock global count
    if (getReadLockCount() == 0)
        return 0;

    // Access firstReader again
    Thread current = Thread.currentThread();
    if (firstReader == current)
         return firstReaderHoldCount;

    // Reaccesses the count of the most recent read-thread lock
    HoldCounter rh = cachedHoldCounter;
    if(rh ! =null && rh.tid == LockSupport.getThreadId(current))
        return rh.count;

    // Read ThreadLocal
    int count = readHolds.get().count;
    if (count == 0) readHolds.remove();
    return count;
}
Copy the code

So we see that the author took great pains to keep track of this read lock count, but what does this read count do? The thread can use this count to determine whether or not it holds the read/write lock.

There is also a spin process, the so-called spin is the first failed lock, then retry directly, not sleep, sounds a bit like an endless loop retry method.

final static int SHARED_UNIT = 65536
// The read count is 16 bits high

final int fullTryAcquireShared(Thread current) {
  for(;;) {
    int c = getState();
    // If another thread adds a write lock, go back to sleep
    if(exclusiveCount(c) ! =0) {
        if(getExclusiveOwnerThread() ! = current)return -1; .// The count limit is exceeded
    if (sharedCount(c) == MAX_COUNT)
       throw new Error("Maximum lock count exceeded");
    if (compareAndSetState(c, c + SHARED_UNIT)) {
       // Get read lock.return 1}...// Loop retry}}Copy the code

Because the read lock requires CAS operation to modify the total read count value of the underlying lock, the successful one can obtain the read lock. The failure of CAS operation to obtain the read lock only means that CAS operation competition exists between read locks, but it does not mean that the lock cannot be obtained by itself. Try a few times and you’ll get the lock, and that’s where the spin comes in. There is also a cyclic retry of the CAS operation when the read lock is released.

protected final boolean tryReleaseShared(int unused) {...for (;;) {
       int c = getState();
       int nextc = c - SHARED_UNIT;
       if (compareAndSetState(c, nextc)) {
         return nextc == 0; }}... }Copy the code

summary

The foundation of Java concurrency is the park() and unpark() methods, volatile variables, synchronized, CAS operations, and AQS queues. There are a lot of details that I haven’t quite figured out myself, so we’ll talk more about locks later.