With the growth of business and user numbers, high concurrency often becomes a thorny problem that programmers have to face. Concurrent programming is relatively advanced and obscure territory, and writing good concurrent programs is not easy. Java programmers may be a little luckier here: the rich set of synchronization primitives and the well-crafted synchronization utility classes that ship with Java lower the bar for writing correct and efficient concurrent code. But this high degree of encapsulation and abstraction, while it simplifies writing programs, also gets in the way of understanding the internal implementation. So let's draw an analogy with locks in the real world and see what kind of thing a lock is in the world of programs.

Locks in the program world

If someone asks you, “How do I keep strangers out of my house?”, you will probably answer without thinking, “Just lock it!” And if someone asks, “How do I handle concurrent access from multiple threads?”, you might just as quickly blurt out, “Just lock it!” The real-world scenario is easy to picture, but in the world of programming the same words raise a pile of questions. We have all seen locks in the real world, but what does a lock look like in Java? In the real world we need a key to open a lock and enter the house; what is the key in the program world? In the real world a lock hangs on a door or a cabinet; where does a lock live in a program? In the real world it is we who lock and unlock; who does the locking and unlocking in the program world?

With these questions in mind, let's take a closer look at what locks are in Java. Where to start? A lock exists to be used, so let's start from how a lock is used and work our way in.

How locks are used

When it comes to locks in Java, there are generally two kinds: Synchronized, the JVM-level synchronization primitive, and the classes that implement the Lock interface at the Java API level. API-level locks such as ReentrantLock and ReentrantReadWriteLock have very readable source code; you can go and study how they are implemented, and you may well find answers to the questions above there. Here we will focus on Synchronized.

Let’s start with the following code:

public class LockTest
{
    Object obj = new Object();

    public static synchronized void testMethod1()
    {
        // Synchronized code
    }

    public synchronized void testMethod2()
    {
        // Synchronized code
    }

    public void testMethod3()
    {
        synchronized (obj)
        {
            // Synchronized code
        }
    }
}

Many concurrent programming books summarize the use of Synchronized as follows:

  • When Synchronized modifies a static method (testMethod1), it locks the class object of the current class, in this case the LockTest.class object.
  • When Synchronized modifies an instance method (testMethod2), it locks the current instance, i.e. the object referenced by this (here, the LockTest instance).
  • When Synchronized modifies a synchronized block (testMethod3), it locks the object named in the parentheses of the block, in this case the obj object.

From this we can see that Synchronized always depends on some specific object, which suggests that there is some association between locks and objects. The next step, then, is to see whether the object itself holds any clues about the lock.

Object composition

Everything in Java is an object, just as your “object” (significant other) has long hair and big eyes (maybe that is just imagination)… An object in Java consists of three parts: the object header, the instance data, and the alignment padding.

The instance data is the space occupied by the fields we define in the class. Alignment padding exists because the HotSpot virtual machine requires an object's size to be a multiple of 8 bytes; if the object does not come out to a multiple of 8 bytes, it is padded up to one. Neither of these two regions looks like it has much to do with locks, so the lock should have some relationship to the object header.

Let’s look at the contents of the object header:

Length      Content                  Description
32/64 bit   Mark Word                Stores the object's HashCode or lock information
32/64 bit   Class Metadata Address   A pointer to the object's type (class) metadata
32/64 bit   Array Length             The length of the array, if the current object is an array
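
If you would like to see the header, instance data, and padding for yourself, the OpenJDK JOL tool can print an object's layout. Below is a minimal sketch; it assumes the org.openjdk.jol:jol-core dependency is on the classpath, and the exact output depends on your JDK version and VM flags:

import org.openjdk.jol.info.ClassLayout;

public class ObjectLayoutDemo
{
    public static void main(String[] args)
    {
        Object obj = new Object();
        // Prints the object header (Mark Word, class pointer), instance data and alignment padding
        System.out.println(ClassLayout.parseInstance(obj).toPrintable());

        synchronized (obj)
        {
            // Inside the synchronized block the Mark Word changes to reflect the lock state
            System.out.println(ClassLayout.parseInstance(obj).toPrintable());
        }
    }
}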

Take a 32-bit virtual machine as an example (the 64-bit case is analogous). The Mark Word is only four bytes long, yet it has to hold the HashCode, the lock information and more. The claim that the lock fits entirely inside those four bytes was completely untrue before JDK 1.6, and is only partially true since JDK 1.6.

Why do you say that?

This is because Java threads map one-to-one to operating system threads. To protect itself and prevent arbitrary use of privileged instructions, the operating system divides execution into user mode and kernel mode; the threads we normally run stay in user mode, and whenever they need an operating system service (a system call, such as read or write) they cannot invoke it directly from user mode but must switch from user mode to kernel mode. Synchronized was called a heavyweight lock in its early days precisely because locking and unlocking required such user-mode/kernel-mode switches: it had to block and wake threads, and manage entry to and exit from a blocking queue and a condition queue (we will come back to this later), and all of that clearly cannot be stored in those four bytes. JDK 1.6, however, introduced a number of optimizations to Synchronized, including lock upgrading, which made the statement partially true.

The lock upgrade process

The statement above is only partially true because the virtual machine team optimized Synchronized in JDK 1.6. We will not cover every optimization here, as they are well documented in many concurrent programming books; the most important one for our purposes is the lock upgrade.

Synchronized locks in Java are upgraded along the following path: no lock > biased lock > lightweight lock > heavyweight lock (mutex).

In other words, Synchronized only falls back to the pre-JDK 1.6 heavyweight mutex when there is serious multi-threaded contention for the lock.

We know that in the real world it is we who lock and unlock; in the program world it is the thread that plays that role.

Biased locking

At the very beginning the lock is in the unlocked state; think of the treasure house with its door unlocked. Then the first thread reaches the synchronized code region (the first person walks up to the door) and a biased lock is applied. What does the lock look like now? It is rather like a face-recognition lock: the first thread that enters the synchronized block is itself the key, and the thread ID that uniquely identifies it is saved in the Mark Word.

The Mark Word at this time reads as follows:

Here, 23 of the 32 bits store the thread ID of the first thread that acquired the biased lock, 2 bits hold the epoch (the validity of the bias), 4 bits hold the object's age, 1 bit marks whether the lock is biased (1 means yes), and the last 2 bits are the lock flag (01 means biased lock).

When the first thread reaches the Synchronized block, it checks the header of the object that Synchronized locks on (one of the three kinds of objects listed earlier). If the thread ID in that header is empty and the bias is still valid, the lock is effectively still free (the treasure house is still unlocked), so the thread uses CAS to replace the thread ID in the object's Mark Word with its own. If the replacement succeeds, the thread has obtained the biased lock and can safely execute the synchronized code. If the same thread later enters the synchronized block again, and no other thread has contended for the bias in the meantime, it only needs to compare its own thread ID with the one in the Mark Word; if they match, it walks straight into the synchronized region, so the performance cost is very small.
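
This is not how HotSpot actually implements it, of course, but the fast path can be modeled very roughly in plain Java: treat the “thread ID slot” of the Mark Word as an atomic variable and try to CAS your own ID into it. The class and field names below are made up purely for illustration:

import java.util.concurrent.atomic.AtomicLong;

// A toy model of the biased-lock fast path, not the real HotSpot implementation.
public class BiasedLockSketch
{
    // 0 means no thread has biased this lock yet (the treasure house is still unlocked).
    private final AtomicLong biasedThreadId = new AtomicLong(0);

    public boolean tryEnter()
    {
        long me = Thread.currentThread().getId();
        long owner = biasedThreadId.get();
        if (owner == me)
        {
            return true;  // Already biased to us: comparing IDs is enough, walk straight in.
        }
        if (owner == 0)
        {
            // Still unbiased: try to CAS our thread ID into the "Mark Word".
            return biasedThreadId.compareAndSet(0, me);
        }
        return false;     // Biased to another thread: revocation and upgrade would be needed.
    }
}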

Biased locking rests on an observation by the HotSpot team: in the vast majority of cases a lock is not contended and is repeatedly acquired by the same thread. In that situation, introducing the biased lock pays off handsomely!

If, on the other hand, that situation does not hold, meaning the lock is heavily contended, or is usually acquired by several threads in turn, then the biased lock is of little use.

Lightweight lock

By now we can see what form the biased lock takes at the beginning. We also said that the biased lock only survives while no other thread competes for the lock; in a highly concurrent environment, however, contention is unavoidable, and Synchronized starts down its road of promotion.

Once multiple threads compete for the lock, the simple biased lock is no longer safe enough: it can no longer really lock anything, so it must change and be upgraded to a more robust lock. The upgrade at this point happens in roughly two steps: (1) revoke the biased lock, (2) upgrade to a lightweight lock.

First, how is the biased lock revoked? We said that the biased lock is really just the thread ID in the Mark Word, so changing the Mark Word effectively revokes the bias. The next question is: the biased lock was represented by a thread ID, so what should the lightweight lock be represented by? The answer is the Lock Record.

Let me explain:

We know that the JVM's memory can be divided into (1) the heap, (2) the virtual machine stack, (3) the native method stack, (4) the program counter, (5) the method area, and (6) direct memory. The program counter and the virtual machine stack are thread-private: each thread has its own stack space, which makes the stack a convenient place to record which thread acquired the lock, and that is exactly what the JVM does.

First, the JVM creates a small block of memory in the current thread's stack frame, called a Lock Record, and copies the Mark Word's contents into it. Why save the old contents? Because we are about to overwrite the Mark Word, and we will need to restore it when the lock is released. And how is the Mark Word replaced? With CAS, of course! The Mark Word now reads as follows:

As you can see, the Mark Word now uses 30 bits to store a pointer to the Lock Record we just created in the stack frame, and the 2-bit lock flag is 00, indicating a lightweight lock. With that pointer in place, it is easy to tell which thread acquired the lightweight lock.

The lightweight lock rests on another fact: when two or more threads compete for a lock, in the vast majority of cases the thread holding the lock will release it very soon. In other words, when contention is light, the lock is usually held only briefly, so a thread waiting for the lock does not have to switch between user mode and kernel mode to block itself; it just loops idly for a while (this is called spinning), expecting the lock holder to release the lock during the spin.

Lightweight locks are clearly suited to situations where contention is light and the lock is held only for a short time. If the lock is heavily contended, or a thread holds it for a long time after acquiring it, the waiting threads burn CPU by spinning (busy-looping) in vain.
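
The spinning idea itself is easy to sketch in plain Java. The toy spin lock below (again, not HotSpot's lightweight lock, just an illustration of bounded spinning) busy-waits for a limited number of rounds instead of blocking; the spin limit is an arbitrary number chosen for the example:

import java.util.concurrent.atomic.AtomicReference;

// A toy illustration of bounded spinning, not HotSpot's lightweight lock.
public class SpinLockSketch
{
    private static final int MAX_SPINS = 1000;  // arbitrary limit; HotSpot tunes spinning adaptively
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    public boolean tryLockWithSpin()
    {
        Thread me = Thread.currentThread();
        for (int i = 0; i < MAX_SPINS; i++)
        {
            if (owner.compareAndSet(null, me))
            {
                return true;        // Grabbed the lock while spinning.
            }
            Thread.onSpinWait();    // Hint to the CPU that we are busy-waiting (JDK 9+).
        }
        return false;               // Give up; a real JVM would now inflate to a heavyweight lock.
    }

    public void unlock()
    {
        owner.compareAndSet(Thread.currentThread(), null);
    }
}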

Heavyweight mutex

When too many people are trying to get into the treasure house and the lightweight lock is no longer enough, the only option left is the heavyweight mutex. This was also the default implementation of Synchronized before JDK 1.6.

With a lightweight lock, a thread spins while it waits for the lock holder to release the lock and then tries to acquire it, but that leaves two problems:

  1. There may be many spinning threads, i.e. many threads waiting for the current holder to release the lock. Since the lock can be held by only one thread at a time (as far as Synchronized is concerned), most of them will fail to acquire it, and they cannot just keep spinning forever, can they?
  2. If the thread holding the lock does not release it for a long time, the waiting threads end up spinning for a long time to no effect.

In either case the threads waiting for the lock have a miserable time, and it is worse still when both apply: the lock is hotly contended and the holder does not release it for a long time. The JVM therefore sets a spin limit: if a thread has spun a certain number of times and still has not acquired the lock, the lock is considered contended, and the thread requests that the lightweight lock be revoked and upgraded to a heavyweight mutex.

Under a lightweight lock, the Lock exists in the form of a Lock Record; what form does it take under a heavyweight lock?

The heavyweight lock is the most complex of the three: the thread holding the lock has to wake up blocked waiters when it releases the lock, a thread that fails to acquire the lock has to enter a blocking area and wait there, and on top of that the wait/notify condition mechanism has to be handled. So the heavyweight lock needs an extra weapon: the Monitor.

The Art of Concurrent Programming in Java describes it as follows:

The JVM implements method synchronization and code-block synchronization by entering and exiting Monitor objects, though the implementation details differ. Code-block synchronization is implemented with the monitorenter and monitorexit instructions, while method synchronization is implemented differently; the JVM specification does not spell out the details, but method synchronization could also be achieved with these two instructions.

The monitorenter instruction is inserted at the start of the synchronized block after compilation, while monitorexit is inserted at the end of the block and at the exception exit points; the JVM guarantees that every monitorenter is paired with a monitorexit. Any object has a Monitor associated with it, and when that Monitor is held, the object is locked. When a thread executes monitorenter, it tries to acquire ownership of the object's Monitor, that is, the object's lock.
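
As a rough illustration, the synchronized block in testMethod3 compiles down to bytecode that brackets the protected region with these two instructions (you can verify the exact instructions yourself with javap -c):

public class MonitorDemo
{
    private final Object obj = new Object();

    public void testMethod3()
    {
        synchronized (obj)  // the compiler emits a monitorenter here: acquire obj's Monitor
        {
            // Synchronized code
        }                   // the compiler emits a monitorexit here, plus one on the exception path
    }
}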

The HotSpot virtual machine is implemented in C++, itself an object-oriented language with support for polymorphism, so the virtual machine designers chose to represent the lock as an object as well. In the VM, Monitor is implemented by ObjectMonitor; the relationship between Monitor and ObjectMonitor is roughly that between Map and HashMap in Java.

Let’s take a look at ObjectMonitor in action:

  ObjectMonitor()
  {
    _header       = NULL;
    _count        = 0;        // Number of times threads have acquired the lock
    _waiters      = 0,
    _recursions   = 0;        // Number of times the lock has been re-entered
    _object       = NULL;
    _owner        = NULL;     // Points to the thread that holds this ObjectMonitor
    _WaitSet      = NULL;     // The set of threads in the wait state (after calling wait)
    _WaitSetLock  = 0 ;
    _Responsible  = NULL ;
    _succ         = NULL ;
    _cxq          = NULL ;
    FreeNext      = NULL ;
    _EntryList    = NULL ;    // The set of threads blocked waiting to acquire the lock
    _SpinFreq     = 0 ;
    _SpinClock    = 0 ;
    OwnerIsThread = 0 ;
  }

It is highly recommended that you read the source code of ReentrantLock, which is built on AQS (AbstractQueuedSynchronizer): the synchronizer inside ReentrantLock is essentially a miniature of the Monitor used in the Synchronized implementation.
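
For comparison, here is a minimal ReentrantLock sketch; it is the explicit, API-level counterpart of what Synchronized does for you implicitly:

import java.util.concurrent.locks.ReentrantLock;

public class ReentrantLockDemo
{
    private final ReentrantLock lock = new ReentrantLock();
    private int counter;

    public void increment()
    {
        lock.lock();        // plays roughly the role monitorenter plays for Synchronized
        try
        {
            counter++;      // critical section
        }
        finally
        {
            lock.unlock();  // plays roughly the role monitorexit plays, even on the exception path
        }
    }
}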

When a thread acquires the lock, ObjectMonitor::enter() is called to enter the synchronized block, and once the lock is obtained, owner is set to point to the current thread. When another thread tries to acquire the lock, it looks at owner in the ObjectMonitor and checks whether it already points to itself; if so, recursions and count are incremented by 1, meaning the thread has re-entered the lock. Otherwise the thread has to block, and where do blocked threads go? Into the EntryList. When the thread holding the lock calls wait (we know that wait makes a thread give up the CPU, release the lock it holds, and block and suspend itself until another thread calls notify or notifyAll), it releases the lock, sets owner back to null, wakes up the threads in the EntryList that are blocked waiting for the lock, then suspends itself and waits in the WaitSet. When another thread that holds the lock later calls notify or notifyAll, one thread (or all of them, for notifyAll) is moved from the WaitSet to the EntryList to compete for the lock again. When the thread wants to release the lock, ObjectMonitor::exit() is called to leave the synchronized block. Put this together with the description in The Art of Concurrent Programming in Java and everything falls into place.
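
From the application's point of view, this WaitSet/EntryList dance is exactly what happens when you use wait and notify inside a synchronized block. A minimal sketch:

public class WaitNotifyDemo
{
    private final Object lock = new Object();
    private boolean ready = false;

    public void waitUntilReady() throws InterruptedException
    {
        synchronized (lock)           // acquire the Monitor (via the EntryList if contended)
        {
            while (!ready)
            {
                lock.wait();          // release the Monitor and park in the WaitSet
            }
            // Woken by notify/notifyAll, Monitor re-acquired, condition re-checked.
        }
    }

    public void markReady()
    {
        synchronized (lock)
        {
            ready = true;
            lock.notifyAll();         // move waiting threads from the WaitSet back to the EntryList
        }
    }
}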

Upgrading to a heavyweight lock again takes two steps: (1) revoke the lightweight lock, (2) inflate to a heavyweight lock.

To revoke the lightweight Lock, the contents saved in the Lock Record in the stack frame are written back into the Mark Word, and the Lock Record is then removed from the frame. Next, an ObjectMonitor object is created and the Mark Word's contents are saved into it (so that the Mark Word can be restored from the ObjectMonitor when the lock is released). How do we then find this ObjectMonitor object? The Mark Word is updated to hold a pointer to it. And how is the Mark Word modified and replaced? You already know: CAS!

With the lock in heavyweight-mutex form, the Mark Word reads as follows:

You can see that the Mark Word uses 30 bits to hold a pointer to ObjectMonitor, and the lock marker bit is 10, indicating a heavyweight lock.

The heavyweight lock rests on the opposite fact: when a lock is seriously contended, or is held for a long time, a thread waiting for it should block and suspend itself and be woken up when the lock is released, rather than waste CPU spinning in vain.

How the lock's form changes

Now we can answer the question “What does a lock look like in Java?”: in different lock states, the lock takes a different form.

When the lock exists as a biased lock, the lock is the thread ID in the Mark Word, and the thread itself is the key: whichever thread's “ID card” is stored in the Mark Word owns the lock.

When the lock exists as a lightweight lock, the lock is the Lock Record in the stack frame that the Mark Word points to. The key, in this case, is the stack itself: whichever thread has the Lock Record in its own virtual machine stack owns the lock.

When the lock exists as a heavyweight lock, the lock is the ObjectMonitor, the C++ implementation of Monitor. The key here is the ObjectMonitor's owner: whichever thread owner points to owns the lock.

Earlier we said that on a 32-bit virtual machine the Mark Word has only four bytes, and that the claim “the lock fits entirely within those four bytes” was completely untrue before JDK 1.6 and only partially true afterwards. Does that sentence make more sense now?

In the real world, we are the ones who lock and unlock. In the programming world, who is the one who locks and unlocks? Yes, threads.

With that, it is easy to look back and answer the questions about Synchronized we raised at the start!

About CAS

Although Synchronized has gone through a series of optimizations and performs far better than it originally did, businesses increasingly pursue low latency and high responsiveness, and optimistic concurrency control, with CAS as its representative, is becoming more and more popular. CAS works very well for non-blocking atomic replacement; interestingly, as we have just seen, the Synchronized upgrade process itself leans heavily on CAS for non-blocking modification and replacement of the Mark Word, which makes it well worth studying in its own right.
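
In everyday Java code, the most direct way to use CAS is through the java.util.concurrent.atomic classes. A minimal sketch of an optimistic, non-blocking update:

import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter
{
    private final AtomicInteger value = new AtomicInteger(0);

    public int increment()
    {
        while (true)
        {
            int current = value.get();
            int next = current + 1;
            // CAS succeeds only if nobody changed the value in between; otherwise we loop and retry.
            if (value.compareAndSet(current, next))
            {
                return next;
            }
        }
    }
}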


Thank you for your patience in reading this far. I hope this article helps you in your study of locks!

If you found it helpful, please give it a thumbs-up. Your support and encouragement are the driving force behind my writing!