This article is part of the Java Concurrent Programming series. If you want to learn more, check out the Java Concurrent Programming Overview.

Preface

In the last article we looked at the volatile keyword, a lightweight synchronization mechanism. Now we turn to its close relative, synchronized. As one of the most common synchronization mechanisms in everyday development, synchronized is also the usual way we achieve thread safety. You are probably familiar with how to use it, but you may know less about its internal principles and the underlying implementation. In this article we will work through both the usage of synchronized and how it works under the hood.

Thread safety issues

Definition of thread safety: a class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of how the runtime environment schedules or interleaves the execution of those threads, and without any additional synchronization or coordination in the calling code.

Before getting into the details of synchronized, we need to understand what thread safety is and why thread-safety problems occur. Look at the following code:

class ThreadNotSafeDemo {
    private static class Count {
        private int num;
        private void count() {
            for (int i = 1; i <= 10; i++) {
                num += i;
            }
            System.out.println(Thread.currentThread().getName() + "-" + num);
        }
    }
    public static void main(String[] args) {
        Runnable runnable = new Runnable() {
            Count count = new Count();
            public void run() { count.count(); }
        };
        // Create 10 threads that all share the same Runnable (and therefore the same Count)
        for (int i = 0; i < 10; i++) {
            new Thread(runnable).start();
        }
    }
}

In the code above, we create a Count class with a count() method that adds the numbers from 1 to 10 to num and then prints the name of the current thread and the result. Because all ten threads share the same Count instance, we expect the output to be an arithmetic sequence that starts at 55 and increases by 55 each time (55, 110, 165, ...). But the results were not what we expected. Specific results are shown in the figure below:

We can see that the threads do not run in the order we expected (Thread-0 through Thread-9), and that the output of Thread-0 and Thread-1 is wrong.

The reason is as follows. Thread-0 finishes its loop (num = 55) and is about to execute the print statement when the CPU schedules Thread-1 to execute count(), so Thread-0 pauses just before printing. Thread-1 performs its sum, which brings num to 110. The CPU then schedules Thread-0 again while Thread-1 pauses. Since num is already 110, Thread-0 prints 110.

Thread-safe implementation

As we have seen above, the main cause of thread-safety problems is that multiple threads operate on shared data and the CPU may interleave their execution. This changes the semantics of the program, so the results no longer match our expectations. Java offers two ways to handle this situation.

Mutex synchronization (pessimistic locking)

When multiple threads operate on shared data, ensure that only one thread works on the shared data at any given time; the other threads wait until that thread has finished processing the data.

The most basic form of mutex synchronization in Java is synchronized. Once the thread currently accessing the shared data has locked it, other threads can only wait until that thread finishes processing and releases the lock.
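As a minimal sketch of this idea, the counter from the earlier example can be made safe by marking the summing method synchronized and returning the value while the lock is still held. The class name SynchronizedCounterDemo and its methods are hypothetical names chosen for illustration, not part of the original demo.

```java
// Illustrative sketch: SynchronizedCounterDemo and its methods are hypothetical names.
class SynchronizedCounterDemo {
    private int num;

    // synchronized: only one thread at a time may run this on the same instance.
    private synchronized int addOneToTen() {
        for (int i = 1; i <= 10; i++) {
            num += i;
        }
        return num; // read while still holding the lock
    }

    // Runs `threads` workers against one shared counter and returns the final total.
    static int runDemo(int threads) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> System.out.println(
                    Thread.currentThread().getName() + "-" + counter.addOneToTen()));
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join() also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.num;
    }

    public static void main(String[] args) {
        System.out.println("final total = " + runDemo(10)); // 10 * 55 = 550
    }
}
```

Each thread now prints a distinct multiple of 55, although the order in which the threads run is still up to the scheduler.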

In addition to synchronized, we can use ReentrantLock from the java.util.concurrent.locks package for synchronization.
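The same counter can be sketched with ReentrantLock. Unlike synchronized, the lock must be released explicitly, so the usual pattern is lock() followed by try/finally. The class and method names below are assumptions made for this example.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: ReentrantLockCounterDemo is a hypothetical name.
class ReentrantLockCounterDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int num;

    private int addOneToTen() {
        lock.lock();
        try {
            for (int i = 1; i <= 10; i++) {
                num += i;
            }
            return num;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    // Runs `threads` workers against one shared counter and returns the final total.
    static int runDemo(int threads) {
        ReentrantLockCounterDemo counter = new ReentrantLockCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> counter.addOneToTen());
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.num;
    }

    public static void main(String[] args) {
        System.out.println("final total = " + runDemo(10)); // 10 * 55 = 550
    }
}
```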

Non-blocking synchronization (Optimistic locking)

The main problem with mutex synchronization is the performance cost of blocking and waking up threads. Non-blocking synchronization addresses this problem: when multiple threads compete for shared data, a thread that loses the competition does not block but keeps retrying until it succeeds. This scheme is implemented with CAS operations in a loop.
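A CAS retry loop can be sketched with AtomicInteger.compareAndSet from java.util.concurrent.atomic. The add() helper below is a hypothetical name written out by hand to make the loop visible; in practice AtomicInteger.addAndGet already does the same job.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: CasCounterDemo and add() are hypothetical names.
class CasCounterDemo {
    private final AtomicInteger num = new AtomicInteger();

    // Adds delta without blocking: re-read and retry until the CAS succeeds.
    private int add(int delta) {
        while (true) {
            int current = num.get();
            int next = current + delta;
            if (num.compareAndSet(current, next)) {
                return next; // nobody changed num between get() and compareAndSet()
            }
            // Another thread won the race; loop and try again with the fresh value.
        }
    }

    // Runs `threads` workers, each adding 1..10, and returns the final total.
    static int runDemo(int threads) {
        CasCounterDemo counter = new CasCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 1; i <= 10; i++) {
                    counter.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.num.get();
    }

    public static void main(String[] args) {
        System.out.println("final total = " + runDemo(10)); // 10 * 55 = 550
    }
}
```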

Three ways to use synchronized

Given the problems synchronized solves, let’s move on to the use of synchronized in Java.

synchronized is used in Java in three main ways:

  • On an ordinary instance method: the lock is the current instance object.
  • On a static method: the lock is the Class object of the current class.
  • On a code block: the lock is the object specified in the parentheses of synchronized.
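The three forms can be checked directly with Thread.holdsLock, which reports whether the current thread holds the monitor of a given object. This is a small sketch with hypothetical class, method, and field names (SynchronizedFormsDemo, guard), separate from the proofs that follow.

```java
// Illustrative sketch: all names here are hypothetical.
class SynchronizedFormsDemo {
    private final Object guard = new Object();

    // 1. Instance method: the lock is the current instance (this).
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // 2. Static method: the lock is the Class object of the class.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(SynchronizedFormsDemo.class);
    }

    // 3. Code block: the lock is the object named in the parentheses,
    //    here `guard`, and not the instance itself.
    boolean blockHoldsGuardOnly() {
        synchronized (guard) {
            return Thread.holdsLock(guard) && !Thread.holdsLock(this);
        }
    }

    public static void main(String[] args) {
        SynchronizedFormsDemo demo = new SynchronizedFormsDemo();
        System.out.println(demo.instanceMethodHoldsThis());  // true
        System.out.println(staticMethodHoldsClass());        // true
        System.out.println(demo.blockHoldsGuardOnly());      // true
    }
}
```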

Proof that an ordinary synchronized method locks the current instance object

To prove that the lock of an ordinary synchronized method is the current object, observe the following code:

class SynchronizedDemo {

    public synchronized void normalMethod() {
        doPrint(5);
    }
 
    public void blockMethod() {
        synchronized (this) { // note: the synchronized block locks on the current instance
            doPrint(5);
        }
    }

    // Prints the current thread name and the index value
    private static void doPrint(int index) {
        while (index-- > 0) {
            System.out.println(Thread.currentThread().getName() + "->" + index);
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        SynchronizedDemo demo = new SynchronizedDemo();
        new Thread(() -> demo.normalMethod(), "testNormalMethod").start();
        new Thread(() -> demo.blockMethod(), "testBlockMethod").start();
    }
}

In the above code, two methods are created, normalMethod() and blockMethod(). normalMethod() is an ordinary synchronized method, and blockMethod() contains a synchronized block whose lock object is the current instance (this). In the main() method, two separate threads are created to execute the two methods.

Program output

Proof that a static synchronized method locks the Class object of the current class

class SynchronizedDemo {
    public void blockMethod() {
        // Note: the synchronized block locks on the Class object of the current class
        synchronized (SynchronizedDemo.class) {
            doPrint(5);
        }
    }
    public static synchronized void staticMethod() {
        doPrint(5);
    }

    /**
     * Prints information about the current thread
     */
    private static void doPrint(int index) {
        while (index-- > 0) {
            System.out.println(Thread.currentThread().getName() + "->" + index);
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
    public static void main(String[] args) {
        SynchronizedDemo demo = new SynchronizedDemo();
        new Thread(() -> demo.blockMethod(), "testBlockMethod").start();
        new Thread(() -> SynchronizedDemo.staticMethod(), "testStaticMethod").start();
    }
}

Having proved the first result, we will not describe the lock object of a static synchronized method in detail (but note that the object specified in the synchronized block is the Class object of the current class). The output is given directly below:

The output also clearly supports the conclusion that a static synchronized method locks the Class object of the current class.

The principle of Synchronized

The rest of this article focuses on the lock optimizations introduced in JDK 1.6, which involve biased locks, lightweight locks, and heavyweight locks. The discussion refers to the JDK source code throughout.

Before we can understand how synchronized works, we need three pieces of background: the first is the CAS operation, the second is the Java object header (the lock that synchronized uses is stored in the object header), and the third is the lock optimization introduced in JDK 1.6. Once these are understood, the principle is relatively easy to follow. The CAS operation was explained in the previous article “Java Concurrent Programming: Java CAS operation”, so here we will cover the Java object header and the lock optimizations.

Memory layout of Java objects

In a Java virtual machine, the layout of an object in memory can be divided into three areas: the object header, the instance data, and the padding. The object header contains three pieces of information: the “Mark Word”, the type pointer, and the array length (present only for arrays), as shown in the following figure:

Composition of Java object headers

  • “Mark Word”: The first part stores the runtime data of the object itself, such as its hash code, GC generational age, lock state flags, the lock held by a thread, the biased thread ID, and the bias timestamp. This part is 32 bits long on a 32-bit VM and 64 bits long on a 64-bit VM, and is officially called the “Mark Word”.
  • Type pointer: The second part of the object header is the type pointer, a pointer to the object’s class metadata, which the virtual machine uses to determine which class the object is an instance of.
  • Array length: If the object is a Java array, the object header must also contain a piece of data that records the length of the array (this part is absent if the object is not an array). The virtual machine can determine the size of an ordinary Java object from its metadata, but it cannot determine the size of an array from the array’s metadata alone.

Mark Word data structure

As for the “Mark Word”: object header information is a storage cost unrelated to the data defined by the object itself. For the sake of the virtual machine’s space efficiency, the “Mark Word” is therefore designed as a non-fixed data structure so that it can store as much information as possible in a very small space, and it reuses its own storage depending on the state of the object. In the JVM, the “Mark Word” is implemented by the markOopDesc class in the markOop.hpp file. Its comments give a rough picture of the structure of the “Mark Word”; the code is as follows:

// hash:        the object's hash code
// age:         GC generational age
// biased_lock: biased-lock flag
// lock:        lock state flag
// JavaThread*: the thread that currently holds the bias
// epoch:       bias timestamp
//
//  32 bits:
//  --------
//  hash:25 ------------>| age:4    biased_lock:1 lock:2 (normal object)
//  JavaThread*:23 epoch:2 age:4    biased_lock:1 lock:2 (biased object)
//  ... (part of the code omitted)
//
//  64 bits:
//  --------
//  unused:25 hash:31 -->| unused:1   age:4    biased_lock:1 lock:2 (normal object)
//  JavaThread*:54 epoch:2 unused:1   age:4    biased_lock:1 lock:2 (biased object)
//  ... (part of the code omitted)

The comments above describe the “Mark Word” layout for two VM word sizes, 32-bit and 64-bit. The markOopDesc class also defines the values of the lock state flags; the code is as follows:

enum { locked_value        = 0, // lightweight lock, corresponds to [00]
       unlocked_value      = 1, // unlocked, corresponds to [01]
       monitor_value       = 2, // heavyweight lock, corresponds to [10]
       marked_value        = 3, // GC mark, corresponds to [11]
       biased_lock_pattern = 5  // biased lock, corresponds to [101] (one biased_lock bit, two lock bits)
};

Based on the above code, we can draw up the following two tables for a 32-bit JVM:

The default “Mark Word” storage structure for a 32-bit JVM in a lock-free state

The default “Mark Word” storage structure for a 32-bit JVM in the locked state
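The bit layout and the lock state values above can be decoded with plain bit masks. The following is a sketch in Java that treats a Mark Word as an ordinary integer; MarkWordDecoder and its methods are hypothetical names, and the field extraction assumes the 32-bit "normal object" layout (hash:25, age:4, biased_lock:1, lock:2) quoted from markOop.hpp. It does not read real object headers.

```java
// Illustrative sketch: decodes an integer laid out like a Mark Word.
// It works on plain numbers and does not inspect real JVM object headers.
class MarkWordDecoder {
    // Interprets the low three bits using the enum values from markOop.hpp.
    static String lockState(long markWord) {
        if ((markWord & 0b111) == 0b101) {
            return "biased";               // biased_lock_pattern = 5
        }
        switch ((int) (markWord & 0b11)) {
            case 0:  return "lightweight"; // locked_value   = 0
            case 1:  return "unlocked";    // unlocked_value = 1
            case 2:  return "heavyweight"; // monitor_value  = 2
            default: return "gc-marked";   // marked_value   = 3
        }
    }

    // 32-bit normal object: hash:25 | age:4 | biased_lock:1 | lock:2
    static int age(int markWord) {
        return (markWord >>> 3) & 0xF;     // skip lock (2 bits) and biased_lock (1 bit)
    }

    static int hash(int markWord) {
        return markWord >>> 7;             // the remaining upper 25 bits
    }

    public static void main(String[] args) {
        // Build an unlocked mark word by hand: hash 0x1ABCDE, age 6, biased_lock 0, lock 01.
        int mark = (0x1ABCDE << 7) | (6 << 3) | 0b001;
        System.out.println(lockState(mark));                 // unlocked
        System.out.println(age(mark));                       // 6
        System.out.println(Integer.toHexString(hash(mark))); // 1abcde
    }
}
```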

Synchronized lock optimization

Java SE 1.6 introduced biased locks and lightweight locks to reduce the performance cost of acquiring and releasing locks. As of Java SE 1.6, a lock has four states, from lowest to highest: unlocked, biased, lightweight, and heavyweight. The state escalates as contention increases. Locks can be upgraded but not downgraded, which means that a lock cannot return to the biased state after being upgraded to a lightweight lock. The purpose of this policy is to improve the efficiency of acquiring and releasing locks. The various locks are described below.

  • Biased locking: In most cases a lock is not only free of multithreaded contention but is also acquired repeatedly by the same thread. Biased locking was introduced to make lock acquisition cheaper for that thread. When a thread enters a synchronized block and acquires the lock, the ID of the thread the lock is biased toward is stored in the “Mark Word” of the object header and in the lock record in the stack frame. Afterwards, the thread does not need CAS operations to lock and unlock when it enters and exits the synchronized block; it simply tests whether the “Mark Word” of the object header stores a bias pointing to the current thread. If the test succeeds, the thread already holds the lock. If the test fails, it checks whether the biased-lock flag in the “Mark Word” is set to 1 (indicating that the lock is currently a biased lock). If the flag is not set, the thread uses CAS to compete for the lock. If it is set, the thread tries to use CAS to point the bias in the object header at the current thread.
  • Lightweight locking: Before a thread executes the synchronized block, the JVM creates space for a lock record in the current thread’s stack frame and copies the Mark Word from the object header into the lock record; this copy is officially called the Displaced Mark Word. The thread then tries to use CAS to replace the Mark Word in the object header with a pointer to the lock record. If this succeeds, the current thread acquires the lock. If it fails, other threads are competing for the lock, and the current thread tries to acquire the lock by spinning.
  • Heavyweight locking: When a lightweight lock is released, an atomic CAS operation swaps the Displaced Mark Word back into the object header. If this succeeds, no contention occurred. If it fails, the lock is contended and inflates into a heavyweight lock, under which the competing threads are synchronized through a mutex.

Synchronized low-level code implementation

With that in mind, let’s look at how synchronized is implemented at the low level. The JVM specification describes how synchronized is implemented in the JVM: both method synchronization and code block synchronization are based on entering and exiting a monitor object, but the implementation details differ. Code block synchronization is implemented with the monitorenter and monitorexit instructions, while method synchronization is indicated by the ACC_SYNCHRONIZED access flag, the details of which are not specified in the JVM specification. However, method synchronization can also be achieved with these two instructions. Below we use synchronized code blocks to explain the underlying principles.

Principle of ACC_SYNCHRONIZED

The JVM supports method synchronization by using a monitor. The JVM can tell whether a method is declared synchronized from the ACC_SYNCHRONIZED access flag in the method table (method_info) structure of the method. When a method is called, the invoking instruction checks whether the method’s ACC_SYNCHRONIZED access flag is set. If so, the executing thread must successfully acquire the monitor before executing the method, and it releases the monitor when the method completes, whether normally or abnormally. While the method is executing, the executing thread holds the monitor, and no other thread can acquire the same monitor.
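The flag is visible from Java code through reflection: java.lang.reflect.Modifier.isSynchronized reports whether a method carries it, and Modifier.SYNCHRONIZED has the same value (0x0020) as ACC_SYNCHRONIZED in the class file format. The sketch below uses hypothetical names (AccSynchronizedDemo, syncMethod, blockMethod); note that a method which only contains a synchronized block does not carry the flag.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative sketch: class and method names are hypothetical.
class AccSynchronizedDemo {
    public synchronized void syncMethod() { }          // flagged ACC_SYNCHRONIZED

    public void blockMethod() { synchronized (this) { } } // uses monitorenter/monitorexit instead

    // Returns true if the named no-argument method carries the synchronized flag.
    static boolean isSynchronized(String methodName) {
        try {
            Method m = AccSynchronizedDemo.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isSynchronized("syncMethod"));   // true
        System.out.println(isSynchronized("blockMethod"));  // false
        System.out.println(Integer.toHexString(Modifier.SYNCHRONIZED)); // 20
    }
}
```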

Underlying principles of synchronized code blocks

Before we look at the underlying principles of synchronized code blocks, let’s look at a simple use of a synchronized code block.

public class SyncCodeBlock {
   public int i;
   public void syncTask() {
       // synchronized code block
       synchronized (this) {
           i++;
       }
   }
}

We then disassemble the bytecode using the javap command.

// =========== mainly look at the syncTask method ===========
  public void syncTask();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=3, locals=3, args_size=1
         0: aload_0
         1: dup
         2: astore_1
         3: monitorenter        // note here: enter the synchronized block
         4: aload_0
         5: dup
         6: getfield      #2    // Field i:I
         9: iconst_1
        10: iadd
        11: putfield      #2    // Field i:I
        14: aload_1
        15: monitorexit         // note here: exit the synchronized block
        16: goto          24
        19: astore_2
        20: aload_1
        21: monitorexit         // note here: exit the synchronized block
        22: aload_2
        23: athrow
        24: return
      Exception table:
      // other bytecode omitted ...
}

From the above code, we can see that when we declare a synchronized block, the compiler generates the corresponding monitorenter and monitorexit instructions. When the JVM executes the bytecode, these instructions are handled by two functions in the interpreterRuntime.cpp file, InterpreterRuntime::monitorenter and InterpreterRuntime::monitorexit, respectively:

  • InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem)
  • InterpreterRuntime::monitorexit(JavaThread* thread, BasicObjectLock* elem)

The first parameter is a pointer to the current thread. The second parameter is a pointer to the BasicObjectLock type.
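The bytecode listing earlier contains two monitorexit instructions: one on the normal path and one on the path that ends in athrow, so the monitor is released even when the block exits with an exception. A small sketch can confirm this from Java using Thread.holdsLock; the class name and the LOCK field are hypothetical.

```java
// Illustrative sketch: MonitorExitOnExceptionDemo and LOCK are hypothetical names.
class MonitorExitOnExceptionDemo {
    private static final Object LOCK = new Object();

    // Throws from inside a synchronized block and reports whether the
    // monitor has been released by the time the exception is caught.
    static boolean lockReleasedAfterException() {
        try {
            synchronized (LOCK) {
                throw new IllegalStateException("leave the block abnormally");
            }
        } catch (IllegalStateException e) {
            // The exceptional-path monitorexit ran before control reached this handler.
            return !Thread.holdsLock(LOCK);
        }
    }

    public static void main(String[] args) {
        System.out.println(lockReleasedAfterException()); // true
    }
}
```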

BasicObjectLock

The class declaration for BasicObjectLock is in the basicLock.hpp file.

class BasicLock {
  friend class VMStructs;
  friend class JVMCIVMStructs;
 private:
  // Holds the "Mark Word", i.e. the markOopDesc pointer mentioned above
  // (markOop is an alias for markOopDesc*)
  volatile markOop _displaced_header;
 public:
  // Get the "Mark Word"
  markOop displaced_header() const { return _displaced_header; }
  void set_displaced_header(markOop header) { _displaced_header = header; }
  // omit some code
};

class BasicObjectLock {
  friend class VMStructs;
 private:
  BasicLock _lock; // holds a BasicLock object
  oop       _obj;
  // omit some code
};

From this file, we can see that the BasicLock class holds a pointer to the “Mark Word”, and that BasicObjectLock holds a BasicLock object, so the contents of the “Mark Word” can be reached through a BasicObjectLock. Now let’s look at the two methods mentioned above.

InterpreterRuntime::monitorenter method

IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem))
  // omit part of the code
  if (UseBiasedLocking) { // check whether biased locking is enabled
    // If biased locking is enabled, take the fast path to avoid unnecessary inflation
    ObjectSynchronizer::fast_enter(h_obj, elem->lock(), true, CHECK);
  } else {
    ObjectSynchronizer::slow_enter(h_obj, elem->lock(), CHECK);
  }
  // omit some code

When InterpreterRuntime::monitorenter executes, it checks whether biased locking is currently enabled. If it is not enabled, the method goes straight to lightweight locking, that is, the slow_enter() method. (Biased locking is enabled by default in Java 6 and Java 7, but it is activated several seconds after the application starts. If necessary, the delay can be turned off with the JVM argument -XX:BiasedLockingStartupDelay=0. If you are sure that all locks in your application are normally contended, you can turn off biased locking with the JVM argument -XX:-UseBiasedLocking, in which case the program enters the lightweight locking state by default.)

Biased lock acquisition

The ObjectSynchronizer::fast_enter() method is declared in the synchronizer.cpp file:

void ObjectSynchronizer::fast_enter(Handle obj, BasicLock* lock,
                                    bool attempt_rebias, TRAPS) {
  if (UseBiasedLocking) { // if biased locking is enabled
    if (!SafepointSynchronize::is_at_safepoint()) {
      // Not at a safepoint: get the current state of the bias (may revoke or rebias)
      BiasedLocking::Condition cond = BiasedLocking::revoke_and_rebias(obj, attempt_rebias, THREAD);
      if (cond == BiasedLocking::BIAS_REVOKED_AND_REBIASED) {
        // Revoked and rebiased: return directly
        return;
      }
    } else {
      // At a safepoint: the bias may be revoked
      assert(!attempt_rebias, "can not rebias toward VM thread");
      BiasedLocking::revoke_at_safepoint(obj);
    }
  }
  slow_enter(obj, lock, THREAD); // if biased locking is not used, go on to lightweight lock acquisition
}

In this method, if biased locking is enabled in the current JVM, the code checks whether the VM is at a global safepoint (a point in time at which no bytecode is being executed). If it is not at a safepoint, the revoke_and_rebias() method is called to obtain the state of the current bias (which may be revoked, or revoked and then rebiased). If it is at a safepoint, revoke_at_safepoint() decides whether to revoke the bias according to the current state of the biased lock. The revoke_and_rebias() method is declared in biasedLocking.cpp.

BiasedLocking::revoke_and_rebias() method

BiasedLocking::Condition BiasedLocking::revoke_and_rebias(Handle obj, bool attempt_rebias, TRAPS) {
  assert(!SafepointSynchronize::is_at_safepoint(), "must not be called while at safepoint");
  markOop mark = obj->mark();
  // Step 1: no thread owns the object (the thread ID in the mark word is 0, the last
  // three bits are 101) and no rebias is attempted.
  // Note that the attempt_rebias passed in by fast_enter() is true.
  if (mark->is_biased_anonymously() && !attempt_rebias) {
    // Generally this branch is only entered when the object's hashCode is being computed,
    // so CAS is used directly to set the object to the unlocked state
    markOop biased_value       = mark;
    markOop unbiased_prototype = markOopDesc::prototype()->set_age(mark->age());
    markOop res_mark = obj->cas_set_mark(unbiased_prototype, mark); // CAS resets the state of the bias
    if (res_mark == biased_value) { // the CAS succeeded: the bias has been revoked
      return BIAS_REVOKED;
    }
  } else if (mark->has_bias_pattern()) {
    // Step 2: the object has the bias pattern (regardless of whether the thread ID
    // in the mark word is null); try to revoke or rebias
    Klass* k = obj->klass();
    markOop prototype_header = k->prototype_header();
    if (!prototype_header->has_bias_pattern()) {
      // Step 3: the prototype header of the class no longer has the bias pattern
      // (biasing has been revoked for the whole class), so revoke this object's bias
      markOop biased_value       = mark;
      markOop res_mark = obj->cas_set_mark(prototype_header, mark);
      assert(!obj->mark()->has_bias_pattern(), "even if we raced, should still be revoked");
      return BIAS_REVOKED;
    } else if (prototype_header->bias_epoch() != mark->bias_epoch()) {
      // Step 4: the bias epoch has expired
      if (attempt_rebias) {
        // Step 5: rebias is requested; update the thread, epoch and generational age through CAS
        assert(THREAD->is_Java_thread(), "");
        markOop biased_value       = mark;
        markOop rebiased_prototype = markOopDesc::encode((JavaThread*) THREAD, mark->age(), prototype_header->bias_epoch());
        markOop res_mark = obj->cas_set_mark(rebiased_prototype, mark);
        if (res_mark == biased_value) {
          return BIAS_REVOKED_AND_REBIASED; // revoked and then rebiased
        }
      } else {
        // Step 6: rebias is not requested; reset to the unlocked state through CAS, keeping the generational age
        markOop biased_value       = mark;
        markOop unbiased_prototype = markOopDesc::prototype()->set_age(mark->age());
        markOop res_mark = obj->cas_set_mark(unbiased_prototype, mark);
        if (res_mark == biased_value) {
          return BIAS_REVOKED; // the CAS succeeded: the bias has been revoked
        }
      }
    }
  }
  // omit some code...
}

Biased lock acquisition is implemented by the BiasedLocking::revoke_and_rebias method, which can be divided into six steps:

  1. First, check whether the object is anonymously biased (the thread ID in its “Mark Word” is null) and attempt_rebias is false. If both conditions hold, try to set the object to the unlocked state with a CAS operation. If the CAS succeeds, the bias has been revoked and BIAS_REVOKED is returned. If it fails, another thread is competing for the object and execution falls through to the omitted code.
  2. Second, check whether the object currently has the bias pattern (regardless of whether the thread ID in the Mark Word is null or not), and proceed to the third and fourth steps accordingly.
  3. Third, if the prototype header of the object’s class no longer has the bias pattern (biasing has been revoked for the whole class), the bias of this object is revoked.
  4. Fourth, check whether the bias epoch has expired, that is, whether the epoch in the object’s Mark Word differs from the epoch in the prototype header of its class. If so, proceed to the fifth and sixth steps.
  5. Fifth, if the epoch has expired and rebiasing is requested (attempt_rebias is true), a CAS operation updates the epoch, generational age, and thread ID. If the CAS succeeds, the bias has been revoked from its previous owner and rebiased to the current thread, and BIAS_REVOKED_AND_REBIASED is returned.
  6. Sixth, if the epoch has expired and rebiasing is not requested, a CAS operation resets the Mark Word to the unlocked state while preserving the generational age. If the CAS succeeds, the bias has been revoked and BIAS_REVOKED is returned.
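The branching in the numbered steps above can be summarized as a small decision function. The following Java sketch is only a model of the control flow in the C++ listing, not a port of it: the class, enum, and parameter names are hypothetical, each boolean stands in for one of the checks in the C++ code, and every CAS is assumed to succeed (in the listing each return inside `if (res_mark == biased_value)` is taken on CAS success). FALL_THROUGH stands for the code the article omits.

```java
// Illustrative model of the branches in revoke_and_rebias. All names are hypothetical,
// and every CAS in the original code is assumed to succeed.
class RevokeAndRebiasModel {
    enum Outcome { BIAS_REVOKED, BIAS_REVOKED_AND_REBIASED, FALL_THROUGH }

    static Outcome decide(boolean anonymouslyBiased,   // mark->is_biased_anonymously()
                          boolean hasBiasPattern,      // mark->has_bias_pattern()
                          boolean classAllowsBias,     // prototype_header->has_bias_pattern()
                          boolean epochExpired,        // prototype epoch != mark epoch
                          boolean attemptRebias) {
        if (anonymouslyBiased && !attemptRebias) {
            return Outcome.BIAS_REVOKED;                       // step 1
        } else if (hasBiasPattern) {                           // step 2
            if (!classAllowsBias) {
                return Outcome.BIAS_REVOKED;                   // step 3
            } else if (epochExpired) {                         // step 4
                return attemptRebias
                        ? Outcome.BIAS_REVOKED_AND_REBIASED    // step 5
                        : Outcome.BIAS_REVOKED;                // step 6
            }
        }
        return Outcome.FALL_THROUGH; // handled by the omitted code
    }

    public static void main(String[] args) {
        // fast_enter() passes attempt_rebias = true; an expired epoch leads to a rebias.
        System.out.println(decide(false, true, true, true, true)); // BIAS_REVOKED_AND_REBIASED
    }
}
```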

Bias lock revocation

When fast_enter() runs at a safepoint, it calls the revoke_at_safepoint() method to revoke the bias. This method is also declared in biasedLocking.cpp, and the code is as follows:

void BiasedLocking::revoke_at_safepoint(Handle h_obj) {
  assert(SafepointSynchronize::is_at_safepoint(), "must only be called while at safepoint");
  oop obj = h_obj();
  // Get the number of times the lock has been biased and revoked
  HeuristicsResult heuristics = update_heuristics(obj, false);
  if (heuristics == HR_SINGLE_REVOKE) { // a single revocation
    revoke_bias(obj, false, false, NULL, NULL);
  } else if ((heuristics == HR_BULK_REBIAS) ||
             (heuristics == HR_BULK_REVOKE)) { // revoked or rebiased many times
    bulk_revoke_or_rebias_at_safepoint(obj, (heuristics == HR_BULK_REBIAS), false, NULL);
  }
  }
  clean_up_cached_monitor_info();
}

Looking at the code, we can see that it takes a different approach depending on how many times the lock has been biased and revoked. We take revoke_bias() as an example; the code is as follows:

static BiasedLocking::Condition revoke_bias(oop obj, bool allow_rebias, bool is_bulk, JavaThread* requesting_thread, JavaThread** biased_locker) {
  // omit some code...
  uint age = mark->age();
  markOop   biased_prototype = markOopDesc::biased_locking_prototype()->set_age(age);
  markOop unbiased_prototype = markOopDesc::prototype()->set_age(age);
  JavaThread* biased_thread = mark->biased_locker();
  if (biased_thread == NULL) { // check whether the thread ID of the bias is NULL
    if (!allow_rebias) {
      // If rebiasing is not allowed, make the object unbiased
      obj->set_mark(unbiased_prototype);
    }
    // omit some code...
    return BiasedLocking::BIAS_REVOKED;
  }
  // Check whether the thread that the lock is biased toward is still alive
  bool thread_is_alive = false;
  if (requesting_thread == biased_thread) {
    thread_is_alive = true;
  } else {
    ThreadsListHandle tlh;
    thread_is_alive = tlh.includes(biased_thread);
  }
  if (!thread_is_alive) { // the thread that the lock is biased toward is not alive
    if (allow_rebias) {
      obj->set_mark(biased_prototype);   // if rebiasing is allowed, set the thread ID in the bias to null
    } else {
      obj->set_mark(unbiased_prototype); // otherwise set the object to the unlocked state, i.e. 01
    }
    return BiasedLocking::BIAS_REVOKED;
  }
  // The thread owning the bias is alive: walk its lock records and write the
  // needed displaced headers to the thread stack
  GrowableArray<MonitorInfo*>* cached_monitor_info = get_or_compute_monitor_info(biased_thread);
  BasicLock* highest_lock = NULL;
  for (int i = 0; i < cached_monitor_info->length(); i++) {
    MonitorInfo* mon_info = cached_monitor_info->at(i);
    if (oopDesc::equals(mon_info->owner(), obj)) {
      markOop mark = markOopDesc::encode((BasicLock*) NULL);
      highest_lock = mon_info->lock();
      highest_lock->set_displaced_header(mark); // write the displaced header to the thread stack
    }
    // omit some code...
  }
  if (highest_lock != NULL) {
    // Write the needed displaced header to the thread stack
    // omit some code...
    highest_lock->set_displaced_header(unbiased_prototype);
    // omit some code...
    obj->release_set_mark(markOopDesc::encode(highest_lock));
    // omit some code...
  } else {
    // Restore the header of the object to the unlocked or unbiased state
    // omit some code...
    if (allow_rebias) {
      obj->set_mark(biased_prototype);
    } else {
      // Store the unlocked value into the object's header.
      obj->set_mark(unbiased_prototype);
    }
  }
  // Return the thread that held the bias, if requested
  if (biased_locker != NULL) {
    *biased_locker = biased_thread;
  }
  return BiasedLocking::BIAS_REVOKED;
}

Revoking a biased lock requires waiting for a global safepoint (a point in time at which no bytecode is executing). The thread that holds the biased lock is first suspended, and the VM then checks whether that thread is still alive. If the thread is not alive, the object header is set to the unlocked state. If the thread is still alive, the stack that holds the biased lock is walked and the lock records of the biased object are traversed; the lock records in the stack and the Mark Word in the object header are then either rebiased toward another thread, restored to the unlocked state, or marked so that the object is no longer suitable for biased locking. Finally, the suspended thread is woken up.

Lightweight lock acquisition

As described above, when monitorenter executes, the lightweight lock is acquired directly if biased locking is not enabled, or if multiple threads compete for a biased lock and it is upgraded to a lightweight lock. Before looking at the code, we need to understand the “Displaced Mark Word” that lightweight locks rely on.

Lightweight locks and the Displaced Mark Word

If the synchronization object is not locked (the lock flag is 01) when the code is about to enter the synchronized block and acquire the lightweight lock, the JVM first creates a space called the Lock Record in the current thread’s stack frame. It is used to store a copy of the lock object’s current “Mark Word” (officially, this copy is given the prefix Displaced, that is, the Displaced Mark Word). The virtual machine then uses a CAS operation to try to update the object’s “Mark Word” to a pointer to the Lock Record. If the update succeeds, the thread owns the lock on the object, and the object is in the lightweight locked state. The lightweight lock acquisition process is shown in the following diagram:

ObjectSynchronizer::slow_enter() method

Now that we have seen the lightweight lock acquisition process, let’s look at the implementation of the slow_enter() method. This method is declared in the synchronizer.cpp file. The code is as follows:

void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
  markOop mark = obj->mark(); // Step 1: get the "Mark Word" of the lock object
  if (mark->is_neutral()) {   // Step 2: check whether the object is in the unlocked state
    // Step 3: store a copy of the current "Mark Word" in the lock record
    lock->set_displaced_header(mark);
    if (mark == obj()->cas_set_mark((markOop) lock, mark)) {
      TEVENT(slow_enter: release stacklock);
      // If the update succeeds, the lock has been acquired
      return;
    }
  }
  // Step 4: if the object is already locked and the lock record lives in the current
  // thread's stack, the current thread owns the lock and the synchronized code is executed
  else if (mark->has_locker() &&
           THREAD->is_lock_owned((address)mark->locker())) {
    lock->set_displaced_header(NULL);
    return;
  }
  // Step 5: otherwise multiple threads are competing for the lightweight lock,
  // and it needs to be inflated into a heavyweight lock
  lock->set_displaced_header(markOopDesc::unused_mark());
  ObjectSynchronizer::inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD);
}

Lightweight lock acquisition consists of five main steps:

  1. Step 1: markOop mark = obj->mark() gets the “Mark Word” of the lock object.
  2. Step 2: mark->is_neutral() checks whether the object is in the unlocked state, that is, whether the biased_lock bit is 0 and the lock bits are 01.
  3. Step 3: If the object is unlocked, store a copy of its current “Mark Word” in the Lock Record and try to update the “Mark Word” of the lock object to a pointer to the Lock Record through CAS. If the update succeeds, the current thread has acquired the lock and goes on to execute the synchronized code.
  4. Step 4: If the object is already locked and the Lock Record that its “Mark Word” points to lives in the current thread’s stack, the current thread already owns the lock and the synchronized code is executed directly.
  5. Step 5: If none of the above applies, multiple threads are competing for the lightweight lock, and the lightweight lock needs to be inflated into a heavyweight lock.

Suppose two threads, A and B, enter the if (mark->is_neutral()) branch at the same time:

  1. Threads A and B each copy the Mark Word into their own Lock Record. This data lives in the thread's stack frame and is private to that thread.
  2. The CAS operation guarantees that only one thread can write the pointer to its Lock Record into the Mark Word. Assume thread A succeeds: it returns and goes on to execute the synchronized block.
  3. Thread B fails the CAS, leaves the if branch, and starts inflating the lock through ObjectSynchronizer::inflate (the lightweight lock is inflated into a heavyweight lock).
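
The race between threads A and B can be imitated in plain Java. The following is only a toy sketch under stated assumptions, not HotSpot code: ThinLockSketch, LockRecord, UNLOCKED, tryStackLock and race are hypothetical names, and an AtomicReference stands in for the Mark Word in the object header.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Toy model of lightweight lock acquisition; every name here is hypothetical.
final class ThinLockSketch {
    // Stands in for the Lock Record in a thread's stack frame.
    static final class LockRecord {
        Object displacedMarkWord; // copy of the Mark Word saved before the CAS
    }

    // Sentinel standing in for an unlocked ("neutral") Mark Word.
    static final Object UNLOCKED = new Object();

    // Stands in for the Mark Word of the lock object.
    private final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    // Returns true if the caller won the lightweight lock,
    // false if the real JVM would have to inflate the lock.
    boolean tryStackLock(LockRecord record) {
        Object mark = markWord.get();
        if (mark != UNLOCKED) {
            return false;
        }
        record.displacedMarkWord = mark;             // save the copy first
        return markWord.compareAndSet(mark, record); // only one thread can win
    }

    // Threads A and B race for the same lock; returns how many of them won.
    static int race() {
        ThinLockSketch lock = new ThinLockSketch();
        AtomicInteger winners = new AtomicInteger();
        Runnable task = () -> {
            if (lock.tryStackLock(new LockRecord())) {
                winners.incrementAndGet();
            }
        };
        Thread a = new Thread(task, "A");
        Thread b = new Thread(task, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return winners.get();
    }
}
```

Because nobody releases the lock in this sketch, exactly one of the two threads wins the CAS. The loser corresponds to thread B, which the real JVM sends on to inflation.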

Release of lightweight locks

As mentioned earlier, a thread executes the monitorexit instruction when it leaves the synchronized block. The lightweight lock is released while monitorexit executes, that is, in InterpreterRuntime::monitorexit().

IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorexit(JavaThread* thread, BasicObjectLock* elem))
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
  Handle h_obj(thread, elem->obj());
  assert(Universe::heap()->is_in_reserved_or_null(h_obj()),
         "must be NULL or an object");
  if (elem == NULL || h_obj()->is_unlocked()) {
    THROW(vmSymbols::java_lang_IllegalMonitorStateException());
  }
  ObjectSynchronizer::slow_exit(h_obj(), elem->lock(), thread);
  // Free entry. This must be done here, since a pending exception might be installed on
  // exit. If it is not cleared, the exception handling code will try to unlock the monitor again.
  elem->set_obj(NULL);
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
IRT_END

The monitorexit() method calls slow_exit() internally, and slow_exit() in turn calls fast_exit(). Let us look at the fast_exit() method.

void ObjectSynchronizer::fast_exit(oop object, BasicLock* lock, TRAPS) {
  markOop mark = object->mark();
  // omit some code...

  // Read the Displaced Mark Word stored in the thread stack
  markOop dhw = lock->displaced_header();
  if (dhw == NULL) {
    // A NULL Displaced Mark Word means this exit matches a reentrant enter,
    // so there is nothing to restore
#ifndef PRODUCT
    if (mark != markOopDesc::INFLATING()) {
      // Only run diagnostics when the lock is not being inflated right now
      // omit some code...
      if (mark->has_monitor()) {
        ObjectMonitor * m = mark->monitor();
        assert(((oop)(m->object()))->mark() == mark, "invariant");
        assert(m->is_entered(THREAD), "invariant");
      }
    }
#endif
    return;
  }

  // If the current thread holds the lightweight lock, use CAS to write the
  // Displaced Mark Word back into the Mark Word of the lock object
  if (mark == (markOop) lock) {
    assert(dhw->is_neutral(), "invariant");
    if (object->cas_set_mark(dhw, mark) == mark) {
      TEVENT(fast_exit: release stack-lock);
      return;
    }
  }

  // The CAS failed: another thread is trying to acquire the lock, so the
  // lightweight lock is inflated into a heavyweight lock and then released
  ObjectSynchronizer::inflate(THREAD, object, inflate_cause_vm_internal)->exit(true, THREAD);
}

Releasing a lightweight lock involves the following steps:

  1. Get the Displaced Mark Word stored in the thread stack.
  2. If the Displaced Mark Word is NULL, this exit matches a reentrant enter and there is nothing to restore, so return directly (debug builds only run some sanity checks here, for example when the lock has already been inflated into a heavyweight lock).
  3. If the current thread holds the lightweight lock, use CAS to write the Displaced Mark Word back into the Mark Word of the lock object. If the CAS succeeds, the lock is released.
  4. If the CAS fails, another thread is trying to acquire the lightweight lock. In this case the lightweight lock has to be inflated into a heavyweight lock.
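
The release steps can be sketched in Java in the same spirit. This is an illustrative model only, with hypothetical names (ThinUnlockSketch, LockRecord, Exit, enter, exit); it assumes a null displaced value marks a reentrant enter, as the NULL Displaced Mark Word does in fast_exit().

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of lightweight lock release; every name here is hypothetical.
final class ThinUnlockSketch {
    enum Exit { RECURSIVE, RELEASED, INFLATE }

    // Stands in for the Lock Record in a thread's stack frame.
    static final class LockRecord {
        final Thread owner = Thread.currentThread(); // thread whose stack holds the record
        Object displaced;                            // null marks a reentrant enter
    }

    // Sentinel standing in for an unlocked Mark Word.
    static final Object UNLOCKED = new Object();

    private final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    // Minimal enter, only here so that exit() has something to undo.
    boolean enter(LockRecord record) {
        Object mark = markWord.get();
        if (mark == UNLOCKED) {
            record.displaced = mark;
            return markWord.compareAndSet(mark, record);
        }
        if (mark instanceof LockRecord
                && ((LockRecord) mark).owner == Thread.currentThread()) {
            record.displaced = null; // reentry: nothing to restore later
            return true;
        }
        return false; // the real JVM would inflate here
    }

    Exit exit(LockRecord record) {
        if (record.displaced == null) {
            return Exit.RECURSIVE;   // steps 1 and 2: matches a reentrant enter
        }
        if (markWord.compareAndSet(record, record.displaced)) {
            return Exit.RELEASED;    // step 3: Mark Word restored
        }
        return Exit.INFLATE;         // step 4: the CAS failed
    }
}
```

A nested enter followed by two exits first reports RECURSIVE for the inner record and then RELEASED for the outer one, after which the lock can be taken again.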

Heavyweight lock acquisition

As described above, a lightweight lock may be inflated into a heavyweight lock when several threads compete for it or when it is released. Now let us look at the inflation process.

ObjectMonitor* ObjectSynchronizer::inflate(Thread * Self,
                                           oop object,
                                           const InflateCause cause) {
  EventJavaMonitorInflate event;

  for (;;) {
    const markOop mark = object->mark();
    // The mark can be in one of the following states:
    // *  Inflated     - just return
    // *  Stack-locked - coerce it to inflated
    // *  INFLATING    - busy wait for conversion to complete
    // *  Neutral      - aggressively inflate the object.
    // *  BIASED       - Illegal.  We should never see this

    // 1. If the lock is already a heavyweight lock, return its monitor
    if (mark->has_monitor()) {
      ObjectMonitor * inf = mark->monitor();
      return inf;
    }

    // 2. If another thread is inflating the lock, wait until it has finished
    if (mark == markOopDesc::INFLATING()) {
      TEVENT(Inflate: spin while INFLATING);
      ReadStableMark(object);
      continue;
    }

    // 3. If it is currently a lightweight lock, force it to inflate
    if (mark->has_locker()) {
      ObjectMonitor * m = omAlloc(Self);
      m->Recycle();
      m->_Responsible  = NULL;
      m->_recursions   = 0;
      m->_SpinDuration = ObjectMonitor::Knob_SpinLimit;   // Consider: maintain by type/class

      markOop cmp = object->cas_set_mark(markOopDesc::INFLATING(), mark);
      if (cmp != mark) {
        omRelease(Self, m, true);
        continue;       // Interference -- just retry
      }

      markOop dmw = mark->displaced_mark_helper();
      assert(dmw->is_neutral(), "invariant");
      m->set_header(dmw);
      m->set_owner(mark->locker());
      m->set_object(object);
      // TODO-FIXME: assert BasicLock->dhw != 0.

      // Must preserve store ordering. The monitor state must
      // be stable at the time of publishing the monitor address.
      guarantee(object->mark() == markOopDesc::INFLATING(), "invariant");
      object->release_set_mark(markOopDesc::encode(m));

      // Hopefully the performance counters are allocated on distinct cache lines
      // to avoid false sharing on MP systems ...
      OM_PERFDATA_OP(Inflations, inc());
      TEVENT(Inflate: overwrite stacklock);
      if (log_is_enabled(Debug, monitorinflation)) {
        if (object->is_instance()) {
          ResourceMark rm;
          log_debug(monitorinflation)("Inflating object " INTPTR_FORMAT " , mark " INTPTR_FORMAT " , type %s",
                                      p2i(object), p2i(object->mark()),
                                      object->klass()->external_name());
        }
      }
      if (event.should_commit()) {
        post_monitor_inflate_event(&event, object, cause);
      }
      return m;
    }

    // 4. The object is unlocked: set up a fresh monitor and install it
    assert(mark->is_neutral(), "invariant");
    ObjectMonitor * m = omAlloc(Self);
    // prepare m for installation - set monitor to initial state
    m->Recycle();
    m->set_header(mark);
    m->set_owner(NULL);
    m->set_object(object);
    m->_recursions   = 0;
    m->_Responsible  = NULL;
    m->_SpinDuration = ObjectMonitor::Knob_SpinLimit;       // consider: keep metastats by type/class

    if (object->cas_set_mark(markOopDesc::encode(m), mark) != mark) {
      m->set_object(NULL);
      m->set_owner(NULL);
      m->Recycle();
      omRelease(Self, m, true);
      m = NULL;
      continue;
    }
    // omit some code...
    return m;
  }
}

The inflation of a lightweight lock into a heavyweight lock can be roughly divided into the following cases:

  1. If the lock is already a heavyweight lock, return its ObjectMonitor directly.
  2. If another thread is in the middle of inflating the lock, the current thread waits until the inflation is complete. Note that the waiting thread does not burn CPU the whole time: it waits using a mix of spin, yield and park.
  3. If it is currently a lightweight lock, force it to inflate: mark the Mark Word as INFLATING with CAS, copy the Displaced Mark Word and the owner into a newly allocated ObjectMonitor, then publish the monitor in the Mark Word.
  4. If the object is unlocked, allocate an ObjectMonitor, reset it to its initial state, and install it in the Mark Word with CAS (retrying if the CAS fails).
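
The four cases form a small state machine over the Mark Word. Below is a minimal Java sketch of that state machine. It is an assumption-laden illustration, not the HotSpot algorithm itself: InflateSketch, StackLock, Monitor, NEUTRAL and INFLATING are hypothetical names, and Thread.yield() stands in for the spin/yield/park waiting done by ReadStableMark.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of lock inflation; every name here is hypothetical.
final class InflateSketch {
    static final Object NEUTRAL = new Object();   // unlocked Mark Word
    static final Object INFLATING = new Object(); // transient "being inflated" marker

    static final class StackLock { }              // stands in for a Lock Record pointer

    static final class Monitor {                  // stands in for ObjectMonitor
        volatile Object owner;                    // null, or the StackLock that held the lock
    }

    private final AtomicReference<Object> markWord;

    InflateSketch(Object initialMark) {
        markWord = new AtomicReference<>(initialMark);
    }

    Monitor inflate() {
        for (;;) {
            Object mark = markWord.get();
            if (mark instanceof Monitor) {        // 1. already a heavyweight lock
                return (Monitor) mark;
            }
            if (mark == INFLATING) {              // 2. someone else is inflating: wait
                Thread.yield();
                continue;
            }
            Monitor m = new Monitor();
            if (mark instanceof StackLock) {      // 3. lightweight lock: force inflation
                if (!markWord.compareAndSet(mark, INFLATING)) {
                    continue;                     // interference, retry
                }
                m.owner = mark;                   // owner is still the stack lock record
                markWord.set(m);                  // publish the monitor
                return m;
            }
            // 4. unlocked: install a fresh monitor with no owner
            if (markWord.compareAndSet(mark, m)) {
                return m;
            }
        }
    }
}
```

Calling inflate() twice on the same object returns the same monitor, which is case 1 of the list above.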

Lock Upgrade Diagram

After understanding the principles of biased locking, lightweight locking, and heavyweight locking, let’s summarize the entire lock upgrade process. The details are shown in the figure below:

Biased lock acquisition and revocation

Flow chart of lightweight lock expansion

Heavyweight lock competition

So far we have mainly covered the lock upgrade process and its source code. How threads actually wait and compete has not been described in detail yet. Let us now look at thread contention and waiting once a lock has been inflated into a heavyweight lock. Competition for heavyweight locks is implemented in ObjectMonitor::enter() in objectMonitor.cpp.

ObjectMonitor structure

Before getting into the details of lock acquisition, we need to know that every lock object (here, an object whose lock has been upgraded to a heavyweight lock) has an associated ObjectMonitor, and that threads compete for the lock through this ObjectMonitor. The code looks like this (some less important fields are omitted; just look at the key ones):

class ObjectMonitor {
 public:
  enum {
    OM_OK,                    // No error
    OM_SYSTEM_ERROR,          // System error
    OM_ILLEGAL_MONITOR_STATE, // Illegal monitor state
    OM_INTERRUPTED,           // The thread was interrupted
    OM_TIMED_OUT              // Thread wait timed out
  };
  volatile markOop _header;             // Displaced Mark Word of the lock object
 protected:                             // protected for JvmtiRawMonitor
  void * volatile _owner;               // The thread or BasicLock that owns the monitor
  volatile jlong _previous_owner_tid;   // Thread id of the previous owner
  volatile intptr_t _recursions;        // Reentry count
  ObjectWaiter * volatile _EntryList;   // Threads blocked while trying to enter
 protected:
  ObjectWaiter * volatile _WaitSet;     // Threads in the wait state
  volatile jint _waiters;               // Number of threads in the wait state
  // omit some code...
};

Heavyweight lock acquisition

A heavyweight lock is acquired in the ObjectMonitor::enter() method. The code is as follows:

void ObjectMonitor::enter(TRAPS) {
  Thread * const Self = THREAD; // The thread currently entering enter()

  // Try to CAS the monitor's _owner (the thread or BasicLock that holds the
  // ObjectMonitor) from NULL to the current thread
  void * cur = Atomic::cmpxchg(Self, &_owner, (void*)NULL);
  if (cur == NULL) {
    assert(_recursions == 0, "invariant");
    assert(_owner == Self, "invariant");
    return;
  }

  // Reentry by the same thread: record it in the reentry count (the CAS above
  // returns the previous value of _owner whether it succeeded or not)
  if (cur == Self) {
    _recursions++;
    return;
  }

  // If _owner is a BasicLock on the current thread's stack, the current thread
  // enters the monitor for the first time: set _recursions to 1 and _owner to
  // the current thread
  if (Self->is_lock_owned ((address)cur)) {
    assert(_recursions == 0, "internal state error");
    _recursions = 1;
    _owner = Self;
    return;
  }

  // omit some code...

  { // omit some code...

    // Start contending for the lock
    for (;;) {
      jt->set_suspend_equivalent();
      EnterI(THREAD);
      if (!ExitSuspendEquivalent(jt)) break;
      _recursions = 0;
      _succ = NULL;
      exit(false, Self);
      jt->java_suspend_self();
    }
    Self->set_current_pending_monitor(NULL);
  }

  // omit some code...
}

Competing for a heavyweight lock can be divided into the following steps:

  1. Use CAS to try to set the monitor's _owner (the thread or BasicLock that holds the ObjectMonitor) to the current thread. If the CAS succeeds, the thread has acquired the lock and executes the synchronized block directly.
  2. If the same thread is re-entering the lock, record the reentry in the reentry count.
  3. If neither step 1 nor step 2 applies (and _owner is not a BasicLock on the current thread's stack), lock contention starts and the EnterI() method is called.
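
The fast paths in steps 1 and 2 come down to one CAS plus a counter. A minimal Java sketch follows, assuming the hypothetical names MonitorEnterSketch, tryEnter and recursions; the BasicLock case and the real contention path are left out on purpose.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of the fast paths of monitor enter; every name here is hypothetical.
final class MonitorEnterSketch {
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private int recursions; // only touched by the thread that owns the monitor

    // Returns true if the lock was taken on a fast path, false if the caller
    // would have to fall back to real contention (EnterI() in HotSpot).
    boolean tryEnter() {
        Thread self = Thread.currentThread();
        if (owner.compareAndSet(null, self)) { // step 1: CAS the owner from null to self
            return true;
        }
        if (owner.get() == self) {             // step 2: reentry by the owning thread
            recursions++;
            return true;
        }
        return false;                          // step 3: contention
    }

    // Number of reentries recorded so far (0 after the first entry).
    int recursions() {
        return recursions;
    }
}
```

As in the listing above, the first entry leaves the count at 0 and only reentries increment it.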

EnterI() method implementation

void ObjectMonitor::EnterI(TRAPS) {
  Thread * const Self = THREAD;
  // omit some code...

  // Wrap the current thread in an ObjectWaiter node and set its state to
  // ObjectWaiter::TS_CXQ
  ObjectWaiter node(Self);
  Self->_ParkEvent->reset();
  node._prev   = (ObjectWaiter *) 0xBAD;
  node.TState  = ObjectWaiter::TS_CXQ; // TS_CXQ: contending for the lock

  // In the for loop, push the node onto the _cxq list with CAS
  ObjectWaiter * nxt;
  for (;;) {
    node._next = nxt = _cxq;
    // If the CAS fails, the _cxq list has changed, so keep trying
    if (Atomic::cmpxchg(&node, &_cxq, nxt) == nxt) break;

    // The lock may have become free in the meantime, so try to take it
    if (TryLock (Self) > 0) {
      assert(_succ != Self, "invariant");
      assert(_owner == Self, "invariant");
      assert(_Responsible != Self, "invariant");
      return;
    }
  }

  // After pushing the node onto the _cxq list, loop: try to take the lock,
  // and park when that fails
  for (;;) {

    if (TryLock(Self) > 0) break; // Try to get the lock
    assert(_owner != Self, "invariant");

    if ((SyncFlags & 2) && _Responsible == NULL) {
      Atomic::replace_if_null(Self, &_Responsible);
    }

    // The lock was not obtained: suspend the current thread with park and
    // wait to be woken up
    if (_Responsible == Self || (SyncFlags & 1)) {
      TEVENT(Inflated enter - park TIMED);
      Self->_ParkEvent->park((jlong) recheckInterval);
      // Increase the recheckInterval, but clamp the value.
      recheckInterval *= 8;
      if (recheckInterval > MAX_RECHECK_INTERVAL) { // MAX_RECHECK_INTERVAL is 1000
        recheckInterval = MAX_RECHECK_INTERVAL;
      }
    } else {
      TEVENT(Inflated enter - park UNTIMED);
      Self->_ParkEvent->park();
    }

    // omit some code...
    OrderAccess::fence();
  }
  // omit some code...
  return;
}

The EnterI() method can be divided into the following steps:

  1. Wrap the current thread in an ObjectWaiter node and set its state to TS_CXQ.
  2. In the for loop, push the node onto the _cxq list with CAS. If the CAS fails, the _cxq list has changed, so the loop continues (after first retrying TryLock).
  3. After the node has been pushed onto the _cxq list, loop: try to obtain the lock with TryLock, and if that fails, suspend the thread with park until it is woken up and can try again. A parked thread does not consume CPU. The _Responsible thread uses a timed park whose recheckInterval grows by a factor of 8 up to MAX_RECHECK_INTERVAL.
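
The push-then-park pattern can be tried out with java.util.concurrent primitives. The sketch below is a simplified model and not the HotSpot implementation: EnterISketch, Waiter and demo are hypothetical names, reentrancy and the _EntryList are left out, and every waiter uses a 1 ms timed park as a safety net, loosely comparable to the timed park of the _Responsible thread.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Toy model of the EnterI() slow path; every name here is hypothetical.
final class EnterISketch {
    // Stands in for ObjectWaiter: one node per waiting thread.
    static final class Waiter {
        final Thread thread = Thread.currentThread();
        Waiter next;
    }

    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicReference<Waiter> cxq = new AtomicReference<>();

    private boolean tryLock() {
        return owner.get() == null && owner.compareAndSet(null, Thread.currentThread());
    }

    void enter() {
        if (tryLock()) return;
        Waiter node = new Waiter();
        for (;;) {                                 // push the node onto cxq with CAS
            Waiter head = cxq.get();
            node.next = head;
            if (cxq.compareAndSet(head, node)) break;
            if (tryLock()) return;                 // CAS failed: retry the lock, node not queued
        }
        while (!tryLock()) {                       // try the lock, park when that fails
            LockSupport.parkNanos(this, 1_000_000L);
        }
        unlink(node);                              // we own the lock now: leave the queue
    }

    // Called only by the lock owner, so interior links have a single writer.
    private void unlink(Waiter node) {
        if (cxq.get() == node && cxq.compareAndSet(node, node.next)) return;
        Waiter p = cxq.get();
        while (p.next != node) p = p.next;
        p.next = node.next;
    }

    void exit() {
        owner.set(null);                           // release the lock
        Waiter w = cxq.get();
        if (w != null) LockSupport.unpark(w.thread); // wake the head of cxq
    }

    // Two threads each increment a shared counter n times under the lock.
    static int demo(int n) {
        EnterISketch lock = new EnterISketch();
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < n; i++) {
                lock.enter();
                try { counter[0]++; } finally { lock.exit(); }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try { a.join(); b.join(); }
        catch (InterruptedException e) { throw new IllegalStateException(e); }
        return counter[0];
    }
}
```

Waking the head of the queue in exit() roughly corresponds to the QMode == 2 policy in ObjectMonitor::exit(). In production Java code, ReentrantLock or synchronized should be used instead of a hand-written lock like this one.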

The TryLock method for obtaining the lock is as follows:

TryLock method

int ObjectMonitor::TryLock(Thread * Self) {
  void * own = _owner;
  if (own != NULL) return 0;
  if (Atomic::replace_if_null(Self, &_owner)) {
    return 1;
  }
  return -1;
}

This function tries to point the lock's _owner pointer at the current thread. It returns 0 if the lock already has an owner, 1 if the CAS succeeds, and -1 if the CAS loses the race to another thread.

Release of heavyweight locks

void ObjectMonitor::exit(bool not_suspended, TRAPS) {
  Thread * const Self = THREAD;
  if (THREAD != _owner) { // _owner in the lock object does not point to the current thread
    if (THREAD->is_lock_owned((address) _owner)) {
      // The BasicLock that _owner points to is on the current thread's stack,
      // so make _owner point to the current thread
      assert(_recursions == 0, "invariant");
      _owner = THREAD;
      _recursions = 0;
    } else {
      // omit some code...
      return;
    }
  }

  // If the reentry count is not 0, only decrement it and return. The lock is
  // really released by the exit call that finds the count at 0
  if (_recursions != 0) {
    _recursions--;        // this is simple recursive enter
    TEVENT(Inflated exit - recursive);
    return;
  }

  // omit some code...
  for (;;) {

    if (Knob_ExitPolicy == 0) {
      OrderAccess::release_store(&_owner, (void*)NULL);   // release the lock
      OrderAccess::storeload();                           // See if we need to wake a successor
      if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
        TEVENT(Inflated exit - simple egress);
        return;
      }
      TEVENT(Inflated exit - complex egress);
      // omit some code...
    }
    // omit some code...

    ObjectWaiter * w = NULL;
    int QMode = Knob_QMode;

    // QMode == 2: wake a successor directly from the head of _cxq
    if (QMode == 2 && _cxq != NULL) {
      w = _cxq;
      ExitEpilog(Self, w);
      return;
    }
    // omit some code... (other QMode values use different wake-up strategies)
  }
  // omit some code...
}

Releasing a heavyweight lock can be divided into the following steps:

  1. If _owner in the lock object does not point to the current thread, but the BasicLock that _owner points to is on the current thread's stack, make _owner point to the current thread.
  2. If _owner points to the current thread, check the reentry count. If it is not 0, decrement it and return; the lock is only really released by the ObjectMonitor::exit() call that finds the count at 0.
  3. Release the lock and, depending on QMode, decide which waiting thread to wake up, for example the head of _cxq when QMode is 2. Other QMode values use other wake-up strategies.
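
The effect of the reentry count in step 2 can be observed from Java itself, whichever lock state the JVM happens to use for the object. A small sketch follows; ReentrantExitDemo and run are hypothetical names, while Thread.holdsLock is a standard JDK method.

```java
// Shows that leaving an inner synchronized block does not release the lock.
final class ReentrantExitDemo {
    // Returns { held inside inner block, held after inner exit, held after outer exit }.
    static boolean[] run() {
        Object lock = new Object();
        boolean[] held = new boolean[3];
        synchronized (lock) {                     // first entry
            synchronized (lock) {                 // reentry by the same thread
                held[0] = Thread.holdsLock(lock);
            }
            held[1] = Thread.holdsLock(lock);     // inner exit only unwound the reentry
        }
        held[2] = Thread.holdsLock(lock);         // outer exit really released the lock
        return held;
    }
}
```

The thread still holds the lock after the inner block ends and only gives it up at the outermost exit, which is what the count-then-release logic in exit() implements.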

Closing thoughts

It took a long time, but it is finally finished. Where is the applause?

This article is mainly based on earlier blog posts by others and on my own reading of the source code, and I realize there are still many things I have not described very clearly. The main reason is that the C++ code makes my head spin. Java's locking mechanism involves a great deal, and what I understand is only the tip of the iceberg. If anything in the code or the article is unclear or wrong, please feel free to criticize. I only half understand parts of it myself, so please forgive me.

reference

You can see farther on the shoulders of giants

Understanding the Java Virtual Machine in Depth: Advanced JVM Features and Best Practices

The Art of Concurrent Programming in Java

In-depth understanding of Java concurrency implementation principles of Synchronized

JDK source code analysis two: object memory layout, the ultimate principle of synchronized

The JDK source