Introduction

Synchronized is an important keyword for solving concurrent data access problems in Java. When we want to ensure that a shared resource is accessed by only one thread at a time, we can use the synchronized keyword to lock a class or object in our code. Synchronized has long been an elder statesman of multithreaded concurrent programming, and many call it a heavyweight lock. However, with the various optimizations Java SE 1.6 made to synchronized, in many cases it is not that heavy.

The basic syntax of synchronized

Every object in Java can be used as a lock, in the following three ways:

  • Instance method: the lock is the current instance object.
  • Static method: the lock is the Class object of the current class.
  • Code block: the lock is the object specified in the synchronized parentheses (see the sketch below).
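A minimal sketch of the three forms (the class and method names are illustrative):

public class Counter {
    private int count;
    private static int total;

    // 1. Instance method: the lock is this Counter instance
    public synchronized void increment() {
        count++;
    }

    // 2. Static method: the lock is Counter.class
    public static synchronized void resetTotal() {
        total = 0;
    }

    // 3. Code block: the lock is the object in the parentheses
    public void incrementWithBlock() {
        synchronized (this) {
            count++;
        }
    }
}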

Implementation principle of Synchronized

User mode and kernel mode

Kernel mode: can access all hardware and software resources, including peripheral devices such as hard disks and network cards.

User mode: memory access is limited and access to peripheral devices is not allowed.

In the early JDK days, synchronized was called a heavyweight lock. Java threads were mapped to operating system native threads, so blocking or waking a thread required a system call through the kernel, i.e. a transition from user mode to kernel mode. These state transitions take a lot of processor time; for simple synchronized blocks (such as a synchronized get or set method), the state transition can take longer than the user code itself, which made synchronized a heavyweight operation in the Java language.

So Java SE 1.6 introduced many lock optimizations: lightweight locks, biased locks, lock elimination, adaptive spin locks, and lock coarsening, all of which are implemented in user mode.

CAS

Compare And Swap (CAS) is the operation underlying optimistic locking. It has three operands: the memory value V, the expected value A, and the new value B. If and only if A equals V, V is changed to B and the old value is returned; otherwise nothing is done. Java's AtomicInteger and the other lock-free classes use CAS.
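As an illustration, AtomicInteger exposes CAS directly through compareAndSet (a minimal sketch; the retry loop is the standard lock-free pattern):

import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(0);

        // Succeeds only if the current value V equals the expected value A (0);
        // V is then replaced with the new value B (1)
        System.out.println(v.compareAndSet(0, 1)); // true
        System.out.println(v.compareAndSet(0, 2)); // false: V is now 1, not 0

        // Typical lock-free update: retry the CAS until it succeeds
        int oldVal, newVal;
        do {
            oldVal = v.get();
            newVal = oldVal + 10;
        } while (!v.compareAndSet(oldVal, newVal));
        System.out.println(v.get()); // 11
    }
}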

Underlying implementation:

The lock cmpxchg assembly instruction compares and exchanges operands; it is the CPU's primitive support for CAS.

The cmpxchg instruction by itself is not atomic on multiprocessors (it was originally used on single-core CPUs). The JVM implements it in atomic_linux_x86.inline.hpp as follows:

inline jint Atomic::cmpxchg(jint exchange_value, volatile jint* dest, jint compare_value) {
  int mp = os::is_MP(); // If the machine is a multiprocessor, prefix the lock instruction to guarantee atomicity
  __asm__ volatile (LOCK_IF_MP(%4) "cmpxchgl %1,(%3)"
                    : "=a" (exchange_value)
                    : "r" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
                    : "cc", "memory");
  return exchange_value;
}

On a multiprocessor, the LOCK prefix gives the executing instruction exclusive use of shared memory; on modern CPUs this is done by asserting a cache-lock signal through the northbridge rather than locking the entire bus.

It flushes the current processor's cache line to memory and invalidates the corresponding cache lines of other processors. It also acts as a memory barrier that instructions cannot be reordered across.

The bytecode

In Java, synchronized is used in the following ways:

public class SynchronizedTest {
    public synchronized void syncMethod(){
        System.out.println("syncMethod");
    }

    public void syncObject(){
        synchronized (SynchronizedTest.class){
            System.out.println("syncObject");
        }
    }
}

Decompiling the bytecode with javap gives the following:

  public synchronized void syncMethod();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_SYNCHRONIZED
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #3                  // String syncMethod
         5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
         8: return

  public void syncObject();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=3, args_size=1
         0: ldc           #5                  // class com/marvin/cloud/test/SynchronizedTest
         2: dup
         3: astore_1
         4: monitorenter
         5: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         8: ldc           #6                  // String syncObject
        10: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        13: aload_1
        14: monitorexit
      ...

As the output above shows, the bytecode generated for a synchronized method and for a synchronized block differs slightly. Method-level synchronization is implicit: it is not controlled by bytecode instructions. The JVM distinguishes a synchronized method by the ACC_SYNCHRONIZED access flag in the method_info structure. For synchronized blocks, the JVM uses the monitorenter and monitorexit instructions, and it guarantees that every monitorenter has a matching monitorexit (including one in the exception handler elided above, so the lock is released even when an exception is thrown). When a thread executes the monitorenter instruction, it attempts to acquire ownership of the monitor of the corresponding lock object.

JVM implementation

In the HotSpot JVM (JDK 12), the monitorenter bytecode entry point is implemented in C++ in bytecodeInterpreter.cpp. Its main logic is as follows:

CASE(_monitorenter): {
  oop lockee = STACK_OBJECT(-1); // Get the lock object
  // derefing's lockee ought to provoke implicit null check
  CHECK_NULL(lockee);
  // find a free monitor or one already allocated for this object
  // if we find a matching object then we need a new monitor
  // since this is recursive enter
  BasicObjectLock* limit = istate->monitor_base();
  BasicObjectLock* most_recent = (BasicObjectLock*) istate->stack_base();
  BasicObjectLock* entry = NULL;
  while (most_recent != limit) {
    if (most_recent->obj() == NULL) entry = most_recent;
    else if (most_recent->obj() == lockee) break;
    most_recent++;
  }
  // Found a free Lock Record slot in the current thread's stack to store the lock object
  if (entry != NULL) {
    entry->set_obj(lockee);
    int success = false;
    uintptr_t epoch_mask_in_place = markWord::epoch_mask_in_place;

    markWord mark = lockee->mark();
    intptr_t hash = (intptr_t) markWord::no_hash;
    // The lock object's header is in the biased state
    if (mark.has_bias_pattern()) {
      uintptr_t thread_ident;
      uintptr_t anticipated_bias_locking_value;
      thread_ident = (uintptr_t)istate->thread();
      anticipated_bias_locking_value =
        ((lockee->klass()->prototype_header().value() | thread_ident) ^ mark.value()) &
        ~(markWord::age_mask_in_place);
      // Already biased towards the current thread: nothing to do
      if (anticipated_bias_locking_value == 0) {
        // already biased towards this thread, nothing to do
        if (PrintBiasedLockingStatistics) {
          (*BiasedLocking::biased_lock_entry_count_addr())++;
        }
        success = true;
      }
      // The Klass prototype header's bias flag is off (a batch revoke happened),
      // so try to revoke this object's bias
      else if ((anticipated_bias_locking_value & markWord::biased_lock_mask_in_place) != 0) {
        markWord header = lockee->klass()->prototype_header();
        if (hash != markWord::no_hash) {
          header = header.copy_set_hash(hash);
        }
        // Try to revoke the bias: replace the lock object's mark word with the Klass mark word
        if (lockee->cas_set_mark(header, mark) == mark) {
          if (PrintBiasedLockingStatistics)
            (*BiasedLocking::revoked_lock_entry_count_addr())++;
        }
      }
      // The lock object's epoch differs from the Klass epoch: try rebias
      // (a batch rebias occurred, the old bias is stale, so the lock can be
      // rebiased directly to the current thread)
      else if ((anticipated_bias_locking_value & epoch_mask_in_place) != 0) {
        // try rebias: construct a mark word biased towards the current thread
        markWord new_header((intptr_t) lockee->klass()->prototype_header().value() | thread_ident);
        if (hash != markWord::no_hash) {
          new_header = new_header.copy_set_hash(hash);
        }
        // CAS-update the lock object's mark word
        if (lockee->cas_set_mark(new_header, mark) == mark) {
          if (PrintBiasedLockingStatistics)
            (*BiasedLocking::rebiased_lock_entry_count_addr())++;
        }
        // CAS failed: there is lock contention, enter the upgrade path
        else {
          CALL_VM(InterpreterRuntime::monitorenter(THREAD, entry), handle_exception);
        }
        success = true;
      }
      // Biased towards another thread, or anonymously biased
      else {
        // Build an anonymously biased mark word
        // try to bias towards thread in case object is anonymously biased
        markWord header(mark.value() & (markWord::biased_lock_mask_in_place |
                                        markWord::age_mask_in_place |
                                        epoch_mask_in_place));
        if (hash != markWord::no_hash) {
          header = header.copy_set_hash(hash);
        }
        // Build a mark word that points to the current thread
        markWord new_header(header.value() | thread_ident);
        // debugging hint
        DEBUG_ONLY(entry->lock()->set_displaced_header(markWord((uintptr_t) 0xdeaddead));)
        // If the lock object is anonymously biased and there is no contention, it goes
        // directly from anonymously biased to biased towards the current thread; the CAS
        // uses the anonymously biased mark word as the expected old value
        if (lockee->cas_set_mark(new_header, header) == header) {
          if (PrintBiasedLockingStatistics)
            (*BiasedLocking::anonymously_biased_lock_entry_count_addr())++;
        }
        // Not anonymously biased, or the CAS lost to a competing thread: enter the upgrade path
        else {
          CALL_VM(InterpreterRuntime::monitorenter(THREAD, entry), handle_exception);
        }
        success = true;
      }
    }

    // traditional lightweight locking
    if (!success) {
      // Bias mode is off or was revoked: perform lightweight locking
      markWord displaced = lockee->mark().set_unlocked(); // displaced header in the unlocked state
      entry->lock()->set_displaced_header(displaced);
      // -XX:+UseHeavyMonitors sets call_vm = true, disabling biased and lightweight locks
      bool call_vm = UseHeavyMonitors;
      // If heavyweight locks are forced, go straight to monitorenter; otherwise the CAS
      // points the lock object's mark word at the Lock Record
      if (call_vm || lockee->cas_set_mark(markWord::from_pointer(entry), displaced) != displaced) {
        // Is it simple recursive case? A failed CAS may mean lightweight-lock reentry or contention
        if (!call_vm && THREAD->is_lock_owned((address) displaced.clear_lock_bits().to_pointer())) {
          // Reentry: set the displaced header of this Lock Record to NULL
          entry->lock()->set_displaced_header(markWord::from_pointer(NULL));
        } else {
          // Otherwise the lock is upgraded
          CALL_VM(InterpreterRuntime::monitorenter(THREAD, entry), handle_exception);
        }
      }
    }
    UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);
  } else {
    // No free Lock Record: ask for more monitor space and re-execute the bytecode
    istate->set_msg(more_monitors);
    UPDATE_PC_AND_RETURN(0); // Re-execute
  }
}

InterpreterRuntime::monitorenter in interpreterRuntime.cpp handles the lock-upgrade (slow) path:

IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem))
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
  if (PrintBiasedLockingStatistics) {
    Atomic::inc(BiasedLocking::slow_path_entry_count_addr());
  }
  Handle h_obj(thread, elem->obj());
  assert(Universe::heap()->is_in_reserved_or_null(h_obj()),
         "must be NULL or an object");
  if (UseBiasedLocking) {
    // Retry fast entry if bias is revoked to avoid unnecessary inflation
    ObjectSynchronizer::fast_enter(h_obj, elem->lock(), true, CHECK);
  } else {
    ObjectSynchronizer::slow_enter(h_obj, elem->lock(), CHECK);
  }
  assert(Universe::heap()->is_in_reserved_or_null(elem->obj()),
         "must be NULL or an object");
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
IRT_END

Mark Word

When a thread attempts to access a synchronized block of code, it must first acquire the lock and release it when it exits or throws an exception. So where does the lock actually exist? What information is stored inside the lock?

The locks used by synchronized are stored in the Java object header. In the HotSpot virtual machine, an object's memory layout is divided into three areas: the object header, instance data, and alignment padding. The object header includes the Mark Word and the class pointer (Klass pointer).

HotSpot divides the Mark Word into several bit fields and assigns different meanings to them depending on the object's state. The following shows the meaning of each bit field of the Mark Word in the different object states on a 32-bit HotSpot VM:
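In outline (a textual reconstruction of the commonly documented 32-bit layout, standing in for the original figure):

State              | Bit fields (high to low)                     | Biased flag | Lock bits
-------------------+----------------------------------------------+-------------+----------
Unlocked           | identity hashCode (25) | GC age (4)          |      0      |    01
Biasable / biased  | thread ID (23) | epoch (2) | GC age (4)      |      1      |    01
Lightweight locked | pointer to Lock Record in thread stack (30)  |             |    00
Heavyweight locked | pointer to Monitor in the heap (30)          |             |    10
GC marked          | forwarding data (30)                         |             |    11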

When the object is in the biasable state, the Mark Word stores the ID of the biased thread. In the lightweight-locked state, the Mark Word stores a pointer to the Lock Record in the thread's stack. In the heavyweight-locked state, it is a pointer to a Monitor object in the heap.

Synchronized lock optimization is closely tied to the Mark Word. You can inspect the detailed object storage layout with the JOL toolkit provided by OpenJDK:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.9</version>
</dependency>

public static void main(String[] args) {
    Object o = new Object();
    System.out.println(ClassLayout.parseInstance(o).toPrintable());
}
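On a 64-bit HotSpot VM the printed layout looks roughly like this (a sketch; the exact header bytes vary with platform, JVM version, compressed oops, and the current lock state):

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                    VALUE
      0     4        (object header)                01 00 00 00  -- mark word, low bits 001 = unlocked
      4     4        (object header)                00 00 00 00  -- mark word
      8     4        (object header)                e5 01 00 f8  -- klass pointer (illustrative value)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes

The first 8 bytes are the mark word; watching them change after synchronized blocks, hashCode() calls, or the biased-locking startup delay is a convenient way to observe the state transitions described below.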

Synchronized lock optimization

Biased locking

Many thread-safe classes in Java use synchronized, but in practice most of their operations are performed by a single thread. For example, StringBuffer's append() method is synchronized, yet it is quite likely that only one thread ever calls it. Demo:

public static void main(String[] args) {
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < 100; i++) {
        sb.append("test:" + i);
    }
}

JDK 1.6 introduced biased locking to make this single-threaded style of synchronization cheap. When the lock is acquired for the first time, a single CAS operation writes the current thread's ID into the lock object's mark word; when the same thread acquires the lock again, only a few simple comparisons are executed instead of the relatively expensive CAS.

Anonymous bias

When the JVM has biased locking enabled (the default since 1.6, after the startup delay), and a new object is created whose class has not had biased locking disabled (and whose identity hashCode has not been computed), the new object's mark word is in the biasable state. The thread ID in the mark word (see the biased-state layout above) is 0, meaning the object is not yet biased towards any thread. This is called anonymous bias.

If the object's identity hashCode has already been computed, the object can no longer enter the biased state!

Then where does the hashCode live under a lightweight or heavyweight lock?

Answer: in the thread stack, in the displaced Mark Word of the Lock Record for a lightweight lock; or in the header field of the ObjectMonitor for a heavyweight lock. (A JOL sketch of the hashCode/bias interaction follows.)
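A quick way to observe this with JOL (a sketch; it assumes biased locking is enabled, the startup delay is disabled with -XX:BiasedLockingStartupDelay=0, and the class and variable names are illustrative):

import org.openjdk.jol.info.ClassLayout;

public class BiasDemo {
    public static void main(String[] args) {
        Object a = new Object();
        // Biasable: mark word lock bits are 101, thread ID = 0 (anonymously biased)
        System.out.println(ClassLayout.parseInstance(a).toPrintable());

        Object b = new Object();
        b.hashCode(); // computing the identity hashCode occupies the mark word...
        // ...so there is no room left for a thread ID: b can no longer be biased,
        // and synchronized(b) will go straight to a lightweight lock
        System.out.println(ClassLayout.parseInstance(b).toPrintable());
    }
}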

Biased lock locking and unlocking

From the bytecodeInterpreter.cpp source analyzed above, the bias flag and the biased thread are stored in the Java object header (mark word). Biased-lock acquisition can be broken into the following steps:

Find a free Lock Record (a BasicObjectLock) in the thread's stack to store the lock object. Traversing the Lock Records of the thread stacks is also how the JVM later determines whether a biased thread is still inside the synchronized block (by checking whether a Lock Record still references the lock object).

Check whether bias mode is enabled in the current lock object's header. If it is enabled and the lock is already biased towards the current thread, this is a reentry and nothing needs to be done; if the lock is anonymously biased, a CAS is attempted to install the current thread as the biased thread. The overhead of the same thread repeatedly entering the same object's critical section is therefore tiny.

If the lock is biased towards another thread, or the epoch value is out of date, or bias mode is off, or the biased-locking CAS loses a race, execution enters the InterpreterRuntime::monitorenter method, where the bias is revoked and the lock is upgraded.

If bias mode is not enabled, the lightweight-lock CAS path is entered directly.

Biased-lock revocation and upgrade are implemented in InterpreterRuntime::monitorenter in interpreterRuntime.cpp:

IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem))
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
  if (PrintBiasedLockingStatistics) {
    Atomic::inc(BiasedLocking::slow_path_entry_count_addr());
  }
  Handle h_obj(thread, elem->obj());
  assert(Universe::heap()->is_in_reserved_or_null(h_obj()),
         "must be NULL or an object");
  // If biased locking is enabled: -XX:+UseBiasedLocking (the default)
  if (UseBiasedLocking) {
    // Retry fast entry if bias is revoked to avoid unnecessary inflation
    ObjectSynchronizer::fast_enter(h_obj, elem->lock(), true, CHECK);
  } else {
    ObjectSynchronizer::slow_enter(h_obj, elem->lock(), CHECK);
  }
  assert(Universe::heap()->is_in_reserved_or_null(elem->obj()),
         "must be NULL or an object");
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
IRT_END

If biased locking is enabled, ObjectSynchronizer::fast_enter handles revoking the bias:

void ObjectSynchronizer::fast_enter(Handle obj, BasicLock* lock,
                                    bool attempt_rebias, TRAPS) {
  if (UseBiasedLocking) {
    if (!SafepointSynchronize::is_at_safepoint()) {
      BiasedLocking::Condition cond = BiasedLocking::revoke_and_rebias(obj, attempt_rebias, THREAD);
      if (cond == BiasedLocking::BIAS_REVOKED_AND_REBIASED) {
        return;
      }
    } else {
      assert(!attempt_rebias, "can not rebias toward VM thread");
      BiasedLocking::revoke_at_safepoint(obj);
    }
    assert(!obj->mark()->has_bias_pattern(), "biases should be revoked by now");
  }

  slow_enter(obj, lock, THREAD);
}

If execution is already at a safepoint, the bias is revoked by the VM thread; otherwise revocation happens in BiasedLocking::revoke_and_rebias. During revocation the epoch is checked again: if the object header's epoch is out of date and rebias is allowed, a CAS directly rewrites the lock object's mark word to bias it towards the current thread; otherwise the bias is revoked.

In the other cases (biased towards another thread, anonymously biased with contention, or a failed CAS), revocation waits for a safepoint and calls the revoke_bias method. To validate the biased thread, it traverses all threads in the JVM to check whether that thread is still alive and inside the synchronized block (by walking the BasicObjectLocks in the Lock Records of its stack). If so, the lock is upgraded to a lightweight lock; otherwise the bias is revoked back to the unlocked or anonymously biased state. Finally, execution reaches slow_enter and attempts lightweight locking.

Safepoint is a term used a lot in GC; it denotes a state in which all Java threads are paused.

Batch rebias and batch revoke (epoch)

Why batch rebias and batch revoke exist: as the locking and unlocking flow above shows, when only one thread repeatedly enters a synchronized block, the cost of biased locking is essentially negligible. But when another thread tries to acquire the lock, the JVM must wait for a safepoint and then either revoke the bias back to the unlocked state or upgrade to a lightweight lock, which costs real performance. Under frequent multi-threaded contention, biased locking therefore not only fails to help but actively hurts. Hence the batch rebias and batch revoke mechanism.

Batch rebias is maintained per Klass. Each Klass object keeps a bias revocation counter, _biased_lock_revocation_count, which is incremented every time a bias revocation happens on an instance of the class. When the counter reaches the rebias threshold (BiasedLockingBulkRebiasThreshold, default 20), a batch rebias is triggered.

Every lock object has an epoch field, and the Klass prototype header also carries an epoch; a lock object's epoch is initialized from the Klass epoch when the object is created. When a batch rebias occurs, the Klass epoch is incremented; at the same time the stacks of all JVM threads are walked, every currently locked biased lock of that class is found, and its epoch field is updated to the new value. The next time such a lock is acquired, if the object's epoch differs from the Klass epoch, the previously biased thread is treated as having exited the synchronized block (no Lock Record can be found for it), and the mark word's thread ID is switched to the current thread's ID directly via CAS.

When the revocation counter _biased_lock_revocation_count reaches the batch revoke threshold (BiasedLockingBulkRevokeThreshold, default 40), the JVM considers the class's usage pattern genuinely multithreaded, and marks the Klass prototype header as non-biasable; locking an instance of that Klass then goes straight to the lightweight-lock logic.

The heuristics are implemented in update_heuristics, called from BiasedLocking::revoke_and_rebias:

// Batch rebias and batch revoke heuristics
static HeuristicsResult update_heuristics(oop o, bool allow_rebias) {
  markOop mark = o->mark();
  if (!mark->has_bias_pattern()) {
    return HR_NOT_BIASED;
  }
  Klass* k = o->klass();
  jlong cur_time = os::javaTimeMillis();
  jlong last_bulk_revocation_time = k->last_biased_lock_bulk_revocation_time();
  int revocation_count = k->biased_lock_revocation_count();
  // If enough time (BiasedLockingDecayTime) has passed since the last bulk
  // revocation, reset the counter
  if ((revocation_count >= BiasedLockingBulkRebiasThreshold) &&
      (revocation_count <  BiasedLockingBulkRevokeThreshold) &&
      (last_bulk_revocation_time != 0) &&
      (cur_time - last_bulk_revocation_time >= BiasedLockingDecayTime)) {
    k->set_biased_lock_revocation_count(0);
    revocation_count = 0;
  }
  // Make revocation count saturate just beyond BiasedLockingBulkRevokeThreshold
  if (revocation_count <= BiasedLockingBulkRevokeThreshold) {
    revocation_count = k->atomic_incr_biased_lock_revocation_count();
  }
  if (revocation_count == BiasedLockingBulkRevokeThreshold) {
    return HR_BULK_REVOKE;
  }
  if (revocation_count == BiasedLockingBulkRebiasThreshold) {
    return HR_BULK_REBIAS;
  }
  return HR_SINGLE_REVOKE;
}

When a batch rebias occurs, all JVM threads are walked at a safepoint, the epoch of each lock object currently referenced by a thread's Lock Record is updated along with the Klass epoch, and a rebias is attempted. When a batch revoke occurs, the bias flag in the Klass is cleared and the stacks of all threads are walked to revoke the bias of every lock of that class.

Biased lock startup delay

Biased locking is enabled with a delay by default: 4 seconds. Why? Because the JVM's own startup threads contain a lot of synchronized code that is known in advance to be contended. If biased locks were used there, they would constantly be revoked and upgraded, which is inefficient.

-XX:BiasedLockingStartupDelay=0   # set the startup delay to zero

Lightweight lock

When a biased lock is contended or bias mode is off, it is upgraded to a lightweight lock. A lightweight lock is optimistic: it uses CAS to make the lock object's mark word point to the Lock Record in the thread's stack. If the update succeeds, the thread owns the lock on the object, and the mark word's lock bits (the last 2 bits) become 00, indicating the lightweight-locked state. None of this involves the operating system; it is implemented entirely in user space, so a lightweight lock performs better than a traditional heavyweight mutex (which locks with an operating-system mutex).

Locking steps

Before locking, when the object is not biased, the VM first creates space for a Lock Record in the current thread's stack frame. The Lock Record, implemented as BasicObjectLock, holds the lock object and a _displaced_header field that stores a copy of the lock object's Mark Word.

The lock object's Mark Word is copied into the Lock Record's _displaced_header field; this copy is called the displaced Mark Word.

Finally, a CAS attempts to point the lock object's Mark Word at the Lock Record. If the CAS fails, the VM checks whether this is a reentry: whether the lock state bits indicate a lightweight lock and the Mark Word already points at a Lock Record in the current thread's stack. If not, the lock inflates to a heavyweight lock.

Unlocking steps

A CAS writes the displaced Mark Word saved in the current thread's Lock Record back into the lock object's header. If the CAS succeeds, unlocking succeeds.

If the CAS fails, the lightweight lock has been inflated to a heavyweight lock in the meantime, and the heavyweight unlock path runs instead:

ObjectSynchronizer::inflate(THREAD, object)->exit(true, THREAD);

Lightweight lock flag bit

In the lightweight-locking code above, the lock object's mark word is simply set to a pointer to the Lock Record; there is no explicit step that changes the lock state bits to 00. So where does the flag come from?

During lightweight locking, the entire Mark Word of the lock object is replaced with the start address of the Lock Record. The JVM requires such addresses to be aligned to multiples of 8 bytes, so the low bits of the address are always 00. Thus, when the JVM writes the Lock Record address into the Mark Word, the lock flag bits are set at the same time.
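A tiny illustration of why the flag comes for free (the address value is hypothetical; only its 8-byte alignment matters):

public class AlignmentDemo {
    public static void main(String[] args) {
        // Any 8-byte-aligned address has its low three bits equal to 0,
        // so the low two bits -- the lock flag -- are automatically 00
        long lockRecordAddress = 0x7f8a1c0053a8L; // hypothetical aligned address
        System.out.println(lockRecordAddress % 8);    // 0
        System.out.println(lockRecordAddress & 0b11); // 0 -> lock bits "00"
    }
}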

Heavyweight lock: ObjectMonitor

In the HotSpot JVM, the monitor is implemented in C++ by ObjectMonitor. Its main data structure is as follows:

ObjectMonitor() {
    _header       = NULL;
    _count        = 0;
    _waiters      = 0;
    _recursions   = 0;
    _object       = NULL;
    _owner        = NULL;
    _WaitSet      = NULL;
    _WaitSetLock  = 0;
    _Responsible  = NULL;
    _succ         = NULL;
    _cxq          = NULL;
    FreeNext      = NULL;
    _EntryList    = NULL;
    _SpinFreq     = 0;
    _SpinClock    = 0;
    OwnerIsThread = 0;
}

There are several key fields in ObjectMonitor:

  • _owner: points to the thread holding the ObjectMonitor
  • _WaitSet: queue of threads in the wait state (after calling wait())
  • _EntryList: queue of threads blocked waiting for the lock
  • _recursions: lock reentry count
  • _cxq: singly linked list onto which newly contending threads are pushed
  • _count: number of threads trying to acquire the lock

When lightweight locking fails, the lock is first inflated to a heavyweight lock: an ObjectMonitor is created and initialized for the object. The implementation is ObjectSynchronizer::inflate:

ObjectMonitor* ObjectSynchronizer::inflate(Thread* Self,
                                           oop object,
                                           const InflateCause cause) {
  assert(Universe::verify_in_progress() ||
         !SafepointSynchronize::is_at_safepoint(), "invariant");

  EventJavaMonitorInflate event;

  for (;;) {
    const markOop mark = object->mark();
    assert(!mark->has_bias_pattern(), "invariant");
    // Already in the heavyweight-locked state: return the existing monitor
    if (mark->has_monitor()) {
      ObjectMonitor* inf = mark->monitor();
      assert(inf->header()->is_neutral(), "invariant");
      assert(oopDesc::equals((oop) inf->object(), object), "invariant");
      assert(ObjectSynchronizer::verify_objmon_isinpool(inf), "monitor is invalid");
      return inf;
    }
    // Another thread is in the middle of inflating: wait and retry
    if (mark == markOopDesc::INFLATING()) {
      ReadStableMark(object);
      continue;
    }
    // Lightweight-locked: allocate an ObjectMonitor object and initialize it
    if (mark->has_locker()) {
      ObjectMonitor* m = omAlloc(Self);
      m->Recycle();
      m->_Responsible  = NULL;
      m->_recursions   = 0;
      m->_SpinDuration = ObjectMonitor::Knob_SpinLimit;   // Consider: maintain by type/class
      // Set the lock object's mark word to INFLATING (0)
      markOop cmp = object->cas_set_mark(markOopDesc::INFLATING(), mark);
      if (cmp != mark) {
        omRelease(Self, m, true);
        continue;       // Interference -- just retry
      }
      markOop dmw = mark->displaced_mark_helper();
      assert(dmw->is_neutral(), "invariant");

      // Setup monitor fields to proper values -- prepare the monitor
      m->set_header(dmw);
      m->set_owner(mark->locker());
      m->set_object(object);
      // Set the lock object's header to the heavyweight-locked state
      guarantee(object->mark() == markOopDesc::INFLATING(), "invariant");
      object->release_set_mark(markOopDesc::encode(m));

      OM_PERFDATA_OP(Inflations, inc());
      if (log_is_enabled(Debug, monitorinflation)) {
        if (object->is_instance()) {
          ResourceMark rm;
          log_debug(monitorinflation)("Inflating object " INTPTR_FORMAT " , mark "
                                      INTPTR_FORMAT " , type %s",
                                      p2i(object), p2i(object->mark()),
                                      object->klass()->external_name());
        }
      }
      if (event.should_commit()) {
        post_monitor_inflate_event(&event, object, cause);
      }
      return m;
    }
    // ... the unlocked (neutral) case is omitted here

There is a for loop in inflate, mainly to handle multiple threads calling inflate at the same time. Once the lock object's header has been upgraded to the heavyweight state, execution enters ObjectMonitor::enter to acquire the lock:

void ATTR ObjectMonitor::enter(TRAPS) {
  Thread* const Self = THREAD;
  // CAS tries to set _owner to the current thread
  void* cur = Atomic::cmpxchg(Self, &_owner, (void*)NULL);
  if (cur == NULL) {
    // The CAS succeeded: the monitor was not owned
    assert(_recursions == 0, "invariant");
    assert(_owner == Self, "invariant");
    return;
  }
  // The CAS failed, but _owner is already the current thread: lock reentry
  if (cur == Self) {
    _recursions++; // increment the reentry count
    return;
  }
  // cur is a Lock Record pointer in the current thread's stack (first entry after
  // inflation by the former lightweight-lock owner): take ownership
  if (Self->is_lock_owned((address)cur)) {
    assert(_recursions == 0, "internal state error");
    _recursions = 1;
    _owner = Self;
    return;
  }
  // ... omitted: fall through to ObjectMonitor::EnterI -- spinning, queueing, parking
}

Before resorting to operating-system synchronization, the monitor first tries a CAS on the owner field and then spins to acquire the lock; only if that fails does ObjectMonitor::EnterI block the thread. The acquisition procedure:

When multiple threads access the same synchronized code, each tries a CAS to set _owner to itself and owns the object lock if it succeeds. Otherwise it spins a bounded number of times, avoiding the operating-system mutex and thread blocking as much as possible.

If the lock still cannot be acquired, the thread is wrapped in an ObjectWaiter object, inserted at the head of the _cxq queue, and suspended via the park function. On Linux, park ultimately calls pthread_cond_wait from glibc; after being woken, the thread tries to acquire the lock again.
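The spin-then-park idea can be pictured with a user-level sketch (illustrative only: the real monitor's spin is adaptive, it maintains the _cxq/_EntryList queues, and it parks on an OS condition variable rather than a timed park):

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

public class SpinThenParkLock {
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    public void lock() {
        Thread self = Thread.currentThread();
        // Bounded spinning, like the monitor's CAS-and-spin fast path
        for (int i = 0; i < 100; i++) {
            if (owner.compareAndSet(null, self)) {
                return; // acquired without touching the OS
            }
        }
        // Spinning failed: fall back to parking, the analogue of pthread_cond_wait
        while (!owner.compareAndSet(null, self)) {
            LockSupport.parkNanos(1_000);
        }
    }

    public void unlock() {
        owner.set(null); // no wake-up queue here, unlike the real monitor
    }
}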

Release the lock:

void ObjectMonitor::exit(bool not_suspended, TRAPS) {
  Thread* const Self = THREAD;
  // If _owner is not the current thread
  if (THREAD != _owner) {
    // The current thread held the lightweight lock before inflation and has not
    // called enter() since: _owner is still a pointer to its Lock Record
    if (THREAD->is_lock_owned((address) _owner)) {
      // Transmute _owner from a BasicLock pointer to a Thread address.
      // We don't need to hold _mutex for this transition.
      // Non-null to Non-null is safe as long as all readers can
      // tolerate either flavor.
      assert(_recursions == 0, "invariant");
      _owner = THREAD;
      _recursions = 0;
    } else {
      assert(false, "Non-balanced monitor enter/exit! Likely JNI locking");
      return;
    }
  }
  // Reentrant exit: just decrement the reentry count
  if (_recursions != 0) {
    _recursions--;        // this is simple recursive enter
    return;
  }

  // Invariant: after setting Responsible=null an thread must execute
  // a MEMBAR or other serializing instruction before fetching EntryList|cxq.
  _Responsible = NULL;

#if INCLUDE_JFR
  // get the owner's thread id for the MonitorEnter event
  // if it is enabled and the thread isn't suspended
  if (not_suspended && EventJavaMonitorEnter::is_enabled()) {
    _previous_owner_tid = JFR_THREAD_ID(Self);
  }
#endif

  for (;;) {
    assert(THREAD == _owner, "invariant");
    // Release the lock first; a thread arriving at the block may acquire it immediately
    OrderAccess::release_store(&_owner, (void*)NULL);   // drop the lock
    OrderAccess::storeload();                           // See if we need to wake a successor
    // If no thread is waiting, or a successor (heir presumptive) already exists, we are done
    if ((intptr_t(_EntryList) | intptr_t(_cxq)) == 0 || _succ != NULL) {
      return;
    }
    // To hand off to a successor, the lock must be reacquired first
    if (!Atomic::replace_if_null(THREAD, &_owner)) {
      return;
    }

    guarantee(_owner == THREAD, "invariant");

    ObjectWaiter* w = NULL;
    // Threads in _EntryList have priority: wake the thread at its head
    w = _EntryList;
    if (w != NULL) {
      assert(w->TState == ObjectWaiter::TS_ENTER, "invariant");
      ExitEpilog(Self, w);
      return;
    }

    // If _cxq is also empty, just re-loop to exit
    w = _cxq;
    if (w == NULL) continue;

    // Detach the whole _cxq in one batch by CASing it to NULL
    for (;;) {
      assert(w != NULL, "Invariant");
      ObjectWaiter* u = Atomic::cmpxchg((ObjectWaiter*)NULL, &_cxq, w);
      if (u == w) break;
      w = u;
    }

    assert(w != NULL, "invariant");
    assert(_EntryList == NULL, "invariant");

    _EntryList = w;
    ObjectWaiter* q = NULL;
    ObjectWaiter* p;
    // Move the detached elements from _cxq into _EntryList
    for (p = w; p != NULL; p = p->_next) {
      guarantee(p->TState == ObjectWaiter::TS_CXQ, "Invariant");
      p->TState = ObjectWaiter::TS_ENTER;
      p->_prev = q;
      q = p;
    }
    // _succ != NULL means an heir presumptive already exists, so the current
    // thread need not wake anyone, reducing context switches
    if (_succ != NULL) continue;

    w = _EntryList;
    if (w != NULL) {
      // Wake the first element of _EntryList
      guarantee(w->TState == ObjectWaiter::TS_ENTER, "invariant");
      ExitEpilog(Self, w);
      return;
    }
  }
}

When the lock is released, the default policy is: if _EntryList is empty, move the _cxq elements into _EntryList in their original order and wake the first thread. Since newly arriving threads are pushed onto the head of _cxq, this means that when _EntryList is empty, the most recently queued thread gets woken first. If the thread holding the monitor calls wait(), it releases the monitor, sets _owner back to null, decrements _count, and enters the _WaitSet to await a wake-up. When a thread finishes its synchronized block, it likewise releases the monitor and resets the fields so that another thread can acquire it.
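At the Java level this machinery is what backs the familiar wait/notify pattern (a minimal sketch): wait() releases the monitor and moves the owner into _WaitSet; notify() moves one waiter back so it can compete for the lock through _cxq/_EntryList again.

public class WaitNotifyDemo {
    private final Object lock = new Object();
    private boolean ready = false;

    public void consumer() throws InterruptedException {
        synchronized (lock) {        // monitorenter: compete via _cxq/_EntryList
            while (!ready) {
                lock.wait();         // release the monitor, enter _WaitSet
            }
            System.out.println("consumed");
        }                            // monitorexit: wake a successor if needed
    }

    public void producer() {
        synchronized (lock) {
            ready = true;
            lock.notify();           // move one waiter out of _WaitSet
        }
    }
}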

Lock elimination

lock elision

public void add(String str1, String str2){
    StringBuffer sb = new StringBuffer();
    sb.append(str1).append(str2);
}

We all know StringBuffer is thread-safe because its key methods are synchronized. But in the code above, the reference sb is used only inside the add method and can never be seen by another thread (it is a local variable, and the stack is thread-private). So sb can never be a shared resource, and the JIT, using escape analysis, automatically removes the locking inside the StringBuffer object.

Lock coarsening

lock coarsening

public String test(String str){
    int i = 0;
    StringBuffer sb = new StringBuffer();
    while(i < 100){
        sb.append(str);
        i++;
    }
    return sb.toString();
}

When the JVM detects that a sequence of operations repeatedly locks the same object (the loop above appends 100 times, which without coarsening means 100 lock/unlock pairs), it coarsens the scope of the lock to outside the sequence (e.g. outside the while loop), so the whole sequence needs only one lock.
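Conceptually, the coarsened version looks like the following source-level picture (illustrative: the JIT performs this transformation on compiled code, not on your source, and the inner append calls then reenter an already-held lock cheaply):

public String testCoarsened(String str) {
    int i = 0;
    StringBuffer sb = new StringBuffer();
    synchronized (sb) {        // one lock around the whole sequence of appends
        while (i < 100) {
            sb.append(str);    // conceptually, no per-call lock/unlock here
            i++;
        }
    }
    return sb.toString();
}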

Underlying implementation

Use HSDIS to view the generated assembly:

-Xcomp: makes the JVM run in compiled mode, compiling all bytecode to native code on first use

-XX:+UnlockDiagnosticVMOptions: unlocks the diagnostic options

-XX:+PrintAssembly: prints the disassembled instructions

The final implementation is the lock cmpxchg instruction.

LOCK

On a multiprocessor, LOCK gives the executing instruction exclusive use of shared memory.

Its effect is to flush the current processor's corresponding cache line to memory and invalidate the corresponding cache lines of other processors.

It also acts as a memory barrier that instructions cannot be reordered across.

Because synchronized ultimately relies on the lock-prefixed instruction, it naturally provides visibility, ordering, and atomicity.
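For example, because monitor exit publishes writes and monitor enter discards stale cached reads, the following flag hand-off is safe without volatile (a minimal sketch):

public class VisibilityDemo {
    private boolean stopped = false;

    public synchronized void stop() {
        stopped = true;          // published at monitorexit (lock-prefixed write path)
    }

    public synchronized boolean isStopped() {
        return stopped;          // monitorenter guarantees the latest value is seen
    }
}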

END

Synchronized in Java has three forms: biased locks, lightweight locks, and heavyweight locks, corresponding respectively to a lock held by only one thread, a lock held alternately by different threads without contention, and a lock contended by multiple threads. When the current level's conditions no longer hold, the lock is upgraded in the order biased -> lightweight -> heavyweight.