synchronized is implemented on top of two foundations: the Java object header and the Monitor. The virtual machine specification defines the in-memory layout of an object as three parts:

  • Object header
  • Instance data
  • Alignment padding

The implementation of synchronized is hidden in the object header. The object header consists of two important parts:

  • Mark Word: by default stores the object's hashCode, GC generational age, lock type, lock flag bits, and similar information; it is the key to implementing lightweight and biased locks

  • Class Metadata Address: a type pointer to the object's class metadata, which the JVM uses to determine which class the object is an instance of

The diagram below shows the composition of the Mark Word on a 32-bit machine. Before Java 6, synchronized relied on heavyweight locking (lock flag bits 10). Java 6 optimized synchronized by adding lightweight and biased locks.



  ObjectMonitor() {
    _header       = NULL;
    _count        = 0;      // number of threads holding or re-entering the monitor
    _waiters      = 0;
    _recursions   = 0;      // re-entry count of the owner thread
    _object       = NULL;
    _owner        = NULL;   // thread that currently owns the monitor
    _WaitSet      = NULL;   // threads that called wait() (the "wait pool")
    _WaitSetLock  = 0;
    _Responsible  = NULL;
    _succ         = NULL;
    _cxq          = NULL;
    FreeNext      = NULL;
    _EntryList    = NULL;   // threads blocked waiting for the monitor (the "lock pool")
    _SpinFreq     = 0;
    _SpinClock    = 0;
    OwnerIsThread = 0;
    _previous_owner_tid = 0;
  }

Above is the initialization code for the ObjectMonitor class. You can see that there are _WaitSet and _EntryList, which correspond to the wait pool and lock pool in Java

  • _owner is the thread that holds the ObjectMonitor. When multiple threads access an object's synchronized code at the same time, they first enter _EntryList. Once a thread acquires the object's Monitor, the Monitor's owner is set to that thread and the Monitor's _count is incremented by 1
  • When the thread holding the Monitor finishes execution, or calls wait(), it releases the Monitor: owner is reset to NULL and _count is decremented by 1. A thread that called wait() enters _WaitSet and waits to be woken up

Every Java object can be associated with a Monitor through its object header, and synchronized locks by acquiring that Monitor, which is why any object in Java can be used as a lock.
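To make _EntryList and _WaitSet concrete, here is a minimal wait/notify sketch (the class, method, and variable names are mine, not from the source):

```java
public class MonitorDemo {
    private static final Object lock = new Object(); // any object can serve as a lock

    static String demo() {
        StringBuilder log = new StringBuilder();
        final boolean[] ready = {false};

        Thread waiter = new Thread(() -> {
            synchronized (lock) {          // contending threads queue in _EntryList
                while (!ready[0]) {
                    try {
                        lock.wait();       // releases the Monitor and parks in _WaitSet
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                log.append("woken");
            }
        });
        waiter.start();

        synchronized (lock) {
            ready[0] = true;
            lock.notify();                 // moves a waiter from _WaitSet back to _EntryList
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());        // prints "woken"
    }
}
```

Whichever thread wins the monitor first, the waiter ends up past wait() and appends "woken", so the sketch is deterministic.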

The following analyzes the synchronized keyword at the bytecode level.

  public class SyncBlockAndMethod {
      public void syncTask() {
          synchronized (this) {
              System.out.println("Hello");
          }
      }

      public synchronized void syncMethod() {
          System.out.println("Hello Again");
      }
  }

The class defines a synchronized block and a synchronized method. Compile it to bytecode with javac:

javac SyncBlockAndMethod.java

Then use javap to view the generated bytecode:

javap -verbose SyncBlockAndMethod

First, look at the bytecode for the synchronized block:

  public void syncTask();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=3, args_size=1
         0: aload_0
         1: dup
         2: astore_1
         3: monitorenter
         4: getstatic     #2  // Field java/lang/System.out:Ljava/io/PrintStream;
         7: ldc           #3  // String Hello
         9: invokevirtual #4  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        12: aload_1
        13: monitorexit
        14: goto          22
        17: astore_2
        18: aload_1
        19: monitorexit
        ...

From the bytecode we can see that a synchronized block compiles primarily to the monitorenter and monitorexit instructions. monitorenter marks the entry to the synchronized block, where the lock is acquired; monitorexit marks the exit, where the lock is released.

Observant readers will notice one monitorenter instruction but two monitorexit instructions. The first monitorexit pairs with monitorenter on the normal exit path.

To guarantee that the lock is released even when the synchronized block throws, the compiler automatically generates an exception handler that releases the lock; this corresponds to the second monitorexit instruction in the bytecode.
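As an analogy (not the real implementation), the two monitorexit paths behave like a try/finally that unlocks on both the normal and the exception path. ReentrantLock is used here only because it lets us observe the lock state; the class and method names are mine:

```java
import java.util.concurrent.locks.ReentrantLock;

public class TwoExitPaths {
    static final ReentrantLock lock = new ReentrantLock();

    static String demo() {
        try {
            lock.lock();               // ~ monitorenter
            try {
                throw new RuntimeException("boom");
            } finally {
                lock.unlock();         // ~ the second (exception-path) monitorexit
            }
        } catch (RuntimeException ignored) {
            // the exception escaped the block, but the lock was still released
        }
        return lock.isLocked() ? "leaked" : "released";
    }

    public static void main(String[] args) {
        System.out.println(demo());    // prints "released"
    }
}
```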

Now look at the bytecode for the synchronized method:

  public synchronized void syncMethod();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_SYNCHRONIZED
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #2  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #5  // String Hello Again
         5: invokevirtual #4  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
         8: return
      LineNumberTable:
        line 13: 0
        line 15: 8

No monitorenter or monitorexit instructions appear here, because for a synchronized method the monitor handling is implicit: the ACC_SYNCHRONIZED flag is added to the method's flags, and that is how synchronized methods are implemented. A thread acquires the monitor when it enters the method and releases it when the method completes or throws an exception.

What is reentrancy?

A mutex blocks a thread that tries to access a critical resource whose object lock is held by another thread. When the thread that already holds the lock requests it again, however, the request succeeds; this is reentrancy. For example, a thread that has acquired a lock and entered a synchronized method can successfully call another synchronized method on that object from within the method.
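The example in the text can be sketched as follows (class and method names are mine): outer() holds the object's monitor and inner() re-acquires it on the same thread without blocking.

```java
public class ReentrantDemo {
    private int depth = 0;

    public synchronized int outer() {  // acquires this object's monitor
        depth++;
        return inner();                // same thread requests the same monitor again
    }

    private synchronized int inner() { // succeeds without blocking: reentrant
        depth++;
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(new ReentrantDemo().outer()); // prints 2
    }
}
```

If the monitor were not reentrant, the call to inner() would deadlock against the lock the thread itself already holds.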

In early versions, synchronized was a heavyweight lock: the Monitor relied on the underlying operating system's Mutex Lock, so blocking and waking a thread required switching between user mode and kernel mode, which is expensive. After Java 6, the JVM (HotSpot) heavily optimized synchronized, and its performance improved greatly. The optimizations include the following:

  • Adaptive Spinning (adaptive spin locks)
  • Lock Elimination
  • Lock Coarsening
  • Lightweight Locking
  • Biased Locking

These optimizations are designed to improve application efficiency by sharing data more efficiently between threads and resolving contention issues

Spin locks

In many cases, shared data stays locked for only a very short time, too short to be worth suspending and resuming a thread. On a multi-core CPU, a thread that fails to acquire the lock can instead execute a busy loop (spin) and wait for the lock to be released, rather than giving up the CPU; this is a spin lock. Spin locks were introduced in Java 4 but disabled by default, and only became enabled by default in Java 6. If other threads hold the lock only briefly, spinning acquires the lock quickly and performs well; but if the lock is held for a long time, the spinning loop wastes CPU. So, after a certain number of spins, the thread falls back to being suspended in the traditional way; the spin count could be configured with the -XX:PreBlockSpin parameter. Because a good spin count is hard to choose across scenarios, adaptive spin locks were introduced.
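To make the busy-loop idea concrete, here is a hand-rolled spin lock built on AtomicBoolean. This is a toy illustration of spinning, not how the JVM implements it; the class and method names are mine.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLockDemo {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private int counter = 0;

    void lock() {
        // busy-wait instead of suspending the thread
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait();           // CPU spin hint, Java 9+
        }
    }

    void unlock() {
        locked.set(false);
    }

    static int run() {
        SpinLockDemo d = new SpinLockDemo();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                d.lock();
                try {
                    d.counter++;           // critical section
                } finally {
                    d.unlock();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return d.counter;
    }

    public static void main(String[] args) {
        System.out.println(run());         // prints 20000
    }
}
```

The AtomicBoolean CAS gives the lock release/acquire semantics, so the two threads' increments never race and the counter totals exactly 20000.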

Adaptive spin lock

In adaptive spin locks, the number of spins is no longer fixed, but is determined by the last spin time on the same lock and the state of the lock owner. If, on a lock object, a thread has just acquired the lock through spin, and the thread holding the lock is running, the JVM assumes that it has a high chance of acquiring the lock through spin and will increase the spin count the next time another thread obtains the lock. If, on the other hand, a lock is rarely spun successfully, then the spin process is skipped to avoid wasting processor resources when the lock is subsequently acquired.

Lock elimination

Lock elimination is another lock optimization. During JIT compilation, the JVM scans the running context and removes locks that cannot possibly be contended, saving the time of meaningless lock requests and improving performance. In the following example, StringBuffer is thread-safe and its append() method is synchronized, but the sb object is used only inside the add() method: it is a local variable that can never be referenced by another thread, so it is not a shared resource, and the JVM automatically eliminates the lock on sb.append().

  public class StringBufferWithoutSync {
      public void add(String str1, String str2) {
          // StringBuffer is thread-safe and append() is synchronized, but sb is a
          // local variable that never escapes add(), so no other thread can see it
          // and the JVM can eliminate the lock on sb.append().
          StringBuffer sb = new StringBuffer();
          sb.append(str1).append(str2);
      }

      public static void main(String[] args) {
          StringBufferWithoutSync withoutSync = new StringBufferWithoutSync();
          for (int i = 0; i < 100; i++) {
              withoutSync.add("a", "b");
          }
      }
  }

Lock coarsening

Lock coarsening avoids repeated locking and unlocking by expanding the scope of the lock.

  public class CoarseSync {
      public static String copyString100Times(String target) {
          StringBuffer sb = new StringBuffer();
          for (int i = 0; i < 100; i++) {
              // append() is synchronized; the JVM detects the repeated lock and
              // unlock in the loop and coarsens the lock to outside the loop.
              sb.append(target);
          }
          return sb.toString();
      }
  }

In the code above, append() is synchronized, so each call would request the lock. The JVM detects that locking and unlocking repeatedly inside the loop is wasteful, coarsens the lock to outside the loop, and locks only once, which improves performance.
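A hand-coarsened sketch of what the JVM effectively does (the explicit outer synchronized block is my own illustration; the JIT performs this transformation internally, without source changes):

```java
public class CoarsenedSketch {
    public static String copyString100Times(String target) {
        StringBuffer sb = new StringBuffer();
        synchronized (sb) {                // lock once, outside the loop
            for (int i = 0; i < 100; i++) {
                sb.append(target);         // append() still locks, but re-entering
            }                              // an already-held monitor is cheap
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(copyString100Times("ab").length()); // prints 200
    }
}
```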

The four states of synchronized

No lock, biased lock, lightweight lock, heavyweight lock. The lock is gradually upgraded as contention grows, and inflation moves in one direction: no lock -> biased lock -> lightweight lock -> heavyweight lock.

Lock degradation does occur in certain cases: when the JVM reaches a Safe Point, it checks for idle Monitors and attempts to deflate them.

In the lock-free state, no thread holds the lock: threads compete to modify the shared resource directly, typically via CAS, and only one succeeds at a time while the others retry.

Biased locking exists to reduce the cost of lock acquisition for a single thread. In most cases a lock is not only uncontended but is acquired repeatedly by the same thread, and biased locking makes those repeated acquisitions cheap.

The core idea of biased locking: once a thread acquires the lock, the lock enters biased mode and the Mark Word changes to the biased-lock layout. When the same thread requests the lock again, no synchronization operation is needed; acquiring the lock only requires checking that the Mark Word's lock flag bits still indicate a biased lock and that the thread ID stored in the Mark Word equals the current thread's ID, which avoids a great number of lock requests. On the first acquisition, the ID of the biased thread is stored in the Mark Word and in a lock record in the stack frame; when that thread enters and exits the block again, no CAS is needed to lock or unlock, improving performance. With no lock contention, biased locking is an effective optimization; under fierce multi-threaded contention it loses its benefit, and the biased lock is upgraded to a lightweight lock.
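A toy model of the biased fast path (greatly simplified and entirely my own naming, not the HotSpot implementation): after the owner's thread ID is installed once, re-acquisition is just an ID comparison with no atomic operation.

```java
import java.util.concurrent.atomic.AtomicLong;

public class BiasedLockModel {
    private static final long UNBIASED = -1;
    private final AtomicLong biasedOwner = new AtomicLong(UNBIASED); // stands in for the Mark Word's thread ID field
    private int casCount = 0;

    boolean acquire() {
        long me = Thread.currentThread().getId();
        if (biasedOwner.get() == me) {
            return true;                       // biased fast path: a plain comparison, no CAS
        }
        if (biasedOwner.compareAndSet(UNBIASED, me)) {
            casCount++;                        // one-time install of the owner's thread ID
            return true;
        }
        return false;                          // contended: the real JVM would revoke the bias
    }

    static int demo() {
        BiasedLockModel m = new BiasedLockModel();
        for (int i = 0; i < 1_000; i++) {
            m.acquire();                       // the same thread re-enters 1000 times
        }
        return m.casCount;
    }

    public static void main(String[] args) {
        System.out.println(demo());            // prints 1
    }
}
```

One thousand acquisitions cost only a single atomic install; every later acquisition is the cheap equality check, which is the saving biased locking aims for.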

Lightweight lock

A lightweight lock is upgraded from a biased lock: the lock runs as a biased lock while a single thread enters the synchronized block, and is upgraded to a lightweight lock when a second thread joins the contention. Lightweight locks suit the case where threads execute the synchronized block alternately; if threads contend for the same lock at the same time, the lightweight lock inflates to a heavyweight lock.

Locking process:

  1. When a thread enters the synchronized block, if the lock object is in the lock-free state (lock flag bits "01"), the VM first creates a space named Lock Record in the current thread's stack frame, used to store a copy of the lock object's current Mark Word (officially called the Displaced Mark Word).
  2. The VM copies the Mark Word from the object header into the Lock Record.
  3. After the copy succeeds, the VM uses CAS to try to update the object's Mark Word to a pointer to the Lock Record, and sets the owner pointer in the Lock Record to the object's Mark Word. If the CAS succeeds, go to step 4; if it fails, the lock is contended and inflation begins.
  4. The thread now owns the lock on the object, and the object's Mark Word lock flag bits are set to "00", indicating a lightweight-locked state. The state of the thread's stack frame and the object header at this point is shown in the diagram below:


Now let’s talk about the process of lightweight lock unlocking.

Unlocking process:

  1. Use a CAS operation to try to replace the object's current Mark Word with the Displaced Mark Word copied into the thread's stack frame
  2. If the replacement succeeds, the synchronization process is complete
  3. If the replacement fails, another thread has attempted to acquire the lock in the meantime (and the lock has inflated to a heavyweight lock); while releasing the lock, the suspended threads must also be woken up
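The lock/unlock CAS handshake above can be sketched with an AtomicReference standing in for the Mark Word. This is a toy model under my own naming, not the HotSpot implementation:

```java
import java.util.concurrent.atomic.AtomicReference;

public class LightweightLockModel {
    public static boolean demo() {
        // the object's Mark Word, initially holding its unlocked-state bits
        AtomicReference<Object> markWord = new AtomicReference<>("unlocked|hash|01");

        Object displaced = markWord.get();      // copy into the Lock Record (Displaced Mark Word)
        Object lockRecordPtr = new Object();    // stands in for the stack-frame address

        // lock: CAS the header from its displaced value to a Lock Record pointer
        boolean locked = markWord.compareAndSet(displaced, lockRecordPtr);

        // unlock: CAS the Displaced Mark Word back into the header; this fails
        // if the header changed in the meantime (i.e. the lock inflated)
        boolean unlocked = markWord.compareAndSet(lockRecordPtr, displaced);

        return locked && unlocked;
    }

    public static void main(String[] args) {
        System.out.println(demo());             // prints true
    }
}
```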

It needs to be explained why the Displaced Mark Word must be written back to the object's Mark Word with a CAS when the lock is released. This can be understood through the memory semantics of locks.

Memory semantics of locks

  • When a thread releases a lock, the Java memory model flushes the shared variables in that thread's local memory to main memory
  • When a thread acquires a lock, the Java memory model invalidates that thread's local memory, so critical-section code protected by the monitor must read shared variables from main memory

In other words, when thread A releases A lock, thread A actually sends A message to the next thread that will acquire the lock, which is thread A’s modification of the shared variable. When thread B acquires a lock, thread B actually receives a message from the thread that previously held the lock, which is a modification to the shared variable before the lock is released.
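A small sketch of these release/acquire semantics (the thread roles follow the text; note that join() also orders the threads here, so this only illustrates the idea rather than proving it):

```java
public class LockMemorySemantics {
    private static final Object lock = new Object();
    private static int shared = 0;               // deliberately not volatile

    static int demo() {
        Thread a = new Thread(() -> {
            synchronized (lock) {
                shared = 42;                     // release: the write is flushed to main memory
            }
        });
        a.start();
        try {
            a.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        synchronized (lock) {                    // acquire: local memory is invalidated,
            return shared;                       // so the value is re-read from main memory
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());              // prints 42
    }
}
```

Thread A's write inside the synchronized block happens-before the read inside the later synchronized block on the same lock, which is exactly the "message passing" described above.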

Lock comparison:

  • Biased lock — Advantages: locking and unlocking need no CAS operation and add almost no overhead (within nanoseconds of calling an unsynchronized method). Disadvantages: if threads contend for the lock, there is the extra cost of revoking the bias. Scenario: only one thread ever accesses the synchronized block or method.
  • Lightweight lock — Advantages: competing threads do not block, so response is fast. Disadvantages: a thread that cannot obtain the lock spins, consuming CPU. Scenario: threads execute the synchronized block or method alternately.
  • Heavyweight lock — Advantages: contending threads do not spin and do not consume CPU. Disadvantages: threads block and response time is slow; frequent lock acquisition and release under heavy multithreading is costly. Scenario: throughput-oriented workloads, or synchronized blocks/methods that execute for a long time.
  • Reference: lightweight lock locking & unlocking process (gorden5566.com/post/1019.h…)





If you find this article helpful, please like it and I will be more motivated to write good articles.

My article will be first published on the official account, please scan the code to follow my official account Zhang Xian.