
Hello, I’m Kanshan (看山).

Synchronized is Java’s built-in lock implementation: a single keyword used to lock a shared resource. It has three usage scenarios, and the lock object differs by scenario:

  1. Instance method: the lock object is the current instance (this)
  2. Static method: the lock object is the Class object of the class
  3. Synchronized block: the lock object is the object named in the synchronized parentheses
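The three forms can be sketched as follows (the class and method names are made up for illustration; each method uses Thread.holdsLock to confirm which monitor it is holding):

```java
public class SyncScenarios {
    // 1. Instance method: the lock object is the current instance (this)
    public synchronized boolean instanceMethod() {
        return Thread.holdsLock(this);
    }

    // 2. Static method: the lock object is the Class object of the class
    public static synchronized boolean staticMethod() {
        return Thread.holdsLock(SyncScenarios.class);
    }

    private final Object lock = new Object();

    // 3. Synchronized block: the lock object is whatever is in parentheses
    public boolean blockMethod() {
        synchronized (lock) {
            return Thread.holdsLock(lock);
        }
    }
}
```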

Implementation principle of synchronized

Synchronized implements its locking mechanism by entering and exiting a monitor object. A synchronized block compiles to a pair of monitorenter/monitorexit instructions: after compilation, the monitorenter instruction is inserted at the start of the block, and monitorexit instructions are inserted at the normal exit and at the exception exit. The JVM ensures that every monitorenter is paired with a monitorexit. Every object has a monitor associated with it, and the object is locked if and only if its monitor is held.

When monitorenter executes, it first tries to acquire the object’s lock. If the object is not locked, or the current thread already holds the lock, the lock counter is incremented by one. Correspondingly, when monitorexit executes, the counter is decremented by one; when it drops to zero, the lock is released. If monitorenter fails to acquire the lock, the current thread blocks until the object’s lock is released.
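The per-object lock counter is what makes synchronized reentrant: the same thread can enter nested blocks on the same monitor, and the lock is only released when the counter returns to zero. A minimal illustration (the class and its recursive method are invented for this example):

```java
public class ReentrantDemo {
    private int depth = 0;

    // Each nested entry increments the monitor's counter instead of
    // blocking, so a thread can re-enter a lock it already holds.
    public synchronized int enter(int times) {
        if (times == 0) {
            return depth;
        }
        depth++;
        synchronized (this) { // re-acquire the same monitor: counter + 1
            return enter(times - 1);
        }
    }
}
```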

Prior to JDK 6, the monitor was implemented on top of the operating system’s mutex lock. Blocking a thread meant switching between user mode and kernel mode, so synchronized was an undifferentiated heavyweight lock.

Later, synchronized was optimized to spin before asking the operating system to block the thread, avoiding the user/kernel mode switch where possible, and three lock states were implemented: biased locking, lightweight locking, and heavyweight locking. Since JDK 6, synchronized performance has improved significantly and is comparable to ReentrantLock, although ReentrantLock remains more flexible to use.

Adaptive Spinning

The biggest performance cost in synchronized is blocking: suspending and resuming a thread both require help from the operating system, and the state transitions consume a lot of CPU time.

In most applications, shared data is locked for only a short time, which is not worth the cost of suspending and resuming threads. Moreover, most processors today are multi-core, so we can ask the next thread to wait a moment without giving up the CPU: as soon as the first thread releases the lock, the waiting thread acquires it and executes. This is called spinning: the thread runs a busy loop, “turning around in place” to check whether the lock has been released. Once it is released, the thread grabs it; until then, it keeps spinning.

Spin locking was introduced in JDK 1.4.2 (enabled with the -XX:+UseSpinning parameter) and is on by default since JDK 1.6. Spinning is not a substitute for blocking: while spin-waiting avoids the overhead of thread switching, it consumes CPU time, which is fine if the lock is held briefly and a waste of performance if it is not. So adaptive spinning was introduced in JDK 1.6: if a spin-wait on the same lock object has just succeeded, and the thread holding the lock is running, the JVM assumes the spin is likely to succeed again and lets it last longer. Conversely, if spinning rarely succeeds for a lock, the spin phase may simply be skipped to avoid wasting CPU resources.
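The idea behind spinning can be sketched with a toy spin lock built on CAS. This is only an illustration of busy-waiting, not how HotSpot implements its spinning:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Busy-loop ("turn around in place") until the CAS succeeds,
        // instead of asking the OS to suspend the thread. A real
        // implementation would bound the spins and fall back to blocking.
        while (!locked.compareAndSet(false, true)) {
            // spin
        }
    }

    public void unlock() {
        locked.set(false);
    }

    public boolean isLocked() {
        return locked.get();
    }
}
```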

Lock escalation

Java object header

The lock used by synchronized lives in the Java object header. The data stored in the header’s Mark Word changes with the lock flag bits, as follows:

Java object header Mark Word

Biased Locking

In most cases locks are not contended by multiple threads and are always acquired repeatedly by the same thread. Biased locking was introduced to make lock acquisition cheaper in this case.

When a thread first accesses a synchronized block and acquires the lock, it stores the lock-biased thread ID in the object header’s Mark Word and in the lock record in its stack frame. On later entries and exits of the synchronized block, the thread no longer needs CAS operations to lock and unlock; it simply tests whether the Mark Word in the object header stores a bias pointing to the current thread. Biased locking was introduced to minimize the lightweight-lock execution path when there is no multi-threaded contention, because acquiring and releasing a lightweight lock requires multiple CAS atomic instructions, while a biased lock needs only a single CAS to install the thread ID. (Since a biased lock must be revoked as soon as contention occurs, the performance cost of revocation must be less than the cost of the CAS instructions saved.)

Biased lock acquisition

  1. When the lock object is first acquired by a thread, the flag bits of the object header are set to 01 and the bias bit to 1, indicating that the lock object has entered biased mode.

  2. Test whether the thread ID in the Mark Word points to the current thread: if so, execute the synchronized block; if not, go to step 3.

  3. Use a CAS operation to record the acquiring thread’s ID in the object’s Mark Word. If it succeeds, execute the synchronized block. If it fails, another thread already holds the object’s biased lock, and the bias must be revoked (step 4).

  4. When a global safepoint is reached (no bytecode is executing), the thread holding the biased lock is paused and its state is checked. If that thread has terminated, the object header is reset to the lock-free state (flag bits 01) and the object can then be rebiased to the new thread. If the thread is still alive, the biased lock is revoked and upgraded to the lightweight-lock state (flag bits 00): the lightweight lock is now held by the thread that originally held the bias, which continues executing its synchronized code, while the competing thread spins, waiting for the lightweight lock.
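The biased fast path can be modeled with a toy class: one CAS installs the owner thread’s ID, and every later entry by that thread is just a plain comparison. This models the idea only; the real bias lives in the Mark Word, and revocation requires a safepoint:

```java
import java.util.concurrent.atomic.AtomicLong;

public class BiasedSketch {
    private static final long UNBIASED = -1L;
    // Models the thread-ID field of a biased Mark Word.
    private final AtomicLong biasedThreadId = new AtomicLong(UNBIASED);

    /** Returns true if the current thread may enter via the biased path. */
    public boolean enter() {
        long self = Thread.currentThread().getId();
        if (biasedThreadId.get() == self) {
            return true; // fast path: no CAS, just a comparison
        }
        // First acquisition: a single CAS installs our thread ID.
        if (biasedThreadId.compareAndSet(UNBIASED, self)) {
            return true;
        }
        return false; // another thread owns the bias: revoke and upgrade
    }
}
```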

Biased lock release

Biased locks use a lazy release mechanism: a biased lock is released only when contention occurs. The release process is step 4 above and is not repeated here.

Disabling biased locking

Biased locking is not suitable for every application: revocation is a heavy operation, and biasing only helps when there are many synchronized blocks with no real contention. The technique has been controversial; some even argue that heavy use of concurrent libraries usually means you do not need biased locking.

So if you are sure the locks in your application are usually contended, you can turn off biased locking with the JVM argument -XX:-UseBiasedLocking, and the application will go straight to the lightweight-lock state.

Lightweight Locking

Lightweight locks are not meant to replace heavyweight locks; they exist to reduce the performance cost of traditional heavyweight locks, which use operating-system mutexes, when there is no multi-threaded contention.

Lightweight lock acquisition

  1. If the synchronization object is in the lock-free state (the lock flag bits are 01 and the bias bit is 0), the virtual machine first creates a space called a Lock Record in the current thread’s stack frame, used to store a copy of the lock object’s current Mark Word, officially called the Displaced Mark Word. The state of the thread stack and object header at this point is shown below:

  2. Copy the Mark Word from the object header into the Lock Record.

  3. After the copy succeeds, the VM uses a CAS operation to try to update the object’s Mark Word to a pointer to the Lock Record, and sets the owner pointer in the Lock Record to the object’s Mark Word.

  4. If that succeeds, the current thread holds the object’s lock, the lock flag bits in the object header are set to 00 to indicate the lightweight-lock state, and the synchronized block executes. The state of the thread stack and object header is shown below:

  5. If the update fails, the VM checks whether the object header’s Mark Word points into the current thread’s stack frame. If so, the current thread already holds the lock and executes the synchronized block directly.

  6. If not, multiple threads are competing for the lock. If there is only one waiting thread, it spins to try to acquire the lock. When the spin exceeds a certain number of iterations, or a third thread contends for the lock, the lightweight lock inflates to a heavyweight lock: the lock flag bits change to 10, the Mark Word stores a pointer to the heavyweight lock (the mutex), and the threads waiting for the lock enter the blocked state.
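The steps above can be modeled with a small sketch in which the Mark Word is an AtomicReference: null means unlocked, otherwise it points at the owner’s lock record (simplified here to the owning Thread; real reentrancy uses an extra lock record with a zeroed Displaced Mark Word):

```java
import java.util.concurrent.atomic.AtomicReference;

public class LightweightSketch {
    // Models the Mark Word: null = unlocked, otherwise a pointer to the
    // owning thread's lock record (modeled as the Thread itself).
    private final AtomicReference<Thread> markWord = new AtomicReference<>();

    /** Returns "acquired", "reentrant", or "contended". */
    public String enter() {
        Thread self = Thread.currentThread();
        // Step 3: CAS the Mark Word to point at our lock record.
        if (markWord.compareAndSet(null, self)) {
            return "acquired";  // step 4: lightweight lock held
        }
        if (markWord.get() == self) {
            return "reentrant"; // step 5: we already hold the lock
        }
        return "contended";     // step 6: would spin, then inflate
    }

    public void exit() {
        // Unlock: CAS the Displaced Mark Word back (simplified to a reset).
        markWord.compareAndSet(Thread.currentThread(), null);
    }
}
```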

Lightweight lock unlock

The lightweight lock is released when the current thread finishes its synchronized block.

  1. Use a CAS operation to try to replace the object’s current Mark Word with the Displaced Mark Word copied into the thread’s Lock Record.

  2. If it succeeds, the synchronization is complete.

  3. If it fails, there was contention and the lock has inflated to a heavyweight lock; releasing the lock then also wakes up the suspended threads.

Heavyweight lock

Lightweight locks suit the situation where threads execute synchronized blocks almost alternately. If the same lock object is contended at the same time (a first thread holds the lock while a second thread spins past the threshold), the lightweight lock inflates to a heavyweight lock: the lock flag bits in the Mark Word are updated to 10, and the Mark Word points to the mutex (the heavyweight lock).

Heavyweight locking is implemented through an internal object called a monitor, which is essentially a mutex lock that depends on the underlying operating system. Switching between threads requires the operating system to switch from user mode to kernel mode, which is expensive and relatively slow; this is why synchronized’s heavyweight lock was inefficient before JDK 1.6.
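Even as a heavyweight lock, the monitor still guarantees mutual exclusion. A classic demonstration is several threads updating a shared counter under synchronized; the threads that lose each race block on the monitor, yet the final count is exact (the class and its run helper are invented for this example):

```java
public class ContendedCounter {
    private long count = 0;

    public synchronized void increment() {
        count++; // read-modify-write is atomic only under the monitor
    }

    public synchronized long get() {
        return count;
    }

    /** Runs `threads` workers, each incrementing `perThread` times. */
    public static long run(int threads, int perThread) {
        ContendedCounter c = new ContendedCounter();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.increment();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join(); // wait for all workers to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return c.get();
    }
}
```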

Below is how the Mark Word in the object header transitions between biased, lightweight, and heavyweight locks:

Switch between bias lock, lightweight lock and heavyweight lock

Here is a relatively complete lock-upgrade flow found online:

Lock upgrade process

Lock Elimination

Lock elimination means that the virtual machine’s just-in-time compiler removes locks on synchronized code when it detects that no shared-data race is possible. In other words, the JIT compiler removes unnecessary locking operations where appropriate.

Lock elimination is based on escape analysis. Simply put, escape analysis determines the dynamic scope of an object. There are three cases:

  • No escape: the object’s scope never leaves the current method and thread

  • Method escape: an object defined in a method is referenced by external methods

  • Thread escape: an object defined in a method is accessed by other threads

The just-in-time compiler applies a different optimization to each case:

  • Stack Allocation (not supported by HotSpot): the object is created directly on the stack.

  • Scalar Replacement: the object is broken up, and only the member variables the method uses are created directly; this requires that the object does not escape the method.

  • Synchronization Elimination (lock elimination): requires that the object does not escape the thread.

For lock elimination specifically: if escape analysis shows that a lock object does not escape the thread, its synchronization can be removed outright.

Look at an example in code:

public void elimination1() {
    final Object lock = new Object();
    synchronized (lock) {
        System.out.println("The lock object is scoped only to this method and thread, so the lock is eliminated.");
    }
}

public String elimination2() {
    final StringBuffer sb = new StringBuffer();
    sb.append("Hello, ").append("World!");
    return sb.toString();
}

public StringBuffer notElimination() {
    final StringBuffer sb = new StringBuffer();
    sb.append("Hello, ").append("World!");
    return sb;
}

The lock object in elimination1() is scoped inside the method and does not escape the thread; the sb in elimination2() is the same, so the synchronization in both methods is eliminated. However, the sb in notElimination() is the method’s return value and may be modified by other methods or threads, so this method alone cannot have its lock eliminated; it depends on the caller.

Lock Coarsening

In principle, when writing code we should keep the scope of synchronized blocks as small as possible: minimize the number of operations done under the lock so that, under contention, waiting threads can acquire it as soon as possible. But sometimes a series of consecutive operations repeatedly locks and unlocks the same object, perhaps even inside a loop body, and the resulting frequent mutex synchronization causes unnecessary performance loss even when there is no thread contention. If the virtual machine detects such a string of fragmented operations all locking the same object, it extends (coarsens) the scope of the lock to cover the whole operation sequence.

For example, in the elimination2() method above, StringBuffer.append() is synchronized; for such frequent operations the lock is coarsened, producing something like the following (illustrative, not the actual generated code):

public String elimination2() {
    final StringBuffer sb = new StringBuffer();
    synchronized (sb) {
        sb.append("Hello, ").append("World!");
        return sb.toString();
    }
}

or

public synchronized String elimination3() {
    final StringBuffer sb = new StringBuffer();
    sb.append("Hello, ").append("World!");
    return sb.toString();
}

Summary

  1. Two things hurt performance in synchronized operations:
    1. Locking and unlocking require extra operations
    2. Switching between user mode and kernel mode is relatively expensive
  2. Synchronized gained many optimizations in JDK 1.6: tiered locks (biased, lightweight, heavyweight), lock elimination, lock coarsening, and so on.
  3. Synchronized reuses the Mark Word state bits in the object header to implement the different lock levels.
